8000 feat(udf): introduce async and batched JavaScript UDF by fuyufjh · Pull Request #20403 · risingwavelabs/risingwave · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

feat(udf): introduce async and batched JavaScript UDF #20403

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 32 commits into from
Mar 4, 2025
Merged

Conversation

fuyufjh
Copy link
Collaborator
@fuyufjh fuyufjh commented Feb 6, 2025

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

This PR adds support for async and batched JavaScript UDF as well as fetch() API. Resolves #20391.

Syntax:

create function my_udf(s varchar) returns decimal language javascript async as $$
export async function my_udf(s) {
    ... // able to call async functions such as `await fetch()` here
}
$$ WITH (
   async = [true|false],
   batch = [true|false],
)

Notice that UDAF is not supported in this PR. You can confirm this by checking the code of frontend handle_create_aggregate() doesn't support the new syntax. However, the internal implementation is mostly ready except here.

Checklist

  • I have written necessary rustdoc comments.
  • I have added necessary unit tests and integration tests.
  • I have added test labels as necessary.
  • I have added fuzzing tests or opened an issue to track them.
  • My PR contains breaking changes.
  • My PR changes performance-critical code, so I will run (micro) benchmarks and present the results.
  • My PR contains critical fixes that are necessary to be merged into the latest release.

Documentation

  • My PR needs documentation updates.
Release note

@fuyufjh fuyufjh requested a review from a team as a code owner February 6, 2025 08:54
@fuyufjh fuyufjh requested review from xiangjinwu and removed request for a team February 6, 2025 08:54
Cargo.toml Outdated
@@ -164,7 +164,7 @@ iceberg-catalog-glue = { git = "https://github.com/risingwavelabs/iceberg-rust.g
opendal = "0.49"
# used only by arrow-udf-flight
arrow-flight = "53"
arrow-udf-js = "0.5"
arrow-udf-js = { git = "https://github.com/arrow-udf/arrow-udf.git", rev = "4562678944210da2b3a094a6383f7b9ebe6cc54c" }
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait for the release of arrow-udf-js before this PR merges.

Comment on lines 45 to 49
let state = self.runtime.call_agg_create_state()?;
Ok(AggregateState::Any(Box::new(State(state))))
// FIXME(eric): This is bad. Let's make `create_state` async later
futures::executor::block_on(async {
let state = self.runtime.call_agg_create_state().await?;
Ok(AggregateState::Any(Box::new(State(state))))
8000 })
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we don't support async function create_state for now, what about using now_or_never instead of block_on here?

@fuyufjh fuyufjh marked this pull request as draft February 10, 2025 12:46
Base automatically changed from eric/js_function_def to main February 10, 2025 13:28
@fuyufjh fuyufjh marked this pull request as ready for review February 12, 2025 07:06
Eric Fu and others added 5 commits February 25, 2025 12:38
Signed-off-by: Richard Chien <stdrc@outlook.com>
Signed-off-by: Richard Chien <stdrc@outlook.com>
Signed-off-by: Richard Chien <stdrc@outlook.com>
@stdrc stdrc force-pushed the eric/async_js_udf branch from ba07063 to 0f417d9 Compare February 28, 2025 08:56
stdrc added 2 commits March 3, 2025 13:48
Signed-off-by: Richard Chien <stdrc@outlook.com>
@stdrc stdrc requested a review from MrCroxx March 3, 2025 05:50
stdrc added 2 commits March 3, 2025 14:44
Signed-off-by: Richard Chien <stdrc@outlook.com>
Signed-off-by: Richard Chien <stdrc@outlook.com>
risedev --allow-private link-all-in-one-binaries
risedev link-all-in-one-binaries
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this work?

Copy link
Member
@stdrc stdrc Mar 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to avoid a warning complaining duplicated --allow-private, because risedev already mean cargo make --allow-private

Comment on lines +3290 to +3292
if let Some(v) = self.always_retry_on_network_error {
options.push(format!("always_retry_on_network_error = {}", v));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xiangjinwu This unparsing bug will finally be addressed. 😄

SqlOptionValue::Value(Value::Boolean(true))
))
}
"batch" => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we name it batched instead? 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also use batch=True in external python udf

@stdrc stdrc enabled auto-merge March 3, 2025 15:26
@stdrc stdrc added the user-facing-changes Contains changes that are visible to users label Mar 3, 2025
Copy link
Contributor
github-actions bot commented Mar 3, 2025

Hi, there.

📝 Telemetry Reminder:
If you're implementing this feature, please consider adding telemetry metrics to track its usage. This helps us understand how the feature is being used and improve it further.
You can find the function report_event of telemetry reporting in the following files. Feel free to ask questions if you need any guidance!

  • src/frontend/src/telemetry.rs
  • src/meta/src/telemetry.rs
  • src/stream/src/telemetry.rs
  • src/storage/compactor/src/telemetry.rs
    Or calling report_event_common (src/common/telemetry_event/src/lib.rs) as if finding it hard to implement.
    ✨ Thank you for your contribution to RisingWave! ✨

This is an automated comment created by the peaceiris/actions-label-commenter. Responding to the bot or mentioning it won't have any effect.

@kwannoel kwannoel force-pushed the eric/async_js_udf branch from c9d97c7 to faacf05 Compare March 4, 2025 04:22
@stdrc stdrc added this pull request to the merge queue Mar 4, 2025
Merged via the queue into main with commit 0cc54b3 Mar 4, 2025
33 of 35 checks passed
@stdrc stdrc deleted the eric/async_js_udf branch March 4, 2025 05:03
stdrc added a commit that referenced this pull request Mar 4, 2025
Signed-off-by: Richard Chien <stdrc@outlook.com>
Co-authored-by: Richard Chien <stdrc@outlook.com>
Co-authored-by: Noel Kwan <noelkwan1998@gmail.com>
github-merge-queue bot pushed a commit that referenced this pull request Mar 4, 2025
…0403) to 2.3 (#20702)

Signed-off-by: Richard Chien <stdrc@outlook.com>
Co-authored-by: Eric Fu <eric@risingwave-labs.com>
Co-authored-by: Noel Kwan <noelkwan1998@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci/run-backwards-compat-tests Run backwards compatibility tests in your PR. ci/run-e2e-single-node-tests ci/run-e2e-test-other-backends type/feature Type: New feature. user-facing-changes Contains changes that are visible to users
Projects
None yet
Development

Successfully merging this pull request may close these issues.

feat: Async JavaScript UDF and Fetch API
5 participants
0