-
Notifications
You must be signed in to change notification settings - Fork 6.5k
[serve] split call_user_method #54104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Cindy Zhang <cindyzyx9@gmail.com>
Signed-off-by: Cindy Zhang <cindyzyx9@gmail.com>
Signed-off-by: Cindy Zhang <cindyzyx9@gmail.com>
32bc5ae
to
f5542e0
Compare
Signed-off-by: Cindy Zhang <cindyzyx9@gmail.com>
Signed-off-by: Cindy Zhang <cindyzyx9@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR splits the single call_user_method into three distinct functions to properly handle HTTP entrypoints, streaming generators, and non-streaming user methods.
- Updated test cases to call call_http_entrypoint and call_user_generator instead of call_user_method for streaming or HTTP requests.
- Modified ReplicaBase internals to conditionally call the new methods based on request metadata.
- Adjusted local testing mode to support the new call_user_generator and call_http_entrypoint functions.
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
File | Description |
---|---|
python/ray/serve/tests/unit/test_user_callable_wrapper.py | Updated test invocations to use the appropriate call_ functions. |
python/ray/serve/_private/replica.py | Splits the call_user_method logic into call_http_entrypoint, call_user_generator, and call_user_method. |
python/ray/serve/_private/local_testing_mode.py | Adjusted handling of streaming requests to call the correct function. |
Comments suppressed due to low confidence (1)
python/ray/serve/_private/replica.py:609
- [nitpick] The variable 'call_user_method_future' now holds the result from either call_http_entrypoint or call_user_generator. Consider renaming it (e.g., 'call_user_future') to better reflect its usage.
)
the refactor makes sense. left some nits. |
python/ray/serve/_private/replica.py
Outdated
*, | ||
generator_result_callback: Optional[Callable] = None, | ||
) -> Any: | ||
"""Call a user method (unary or generator). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this method handle generator case for GRPC / deployment_handle calls?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep
Signed-off-by: Cindy Zhang <cindyzyx9@gmail.com>
## Why are these changes needed? Split `UserCallableWrapper.call_user_method` into three functions: 1. `call_http_entrypoint` - processes all requests with `request_metadata.is_http_request == True` 2. `call_user_generator` - processes all requests with `request_metadata.is_http_request == False and request_metadata.is_streaming == True` 3. `call_user_method` - processes all requests with `request_metadata.is_streaming == False` Note that - `ReplicaBase.handle_request` directly calls `call_user_method` - `ReplicaBase.handle_request_with_streaming` calls `ReplicaBase._call_streaming` which calls `call_http_entrypoint` if `request_metadata.is_http_request == True` otherwise calls `call_user_generator` - `ReplicaBase.handle_request_with_rejection` calls `call_user_method` if `request_metadata.is_streaming == False` otherwise calls `ReplicaBase._call_streaming`. --------- Signed-off-by: Cindy Zhang <cindyzyx9@gmail.com>
Why are these changes needed?
Split
UserCallableWrapper.call_user_method
into three functions:call_http_entrypoint
- processes all requests withrequest_metadata.is_http_request == True
call_user_generator
- processes all requests withrequest_metadata.is_http_request == False and request_metadata.is_streaming == True
call_user_method
- processes all requests withrequest_metadata.is_streaming == False
Note that
ReplicaBase.handle_request
directly callscall_user_method
ReplicaBase.handle_request_with_streaming
callsReplicaBase._call_streaming
which callscall_http_entrypoint
ifrequest_metadata.is_http_request == True
otherwise callscall_user_generator
ReplicaBase.handle_request_with_rejection
callscall_user_method
ifrequest_metadata.is_streaming == False
otherwise callsReplicaBase._call_streaming
.