Pass parameters to custom routers through LLMConfig #53870

eicherseiji · 2025-06-17T00:54:41Z

Why are these changes needed?

Follow ups to #52725.

Allow custom router kwargs in LLMConfig/DeploymentConfig

We also discussed passing kwargs directly to the derived class's __init__, but were concerned that typos could get swallowed by **kwargs in RequestRouter.__init__. Instead, an initialize_state override that does not accept **kwargs raises a TypeError on an unrecognized key, e.g.:

class RequestRouter:

    def __init__(self, ..., request_initializer_config: dict):
        # Called at the end of the base constructor.
        self.initialize_state(**request_initializer_config)

    def initialize_state(self, **kwargs):
        pass


class MyRouter(RequestRouter):

    def initialize_state(self, threshold: float = 0.1):
        ...

    def choose_replica(self, ...):
        ...

# Result is TypeError: initialize_state() got an unexpected keyword argument 'thresh'
@serve.deployment(cls=MyRouter, cls_kwargs={"thresh": 0.2})
class Deploy:
    pass
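The pseudocode above can be sketched as a self-contained script. This is a minimal illustration, not the code in this PR: the stand-in RequestRouter below has no Ray dependency, the constructor parameter name request_router_kwargs is borrowed from the PR title's config field, and try_build is a hypothetical helper added only to show both outcomes.

```python
from typing import Any, Dict, Optional


class RequestRouter:
    """Stand-in base class; only the kwargs hand-off is modeled (assumption)."""

    def __init__(self, request_router_kwargs: Optional[Dict[str, Any]] = None):
        # Base-class setup would happen here; the user hook runs last.
        self.initialize_state(**(request_router_kwargs or {}))

    def initialize_state(self, **kwargs: Any) -> None:
        pass


class MyRouter(RequestRouter):
    # No **kwargs here, so an unknown key cannot be silently absorbed.
    def initialize_state(self, threshold: float = 0.1) -> None:
        self.threshold = threshold


def try_build(kwargs: Dict[str, Any]) -> str:
    """Hypothetical helper: build a router and report what happened."""
    try:
        router = MyRouter(request_router_kwargs=kwargs)
    except TypeError as exc:
        return f"TypeError: {exc}"
    return f"threshold={router.threshold}"


if __name__ == "__main__":
    print(try_build({"threshold": 0.2}))  # accepted
    print(try_build({"thresh": 0.2}))     # misspelled key is rejected
```

The check comes for free from Python's argument binding, so the base class needs no explicit validation of key names.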

Related issue number

Checks

I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.
I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've added any new APIs to the API Reference. For example, if I added a
  method in Tune, I've added it in doc/source/tune/api/ under the
  corresponding .rst file.
I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- Unit tests
- Release tests
- This PR is not tested :(

eicherseiji · 2025-06-18T18:46:41Z

kwargs can be passed to a custom router class like so:

from ray import serve
from ray.serve.llm import LLMConfig, build_openai_app
from ray.serve._private.request_router.prefix_aware_router import PrefixAwarePow2ReplicaRouter

llm_config = LLMConfig(
    model_loading_config=dict(
        model_id="deepseek",
        model_source="qwen/Qwen2.5-7B-Instruct",
    ),
    runtime_env=dict(
        env_vars={"VLLM_USE_V1": "1"}
    ),
    deployment_config=dict(
        autoscaling_config=dict(min_replicas=1, max_replicas=1),
        request_router_class=PrefixAwarePow2ReplicaRouter,
        request_router_kwargs=dict(
            imbalanced_threshold=9,
        )
    ),
    engine_kwargs=dict(
        tensor_parallel_size=2,
        pipeline_parallel_size=2,
        gpu_memory_utilization=0.92,
        dtype="auto",
        max_num_seqs=40,
        max_model_len=16384,
        enable_chunked_prefill=True,
        enable_prefix_caching=True,
        trust_remote_code=True,
    ),
)

app = build_openai_app({"llm_configs": [llm_config]})
serve.run(app, blocking=True)

Copilot

Pull Request Overview

This PR enables passing custom keyword arguments (request_router_kwargs) through the Serve configuration to user‐provided request router classes. Additionally, it refactors the prefix‐aware router’s eviction loop from asyncio tasks to a background thread and updates related tests.

- Add request_router_kwargs field in protobuf, config model, deployment API, and router
- Wire serialization/deserialization of request_router_kwargs
- Refactor eviction loop in prefix_tree.py from asyncio to threading
- Update tests to call the router’s private selection method

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

File	Description
src/ray/protobuf/serve.proto	Add new `bytes request_router_kwargs` field to `DeploymentConfig`
python/ray/serve/deployment.py	Expose `request_router_kwargs` in `options()` and apply to config
python/ray/serve/_private/router.py	Forward `request_router_kwargs` to router constructor and store it
python/ray/serve/_private/config.py	Define `request_router_kwargs`, validate JSON, and handle proto I/O
python/ray/llm/tests/serve/cpu/deployments/test_prefix_aware_request_router.py	Replace public call with private `_choose_replica_for_request`
python/ray/llm/_internal/serve/request_router/prefix_aware/prefix_tree.py	Convert eviction loop from `asyncio` to a background `threading.Thread`
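As a rough illustration of what "validate JSON, and handle proto I/O" could look like for a bytes field, here is a sketch using only the standard library. The helper names encode_router_kwargs and decode_router_kwargs are hypothetical and are not taken from the PR; a later commit in this PR pickles the kwargs instead, so treat this purely as an example of the idea.

```python
import json
from typing import Any, Dict


def encode_router_kwargs(kwargs: Dict[str, Any]) -> bytes:
    """Hypothetical helper: reject non-JSON values, then produce bytes."""
    try:
        return json.dumps(kwargs).encode("utf-8")
    except (TypeError, ValueError) as exc:
        raise ValueError(
            f"request_router_kwargs must be JSON-serializable: {exc}"
        ) from exc


def decode_router_kwargs(payload: bytes) -> Dict[str, Any]:
    """Hypothetical helper: an empty payload means no kwargs were set."""
    return json.loads(payload.decode("utf-8")) if payload else {}


if __name__ == "__main__":
    blob = encode_router_kwargs({"imbalanced_threshold": 9})
    print(decode_router_kwargs(blob))
```

Failing at encode time surfaces a bad value when the config is built rather than when a router is constructed on another process.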

Comments suppressed due to low confidence (2)

python/ray/serve/deployment.py:241

[nitpick] Inserting a new parameter into the middle of the options() signature can break callers using positional arguments. Consider making it keyword-only or placing it at the end with a default.

        request_router_kwargs: Default[Union[Dict, None]] = DEFAULT.VALUE,
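The keyword-only suggestion in this nitpick can be sketched with a toy function. The options() below is a hypothetical stand-in with made-up parameters, not Ray's actual signature; it only shows how a bare * keeps existing positional callers working while forcing the new parameter to be named.

```python
def options(num_replicas=None, max_ongoing_requests=None, *, request_router_kwargs=None):
    """Toy stand-in for a long options() signature (assumption, not Ray's API)."""
    return {
        "num_replicas": num_replicas,
        "max_ongoing_requests": max_ongoing_requests,
        "request_router_kwargs": request_router_kwargs,
    }


def positional_call_rejected() -> bool:
    """Return True if passing the new parameter positionally raises TypeError."""
    try:
        options(2, 5, {"threshold": 0.2})
    except TypeError:
        return True
    return False


if __name__ == "__main__":
    print(options(2, 5))  # existing positional callers are unaffected
    print(options(2, request_router_kwargs={"threshold": 0.2}))
    print(positional_call_rejected())
```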

python/ray/llm/tests/serve/cpu/deployments/test_prefix_aware_request_router.py:127

[nitpick] The test now calls a private method (_choose_replica_for_request) instead of the public API (choose_replica_for_request). It's better to test via the public interface to avoid coupling to internal implementation.

            chosen = await prefix_request_router._choose_replica_for_request(req)

eicherseiji · 2025-06-18T23:28:14Z

Hi @kouroshHakha! This is ready for your review.

src/ray/protobuf/serve.proto

kouroshHakha

Two points:

- Let's separate the Serve-only changes from the eviction thread changes and review the Serve changes with the Serve team.
- Let's talk about the request router kwargs. The original intention of the design was to not expose the complexity of the RequestRouter constructor to the user. Right now request_router_kwargs is passed straight through to the constructor, which exposes the other kwargs that were supposed to stay hidden. Here is my proposal:

Modify the RequestRouter's constructor and interface this way:

class RequestRouter:
    def __init__(self, ..., custom_init_kwargs=...):
        ...
        self.init(**custom_init_kwargs)

    def init(self, **kwargs):
        # Custom initialization for the request router. Called after the base constructor __init__ is done.
        ...

This way when I inherit this class I can simply do:

class MyRouter(RequestRouter):

    def init(self, param1=None):
        self.param1 = param1

    def choose_replica(...):
        # create a policy based on self.param1
        ...


@serve.deployment(request_router_class=MyRouter, request_router_init_kwargs={"param1": 10})
class MyDeployment:
    ...
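A runnable sketch of this proposal, under stated assumptions: the base class here is a stand-in with no Ray dependency, replica_ids is a hypothetical internal constructor argument standing in for the ones the design wants to keep hidden, and the stride policy in choose_replica is invented purely to show self.param1 being used.

```python
from typing import Any, Dict, List, Optional


class RequestRouter:
    """Stand-in base class (assumption); the constructor stays internal."""

    def __init__(
        self,
        replica_ids: List[str],
        custom_init_kwargs: Optional[Dict[str, Any]] = None,
    ):
        self._replica_ids = list(replica_ids)  # internal state, hidden from users
        # The user-facing hook runs only after base setup is complete.
        self.init(**(custom_init_kwargs or {}))

    def init(self, **kwargs: Any) -> None:
        """Hook for custom initialization; subclasses override this."""

    def choose_replica(self, request_id: int) -> str:
        raise NotImplementedError


class MyRouter(RequestRouter):
    def init(self, param1: int = 1) -> None:
        self.param1 = param1

    def choose_replica(self, request_id: int) -> str:
        # Toy policy: stride through the replicas by param1.
        index = (request_id * self.param1) % len(self._replica_ids)
        return self._replica_ids[index]


router = MyRouter(["r0", "r1", "r2"], custom_init_kwargs={"param1": 2})

if __name__ == "__main__":
    print(router.choose_replica(1))
```

The subclass never overrides __init__, so it never sees or has to forward the base constructor's arguments.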

abrarsheikh · 2025-06-26T23:22:37Z

python/ray/serve/_private/request_router/request_router.py

+        Called at the end of RequestRouter.__init__ with request_router_kwargs
+        from DeploymentConfig .


Comment needs to be updated?

abrarsheikh · 2025-06-27T04:59:53Z

python/ray/serve/config.py

+        default=DEFAULT_REQUEST_ROUTER_PATH
+    )
+    # Keyword arguments that will be passed to the
+    # request router class __init__ method.


this comment needs to be updated?

abrarsheikh · 2025-06-27T05:02:39Z

src/ray/protobuf/serve.proto

+  // Timeout after which a replica started a record routing stats without a response.
+  double request_routing_stats_timeout_s = 4;
+
+  // kwargs which will be passed to the router class' __init__ method


same comment here

eicherseiji added the go add ONLY when ready to merge, run all tests label Jun 18, 2025

eicherseiji self-assigned this Jun 18, 2025

eicherseiji marked this pull request as ready for review June 18, 2025 18:48

Copilot AI review requested due to automatic review settings June 18, 2025 18:48

eicherseiji requested review from a team as code owners June 18, 2025 18:48

Copilot AI reviewed Jun 18, 2025

eicherseiji force-pushed the prefix-router branch from 9e660d4 to 58174ac Compare June 18, 2025 23:26

kouroshHakha reviewed Jun 19, 2025

src/ray/protobuf/serve.proto (outdated)

kouroshHakha reviewed Jun 19, 2025

eicherseiji and others added 10 commits June 19, 2025 13:44

Pass parameters to custom routers through LLMConfig

be8a871

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Convert eviction async task to thread

267e4d1

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Add request_router_kwargs to protobuf

267d927

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Remove unnecessary lock from eviction loop

c0b28fe

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Add request_router_kwargs to deployment options

286fe16

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Apply suggestions from code review

7e603f1

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Seiji Eicher <58963096+eicherseiji@users.noreply.github.com>

Address code review

36b6c2f

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Update api docs

bde1379

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Add comment to Protobuf

d25f19d

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Remove prefix tree changes from this PR

98e11ef

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

eicherseiji force-pushed the prefix-router branch from 768de7c to 98e11ef Compare June 19, 2025 20:44

Create initialize_state() to avoid **kwargs in RequestRouter swallowing user typos

23f661c

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

eicherseiji commented Jun 25, 2025

python/ray/serve/_private/request_router/request_router.py (outdated)

eicherseiji added 4 commits June 25, 2025 13:42

Create RouterConfig

6d5f51f

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Remove excess whitespace from serve.proto

d88d551

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Fix java files

f04a583

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Pickle/unpickle request_router_kwargs

4f7bec8

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

abrarsheikh reviewed Jun 27, 2025