[Serve.llm][P/D] Fix health check in prefill disagg #53937

kouroshHakha · 2025-06-18T22:49:23Z

Also needs vllm-project/vllm#19821 to be merged.

Needs testing: https://buildkite.com/ray-project/release/builds/46098

Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>

kouroshHakha · 2025-06-18T22:50:34Z

python/ray/llm/_internal/serve/deployments/llm/llm_server.py

@@ -466,25 +469,10 @@ async def __init__(

        self.response_postprocessor = ResponsePostprocessor()

-    @property


This property is useless; I don't know why we had it. Removing it.

kouroshHakha · 2025-06-18T22:53:19Z

python/ray/llm/_internal/serve/deployments/llm/vllm/vllm_engine.py

@@ -816,9 +816,9 @@ async def check_health(self) -> None:
            raise RuntimeError(f"{type(self.engine)} does not support health check.")

        try:
-            return await asyncio.wait_for(self.engine.check_health(), timeout=15)
+            await asyncio.wait_for(self.engine.check_health(), timeout=15)


Removed the hard-coded timeout, since Ray Serve already has an adjustable health check timeout per deployment.
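To illustrate the reasoning in the comment above, here is a minimal sketch, using only `asyncio`, of a health check that awaits the engine's probe without its own timeout and leaves the deadline to the caller. `FakeEngine`, `EngineWrapper`, and `probe_with_deadline` are hypothetical names made up for this sketch; they are not Ray Serve or vLLM APIs, and the caller-side deadline only plays the role that the serving layer's configurable health check timeout would play.

```python
import asyncio


class FakeEngine:
    """Hypothetical stand-in for an engine client; not a vLLM class."""

    def __init__(self, delay_s: float = 0.0, healthy: bool = True):
        self.delay_s = delay_s
        self.healthy = healthy

    async def check_health(self) -> None:
        await asyncio.sleep(self.delay_s)
        if not self.healthy:
            raise RuntimeError("engine is dead")


class EngineWrapper:
    """Hypothetical wrapper whose health check has no timeout of its own."""

    def __init__(self, engine: FakeEngine):
        self.engine = engine

    async def check_health(self) -> None:
        # Await without returning a value: the signature promises None,
        # and an unhealthy engine is reported by raising.
        await self.engine.check_health()


async def probe_with_deadline(wrapper: EngineWrapper, timeout_s: float) -> bool:
    """Caller-side deadline, standing in for the serving layer's timeout."""
    try:
        await asyncio.wait_for(wrapper.check_health(), timeout=timeout_s)
        return True
    except (asyncio.TimeoutError, RuntimeError):
        return False
```

With this split, changing the deadline is a matter of caller configuration rather than a code change inside the engine wrapper.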

kouroshHakha · 2025-06-18T22:53:59Z

python/ray/llm/_internal/serve/deployments/prefill_decode_disagg/prefill_decode_disagg.py

@@ -160,13 +151,6 @@ async def _predict(
        ):
            yield chunk

-    async def check_health(self) -> None:


These must be removed. In general, the health check of a deployment is not bound to the health checks of its child deployments.

cc @lk-chen fyi
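The principle in the comment above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `ChildDeployment`, `ParentDeployment`, and `demo` are hypothetical names, and the example assumes each child is probed separately by whatever supervises it.

```python
import asyncio
from typing import List


class ChildDeployment:
    """Hypothetical child (for example a prefill or decode server)."""

    def __init__(self, healthy: bool = True):
        self.healthy = healthy
        self.probe_count = 0

    async def check_health(self) -> None:
        self.probe_count += 1
        if not self.healthy:
            raise RuntimeError("child is unhealthy")


class ParentDeployment:
    """Hypothetical parent that forwards requests to its children."""

    def __init__(self, children: List[ChildDeployment]):
        self.children = children
        self.ready = False

    async def check_health(self) -> None:
        # Report only the parent's own state. The children are assumed to be
        # probed on their own, so the parent does not fan out to them here.
        if not self.ready:
            raise RuntimeError("parent is not initialized")


async def demo() -> int:
    child = ChildDeployment(healthy=False)
    parent = ParentDeployment([child])
    parent.ready = True
    # Does not raise, even though the child is unhealthy.
    await parent.check_health()
    return child.probe_count
```

In this sketch an unhealthy child surfaces through its own probe, so the parent is not marked unhealthy for a failure that is not its own.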

kouroshHakha · 2025-06-18T22:54:08Z

python/ray/llm/_internal/serve/deployments/routers/router.py

@@ -232,12 +232,6 @@ async def _setup_handle_and_config_maps(

    async def check_health(self):
        await self._init_completed.wait()
-        await asyncio.gather(


The same applies here.


kouroshHakha · 2025-06-19T02:49:50Z

release/llm_tests/serve/test_llm_serve_integration.py

@@ -27,6 +27,7 @@ async def test_engine_metrics():
        model="Qwen/Qwen2.5-0.5B-Instruct",
        dtype="auto",
        disable_log_stats=False,
+        enforce_eager=True,


Setting enforce_eager=True makes the tests faster by skipping CUDA graph capture.


kouroshHakha · 2025-06-20T21:24:40Z

python/ray/llm/tests/serve/cpu/deployments/routers/test_router.py

@@ -170,8 +170,6 @@ async def test_check_health(self, llm_config: LLMConfig):

        await router.check_health()

-        assert server.check_health.remote.call_count == 1


Testing the router's health check has nothing to do with the server's health check.
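A test in that spirit might look like the sketch below. It uses only `unittest.mock` and `asyncio`; `SketchRouter` and `run_router_health_test` are hypothetical names, and only the shape of `check_health` (waiting on an init-completed event) mirrors the router diff in this PR.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


class SketchRouter:
    """Hypothetical router; not the real router class from ray.llm."""

    def __init__(self, server_handle):
        self._server_handle = server_handle
        self._init_completed = None

    async def setup(self) -> None:
        # Create the event inside a running loop, then mark init as done.
        self._init_completed = asyncio.Event()
        self._init_completed.set()

    async def check_health(self) -> None:
        # Only the router's own readiness is checked.
        await self._init_completed.wait()


async def run_router_health_test() -> int:
    server = MagicMock()
    server.check_health.remote = AsyncMock()
    router = SketchRouter(server)
    await router.setup()
    await router.check_health()
    # The router's probe must not have touched the server's probe.
    server.check_health.remote.assert_not_called()
    return server.check_health.remote.call_count
```

The assertion is inverted relative to the removed line: the test now only verifies the router's own behavior and treats any call into the server as out of scope.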


wip

ece1265

Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>

kouroshHakha commented Jun 18, 2025


kouroshHakha added 3 commits June 18, 2025 15:54

wip

ba3eeb3

Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>

wip

efe6f92

Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>

wip

7c1a5f0

Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>

kouroshHakha commented Jun 19, 2025


kouroshHakha marked this pull request as ready for review June 19, 2025 02:50

kouroshHakha requested a review from a team as a code owner June 19, 2025 02:50

kouroshHakha requested a review from eicherseiji June 19, 2025 02:52

eicherseiji approved these changes Jun 19, 2025


python/ray/llm/_internal/serve/deployments/llm/llm_server.py (resolved)

eicherseiji added the "go" label (add ONLY when ready to merge, run all tests) Jun 19, 2025

kouroshHakha added 3 commits June 19, 2025 22:13

wip

a58e780

Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>

wip

d5576a5

Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>

wip

40ab6af

Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>

kouroshHakha commented Jun 20, 2025


kouroshHakha mentioned this pull request Jun 20, 2025

[Serve.llm] Remove ImageRetriever class and related tests from the LLM deployment module. #53980

Merged

kouroshHakha merged commit 55b8ce9 into ray-project:master Jun 22, 2025
5 checks passed

minerharry pushed a commit to minerharry/ray that referenced this pull request Jun 27, 2025

[Serve.llm][P/D] Fix health check in prefill disagg (ray-project#53937)

cb20f16

Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>
