(serve.llm) Remove test leakage from placement bundle logic #53723
Merged: kouroshHakha merged 11 commits into ray-project:master from nrghosh:fix-test_code_leakage on Jun 27, 2025
Conversation
Force-pushed from 8ee9b19 to b20ea10
- Replace hacky fix that bypassed placement groups when mock_resource was present
- Remove coupling between test infra and prod code
- Remove mock_resource checks; add a test fixture that configures placement_bundles as [] when testing
- Tests now run without requiring GPUs by using a test-layer patch instead of modifying production logic
- Addresses the TODO in server_models.py

Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
Force-pushed from b20ea10 to 78a9359
- Root cause: the patch in conftest.py only worked within the pytest process, but the failing test (test_build_openai_app_with_config) spawned a subprocess using serve run, which didn't inherit the monkey patch
- Monkey patch for in-process tests: continues to work for regular pytest tests
- Environment variable for subprocesses: RAYLLM_TEST_DISABLE_PLACEMENT_GROUPS=1 is set by the fixture and checked by the production code

Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
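A minimal sketch of the mechanism this commit describes, assuming a pytest `monkeypatch`-based fixture; the fixture name `disable_placement_groups_env` and the helper `placement_groups_disabled` are hypothetical, while `RAYLLM_TEST_DISABLE_PLACEMENT_GROUPS` is the variable named in the commit (this approach was removed again in a later commit).

```python
import os

import pytest


@pytest.fixture
def disable_placement_groups_env(monkeypatch):
    # Set the flag in the pytest process; subprocesses started afterwards by
    # `serve run` inherit the environment, so they see it as well.
    monkeypatch.setenv("RAYLLM_TEST_DISABLE_PLACEMENT_GROUPS", "1")
    yield


def placement_groups_disabled() -> bool:
    # Hypothetical production-side check described in the commit message.
    return os.environ.get("RAYLLM_TEST_DISABLE_PLACEMENT_GROUPS") == "1"
```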
eicherseiji reviewed on Jun 11, 2025
python/ray/llm/_internal/serve/deployments/llm/vllm/vllm_models.py (outdated review thread, resolved)
- Stop using an env variable to control placement groups
- Global monkey patching broke tests by overriding behavior for all of them
- Tests ensure placement bundle logic works
- Eliminate leakage while still maintaining test coverage
- Fix tests by relaxing some assertions on the serve_options dict

Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
kouroshHakha approved these changes on Jun 19, 2025
LGTM. waiting for the tests.
…resource needs. Apply placement bundle override to overlooked tests; fix type errors.

Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
The refactoring to explicit placement group mocking missed some fixtures that use mock engines but still tried to create real placement groups.

Changes:
- Add a disable_placement_bundles dependency to the testing_model fixtures in conftest.py
- Set accelerator_type: None in test_application_builders.py subprocess configs
- Remove the environment variable approach to avoid test code leakage

This fixes hanging tests in both the CPU and GPU test suites that use mock engines (MockVLLMEngine) but were attempting to create real GPU placement groups. Mock vs. real engine tests need different placement group handling, not CPU vs. GPU tests.

Fixes:
- test_build_openai_app_with_config (CPU subprocess test)
- test_lora_unavailable_base_model (CPU mock test)
- test_models (GPU integration test with mock engine)

Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
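A hedged sketch of the fixture dependency this commit describes; the body of `testing_model` is elided and its return value is a placeholder, only the dependency on `disable_placement_bundles` is the point.

```python
import pytest


@pytest.fixture
def testing_model(disable_placement_bundles):
    # Depending on disable_placement_bundles guarantees that mock-engine tests
    # (MockVLLMEngine) never attempt to create real GPU placement groups.
    yield "mock-model"  # placeholder for the fixture's real return value
```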
…t groups

- Fix broad initial approach that was breaking some tests
- More targeted approach (explicitly declare test resources in config)
- Revert core placement_bundles logic to preserve production behavior
- Add resources_per_bundle: {"CPU": 1} to the test fixture for CPU-only testing
- Fixes test_build_openai_app_with_config deployment timeouts without breaking existing tests

Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
Force-pushed from b032aef to 2f67dc1
eicherseiji reviewed on Jun 26, 2025
```python
@pytest.fixture
def disable_placement_bundles():
```
Maybe in a future change: to avoid adding `disable_placement_bundles` to each test, could we instead
- create a new `conftest.py` under `cpu`,
- move this fixture declaration there,
- and use `@pytest.fixture(autouse=True)`?
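A minimal sketch of this suggestion, assuming the fixture body introduced in the PR (patching `VLLMEngineConfig.placement_bundles` to `[]`); the import path follows the file reviewed above.

```python
# cpu/conftest.py (sketch)
from unittest.mock import PropertyMock, patch

import pytest

from ray.llm._internal.serve.deployments.llm.vllm.vllm_models import VLLMEngineConfig


@pytest.fixture(autouse=True)
def disable_placement_bundles():
    # autouse=True applies the patch to every test collected under cpu/,
    # so individual tests no longer need to request the fixture explicitly.
    with patch.object(
        VLLMEngineConfig, "placement_bundles", new_callable=PropertyMock
    ) as mock_bundles:
        mock_bundles.return_value = []
        yield
```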
eicherseiji approved these changes on Jun 26, 2025
kouroshHakha approved these changes on Jun 27, 2025
minerharry pushed a commit to minerharry/ray that referenced this pull request on Jun 27, 2025:
…ect#53723)

Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
Signed-off-by: Nikhil G <nrghosh@users.noreply.github.com>
Fixing Test Code Leakage in LLM Placement Bundle Logic
Problem
Test Code Leakage: Production code had test-specific logic that didn't follow clean separation of concerns:
- `accelerator_type: A10G` in YAML configs causes `placement_bundles` to return `[{"GPU": 1, "accelerator_type:A10G": 0.001}]`
- A `mock_resource: 0` check was embedded in production code to bypass this and set `bundles = []`
Solution
Clean Test/Production Separation: implemented proper test fixtures that isolate test behavior without modifying production code.

Test Fixture Approach (`conftest.py`):
- Added a `disable_placement_bundles` fixture using `unittest.mock.patch`
- Patches the `VLLMEngineConfig.placement_bundles` property to return `[]` during tests

Subprocess Test Handling:
- Tests that spawn subprocesses (`serve run`) declare `resources_per_bundle: {"CPU": 1}` for CPU-only testing scenarios

Production Code Cleanup:
- Removed the `mock_resource` checks from `server_models.py`
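A hedged sketch of the fixture approach described above, assuming a `PropertyMock`-based patch (this section only names `unittest.mock.patch` and the property); the test name and body are illustrative.

```python
# conftest.py (sketch)
from unittest.mock import PropertyMock, patch

import pytest

from ray.llm._internal.serve.deployments.llm.vllm.vllm_models import VLLMEngineConfig


@pytest.fixture
def disable_placement_bundles():
    # Patch the placement_bundles property so no GPU bundles are requested
    # while a test runs; production code is left untouched.
    with patch.object(
        VLLMEngineConfig, "placement_bundles", new_callable=PropertyMock
    ) as mock_bundles:
        mock_bundles.return_value = []
        yield


def test_mock_engine_runs_on_cpu(disable_placement_bundles):
    # Tests opt in by requesting the fixture; any VLLMEngineConfig created here
    # reports no bundles, so a mock-engine deployment fits on a CPU-only node.
    ...
```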
Test Coverage
Fixed Test Categories:
- Subprocess tests launched via `serve run` (proper resource declarations)

Key Tests Fixed:
- `test_build_openai_app_with_config` (CPU subprocess test)
- `test_lora_unavailable_base_model` (CPU mock test)
- `test_models` (GPU integration test with mock engine)

Architecture
Before (Test Leakage): production `placement_bundles` logic carried a test-only `mock_resource` bypass.
After (Clean Separation): production logic is unchanged; tests patch `VLLMEngineConfig.placement_bundles` through the `disable_placement_bundles` fixture.
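An illustrative before/after sketch of the separation described here; the classes are simplified stand-ins for the real config objects, and only `placement_bundles`, `mock_resource`, and the A10G bundle shape come from this PR.

```python
from typing import Dict, List, Optional


class BeforeEngineConfig:
    """Before: a test-only mock_resource bypass lives inside production logic."""

    def __init__(self, accelerator_type: Optional[str] = None,
                 mock_resource: Optional[int] = None):
        self.accelerator_type = accelerator_type
        self.mock_resource = mock_resource

    @property
    def placement_bundles(self) -> List[Dict[str, float]]:
        if self.mock_resource == 0:
            return []  # test escape hatch embedded in production code
        bundle: Dict[str, float] = {"GPU": 1}
        if self.accelerator_type:
            # accelerator_type: A10G -> {"GPU": 1, "accelerator_type:A10G": 0.001}
            bundle[f"accelerator_type:{self.accelerator_type}"] = 0.001
        return [bundle]


class AfterEngineConfig:
    """After: no test-only branches; tests patch placement_bundles instead."""

    def __init__(self, accelerator_type: Optional[str] = None):
        self.accelerator_type = accelerator_type

    @property
    def placement_bundles(self) -> List[Dict[str, float]]:
        bundle: Dict[str, float] = {"GPU": 1}
        if self.accelerator_type:
            bundle[f"accelerator_type:{self.accelerator_type}"] = 0.001
        return [bundle]
```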
Approach Evolution
Earlier broad monkey-patching broke existing tests, which led to a more refined approach:
- Targeted resource declarations (`resources_per_bundle: {"CPU": 1}`) for CPU-only scenarios
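A hedged sketch of the targeted-resource approach for subprocess tests: the config handed to `serve run` declares CPU-only bundles explicitly instead of relying on an in-process patch. `resources_per_bundle: {"CPU": 1}` and `accelerator_type: None` are the values quoted in this PR; the surrounding config structure is illustrative only.

```python
import yaml

# Illustrative LLM config for a CPU-only subprocess test.
llm_config = {
    "model_loading_config": {"model_id": "test-model"},  # illustrative entry
    "accelerator_type": None,             # no GPU accelerator requested
    "resources_per_bundle": {"CPU": 1},   # CPU-only placement bundle for tests
}

with open("test_llm_config.yaml", "w") as f:
    yaml.safe_dump(llm_config, f)

# The subprocess test can then launch the app without needing a GPU, e.g.:
#   serve run <app that loads test_llm_config.yaml>
```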