Support resizing placement groups #16403
Open
@clay4megtr

Describe your feature request

Hello, I would like to add a resizing feature for placement groups, as we discussed before. The API would look like this:

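import pytest
import ray
# Assumed import path for the test helper; it has moved between Ray versions.
from ray._private.test_utils import wait_for_condition

MB = 1024 * 1024  # Assumed definition; not given in the original snippet.


# `ray_start_cluster` is the pytest fixture from Ray's own test suite.
def test_placement_group_add_bundles_api_basic(ray_start_cluster):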
    @ray.remote(num_cpus=2)
    class Actor(object):
        def __init__(self):
            self.n = 0

        def value(self):
            return self.n

    cluster = ray_start_cluster
    cluster.add_node(num_cpus=4)
    ray.init(address=cluster.address)

    # Create an infeasible placement group first.
    infeasible_placement_group = ray.util.placement_group(
        name="name", strategy="PACK", bundles=[{
            "CPU": 8
        }])
    assert not infeasible_placement_group.wait(4)
    # Make sure the add_bundles request fails while the placement group is still pending.
    with pytest.raises(
            ray.exceptions.RaySystemError,
            match="the placement group is in scheduling now"):
        infeasible_placement_group.add_bundles([{"CPU": 2, "memory": 50 * MB}])

    # Remove the infeasible placement group.
    ray.util.remove_placement_group(infeasible_placement_group)

    def is_placement_group_removed():
        table = ray.util.placement_group_table(infeasible_placement_group)
        if "state" not in table:
            return False
        return table["state"] == "REMOVED"

    wait_for_condition(is_placement_group_removed)

    # Create a feasible placement group now.
    placement_group = ray.util.placement_group(
        name="name", strategy="PACK", bundles=[{
            "CPU": 2
        }])

    # Wait for the placement group to create successfully.
    assert placement_group.wait(5)

    placement_group.add_bundles([{"CPU": 2}])
    table = ray.util.placement_group_table(placement_group)
    assert len(list(table["bundles"].values())) == 2
    assert table["state"] == "CREATED"

    # Wait for the add new bundles operation to finish.
    assert placement_group.wait(5)

    # Schedule an actor through the new bundle index.
    actor = Actor.options(
        placement_group=placement_group,
        placement_group_bundle_index=1).remote()

    ray.get(actor.value.remote())
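
To make the expected semantics easier to discuss, here is a toy, pure-Python model of the state rules that the test above asserts. It is not Ray code and not the proposed implementation: the class `ToyPlacementGroup`, its `mark_created` method, the use of `RuntimeError`, and the returned list of new bundle indices are all hypothetical. Only the state names and the error message are taken from the test.

```python
class ToyPlacementGroup:
    """Toy model of the add_bundles rules asserted in the test (not Ray code)."""

    def __init__(self, bundles):
        self.bundles = list(bundles)
        self.state = "PENDING"  # Not scheduled yet.

    def mark_created(self):
        # Hypothetical stand-in for the scheduler placing every bundle.
        self.state = "CREATED"

    def add_bundles(self, new_bundles):
        if self.state != "CREATED":
            # Same message the test expects to see in RaySystemError.
            raise RuntimeError("the placement group is in scheduling now")
        first_new_index = len(self.bundles)
        self.bundles.extend(new_bundles)
        # Hypothetical convenience: report the indices of the added bundles.
        return list(range(first_new_index, len(self.bundles)))


pg = ToyPlacementGroup([{"CPU": 2}])
try:
    pg.add_bundles([{"CPU": 2}])  # Rejected: the group is still pending.
except RuntimeError as err:
    print("rejected:", err)

pg.mark_created()
new_indices = pg.add_bundles([{"CPU": 2}])
print(new_indices, len(pg.bundles), pg.state)
```

The new bundle gets index 1 here, which matches the `placement_group_bundle_index=1` used to schedule the actor in the test.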

BTW, you can also replace the wait(timeout) API with the ready() API if you want.
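
A minimal sketch of that substitution, for anyone comparing the two styles. `PlacementGroup.ready()` and `ray.get(..., timeout=...)` are existing Ray APIs. The helper name `wait_via_ready` and its `get` parameter are hypothetical, and the sketch assumes that the timeout error raised by `ray.get` derives from the built-in `TimeoutError`.

```python
def wait_via_ready(pg, timeout_seconds, get=None):
    """Return True if `pg` becomes ready within `timeout_seconds`, else False.

    Hypothetical helper: mirrors the boolean result of `pg.wait(timeout_seconds)`
    by blocking on the ObjectRef returned by `pg.ready()`.
    """
    if get is None:
        # Imported lazily so the helper can be exercised without a Ray cluster.
        import ray
        get = ray.get
    try:
        get(pg.ready(), timeout=timeout_seconds)
    except TimeoutError:
        # Assumption: Ray's get-timeout error subclasses the built-in TimeoutError.
        return False
    return True
```

With this helper, `assert placement_group.wait(5)` in the test above would become `assert wait_via_ready(placement_group, 5)`.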

Also, the related proposal document is here: API proposal

Please leave a comment if you have any requirements or questions. Thanks!

Labels: P2 (important issue, but not time-critical), core-placement-group, enhancement (request for new feature and/or capability), pending-cleanup (this issue is pending cleanup; it will be removed in 2 weeks after being assigned)
