[Jobs] Include requested and available resources in JobInfo status message · Issue #29921 · ray-project/ray
Open
@architkulkarni

Description

Currently, when a job is PENDING, the JobStatus message gives no detail beyond "the job may be waiting for resources" (see https://github.com/ray-project/ray/pull/28654/files/7ccabf2b606f325f1d4e793421e5651e86a9885e#r1010817973).

Ideally, the status would include both the resources the job requested and the resources currently available to it, similar to how this is achieved for Ray Serve replicas here:

f"resources available: {available}."
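As a rough illustration of what such a message could look like, the sketch below formats requested and available resources into a single string. The helper name `format_pending_message` and the dict shapes are hypothetical and not part of Ray's job API; in a live cluster the available side could plausibly come from `ray.available_resources()`.

```python
from typing import Dict


def format_pending_message(
    requested: Dict[str, float], available: Dict[str, float]
) -> str:
    """Build a PENDING status message that names the resource shortfall.

    Hypothetical helper for illustration; not an existing Ray function.
    """
    # Resources the cluster cannot currently satisfy.
    short = {
        name: amount
        for name, amount in requested.items()
        if available.get(name, 0.0) < amount
    }
    message = (
        "Job is PENDING. "
        f"Resources requested: {requested}. "
        f"Resources available: {available}."
    )
    if short:
        message += f" Waiting on: {sorted(short)}."
    return message


if __name__ == "__main__":
    # In a live cluster, `available` might be ray.available_resources().
    print(format_pending_message({"CPU": 4, "GPU": 1}, {"CPU": 8.0}))
```

Listing the unmet resources separately is one possible design choice; it would let a user see at a glance which request is blocking the job.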

We might need some kind of wrapper for the JobSupervisor actor that periodically submits updates to the JobStatus, because the available resources are expected to change over time (for example, as the cluster scales up and adds nodes).
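One way the periodic update could be structured is a small polling loop that re-publishes the message while the job stays PENDING. This is only a sketch under assumptions: `refresh_pending_status` and all of its callable parameters are hypothetical stand-ins for the cluster query and the job-status write path, and none of them are existing Ray APIs.

```python
import asyncio
from typing import Callable, Dict


async def refresh_pending_status(
    get_available: Callable[[], Dict[str, float]],
    put_status: Callable[[str], None],
    requested: Dict[str, float],
    is_pending: Callable[[], bool],
    interval_s: float = 5.0,
) -> int:
    """Re-publish the PENDING message until the job leaves PENDING.

    Returns the number of updates published. All callables are
    hypothetical placeholders for real cluster and job-status plumbing.
    """
    updates = 0
    while is_pending():
        # Re-read availability each round, since it changes as nodes join.
        available = get_available()
        put_status(
            f"Resources requested: {requested}. "
            f"Resources available: {available}."
        )
        updates += 1
        await asyncio.sleep(interval_s)
    return updates


if __name__ == "__main__":
    # Simulated cluster that scales up between polls.
    snapshots = [{"CPU": 2.0}, {"CPU": 8.0}]
    messages = []
    asyncio.run(
        refresh_pending_status(
            get_available=lambda: snapshots[len(messages)],
            put_status=messages.append,
            requested={"CPU": 4},
            is_pending=lambda: len(messages) < len(snapshots),
            interval_s=0.1,
        )
    )
    for line in messages:
        print(line)
```

Passing the cluster query and the status writer in as callables keeps the loop independent of how the JobSupervisor is actually wrapped, which the issue leaves open.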

Use case

When a job with num_cpus or num_gpus specified is PENDING for a while, there's no way for the user to find out exactly why it's pending without looking at the logs or checking ray status for the details about the internal JobSupervisor actor. This information should ideally be available in the JobStatus message for the job itself.

Metadata

Assignees: No one assigned

Labels: P2 (Important issue, but not time-critical), enhancement (Request for new feature and/or capability), pending-cleanup (This issue is pending cleanup. It will be removed in 2 weeks after being assigned.)
