Ray core: incorrect account of GPUs on ec2 ubuntu instance: g4dn.2xlarge · Issue #29420 · ray-project/ray
Ray core: incorrect account of GPUs on ec2 ubuntu instance: g4dn.2xlarge #29420
Open
@shahsmit1

Description

What happened + What you expected to happen

ray.cluster_resources() reports 2 GPUs as Ray cluster resources on an EC2 instance (g4dn.2xlarge) that has exactly 1 GPU. This is a Docker-based setup with 2 containers, ray-head and ray-worker, both running on that instance. The cluster is booted manually using ray_scripts.start.

Versions / Dependencies

OS: amazon/Deep Learning AMI (Ubuntu 18.04) Version 56.1
Ray: 1.13.0
[Screenshot: Screen Shot 2022-10-14 at 10 44 56 AM]

Reproduction script

Current output:

>>> ray.cluster_resources()
{'GPU': 2.0, 'node:172.18.0.9': 1.0, 'object_store_memory': 19601575526.0, 'memory': 42487119054.0, 'accelerator_type:T4': 2.0, 'CPU': 8.0, 'allows_expensive': 4.0, 'node:172.18.0.5': 1.0}

Expected output:

>>> ray.cluster_resources()
{'GPU': 1.0, 'node:172.18.0.9': 1.0, 'object_store_memory': 19601575526.0, 'memory': 42487119054.0, 'accelerator_type:T4': 2.0, 'CPU': 8.0, 'allows_expensive': 4.0, 'node:172.18.0.5': 1.0}
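
To narrow down where the extra GPU comes from, it may help to look at what each node reports rather than the cluster-wide total. The sketch below is a minimal, hypothetical helper (`gpus_per_node` is not part of Ray) that takes a list of node dictionaries shaped like the output of `ray.nodes()` and returns the GPU count per node address. The sample data is made up for illustration, and it assumes (not confirmed in this issue) that each container detects the same physical GPU and registers it separately.

```python
def gpus_per_node(nodes):
    """Map each alive node's address to the GPU count it reports.

    `nodes` is expected to look like the list returned by ray.nodes():
    dicts with "Alive", "NodeManagerAddress" and "Resources" keys.
    """
    counts = {}
    for node in nodes:
        if not node.get("Alive", False):
            continue  # skip dead nodes, they no longer contribute resources
        address = node.get("NodeManagerAddress", "unknown")
        counts[address] = node.get("Resources", {}).get("GPU", 0.0)
    return counts


# Hypothetical sample mimicking two containers on one host (assumed values).
sample_nodes = [
    {"NodeManagerAddress": "172.18.0.9", "Alive": True,
     "Resources": {"GPU": 1.0, "CPU": 4.0}},
    {"NodeManagerAddress": "172.18.0.5", "Alive": True,
     "Resources": {"GPU": 1.0, "CPU": 4.0}},
]

per_node = gpus_per_node(sample_nodes)
total = sum(per_node.values())
print(per_node, total)
```

On a live cluster the same helper could be called as `gpus_per_node(ray.nodes())` after `ray.init(address="auto")`. If both nodes report one GPU each, a possible workaround is to start one of the containers with an explicit GPU count (for example `--num-gpus=0` on `ray start`), so the physical GPU is only registered once.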

Issue Severity

Medium: It is a significant difficulty but I can work around it.

Metadata

Assignees

No one assigned

    Labels

    P2: Important issue, but not time-critical
    bug: Something that is supposed to be working; but isn't
    core: Issues that should be addressed in Ray Core
    core-api
    pending-cleanup: This issue is pending cleanup. It will be removed in 2 weeks after being assigned.
