Allow uneven shards for multi-GPU inference in vllm backend by KiddoZhu · Pull Request #494 · NVIDIA-NeMo/RL · GitHub

Allow uneven shards for multi-GPU inference in vllm backend #494


Merged 2 commits on Jun 27, 2025

Conversation

@KiddoZhu (Contributor) commented Jun 9, 2025

The current implementation of `generate_text` in the vLLM backend requires every batch to be sharded evenly across the GPUs. This prevents running evaluation on a dataset whose size is not divisible by the number of GPUs.
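
To illustrate the idea of uneven sharding, here is a minimal sketch. It is not the code from this PR: `shard_uneven` and `gather` are hypothetical helper names, and the policy shown (the first `len(batch) % num_shards` shards receive one extra item) is only one reasonable assumption about how a batch could be split when its size is not divisible by the number of GPUs.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def shard_uneven(batch: Sequence[T], num_shards: int) -> List[List[T]]:
    """Split `batch` into `num_shards` contiguous shards whose sizes differ by at most one.

    Hypothetical helper for illustration only; not part of NeMo RL or vLLM.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    base, extra = divmod(len(batch), num_shards)
    shards: List[List[T]] = []
    start = 0
    for rank in range(num_shards):
        # The first `extra` ranks take one additional item each.
        size = base + (1 if rank < extra else 0)
        shards.append(list(batch[start:start + size]))
        start += size
    return shards


def gather(shards: Sequence[Sequence[T]]) -> List[T]:
    """Concatenate per-shard results back into the original order."""
    return [item for shard in shards for item in shard]


if __name__ == "__main__":
    prompts = [f"prompt-{i}" for i in range(10)]
    shards = shard_uneven(prompts, 4)
    print([len(s) for s in shards])  # [3, 3, 2, 2]
```

With this policy a shard may be empty when there are fewer items than GPUs, so a worker in such a setup would need to handle an empty input.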

Signed-off-by: KiddoZhu <zhaochengz@nvidia.com>
@KiddoZhu requested a review from terrykong June 9, 2025 23:39
@SahilJain314 (Contributor) previously approved these changes Jun 26, 2025 and left a comment:

lgtm, thanks

Signed-off-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
@SahilJain314 enabled auto-merge June 26, 2025 22:08
@SahilJain314 added this pull request to the merge queue Jun 26, 2025
@yuki-666 linked an issue Jun 27, 2025 that may be closed by this pull request
Merged via the queue into main with commit 62dbd9f Jun 27, 2025
11 of 14 checks passed
@SahilJain314 deleted the zhaochengz/multi-gpu_eval branch June 27, 2025 13:32
xxman-google pushed a commit to xxman-google/NeMo-RL that referenced this pull request Jun 27, 2025
…eMo#494)

Signed-off-by: KiddoZhu <zhaochengz@nvidia.com>
Signed-off-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

Support un-even dispatches
2 participants