fix: Mcore: remove explicit refit buffer sizing and added functional grpo test by SahilJain314 · Pull Request #527 · NVIDIA-NeMo/RL · GitHub

fix: Mcore: remove explicit refit buffer sizing and added functional grpo test #527


Open
SahilJain314 wants to merge 9 commits into main

Conversation

SahilJain314
Contributor

Bring Mcore training up to parity with the DTensor path: no explicit refit buffer size is required.

Signed-off-by: Sahil Jain <sahilj@nvidia.com>
Signed-off-by: Sahil Jain <sahilj@nvidia.com>
Signed-off-by: Sahil Jain <sahilj@nvidia.com>
Signed-off-by: Sahil Jain <sahilj@nvidia.com>
Signed-off-by: Sahil Jain <sahilj@nvidia.com>
Signed-off-by: Sahil Jain <sahilj@nvidia.com>
github-actions bot added the CI (Relating to CI) label Jun 17, 2025
SahilJain314 changed the title from "fix: Remove explicit refit buffer sizing for megatron<-> vllm refit" to "fix: Remove explicit refit buffer sizing for megatron<-> vllm refit and added functional MCore grpo test" Jun 17, 2025
SahilJain314 changed the title from "fix: Remove explicit refit buffer sizing for megatron<-> vllm refit and added functional MCore grpo test" to "fix: Mcore: remove explicit refit buffer sizing and added functional grpo test" Jun 17, 2025
SahilJain314 requested a review from parthchadha June 17, 2025 23:25
Signed-off-by: Sahil Jain <sahilj@nvidia.com>
github-actions bot added the documentation (Improvements or additions to documentation) label Jun 17, 2025
Signed-off-by: Sahil Jain <sahilj@nvidia.com>
SahilJain314 requested a review from parthchadha June 17, 2025 23:45
parthchadha previously approved these changes Jun 17, 2025
Base automatically changed from sahilj/mypy2 to main June 18, 2025 08:21
terrykong dismissed parthchadha’s stale review June 18, 2025 08:21 (reason: the base branch was changed)

Signed-off-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
SahilJain314 added and removed the CI:L1 (Run doctests, unit tests, and functional tests) label Jun 26, 2025
SahilJain314 requested a review from parthchadha June 26, 2025 22:46
Labels
CI (Relating to CI), documentation (Improvements or additions to documentation)
Projects
None yet

2 participants