perf: update sft and dpo recipes to use bf16 by ashors1 · Pull Request #302 · NVIDIA-NeMo/RL · GitHub

perf: update sft and dpo recipes to use bf16 #302


Merged: 2 commits, May 5, 2025

Conversation

ashors1 (Contributor) commented May 1, 2025:

What does this PR do?

SFT FP32 vs BF16: [image: bf16_sft]

DPO FP32 vs BF16: [image: bf16_dpo]

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Signed-off-by: ashors1 <ashors@nvidia.com>
@ashors1 ashors1 requested review from yfw and terrykong May 1, 2025 16:47
yfw previously approved these changes May 1, 2025
terrykong (Contributor) commented:

@ashors1 could you share the convergence curves, just for posterity?

ashors1 (Contributor, Author) commented May 5, 2025:

> @ashors1 could you share the convergence curves just for posterity

Updated the PR description.

@terrykong terrykong added this pull request to the merge queue May 5, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks May 5, 2025
terrykong (Contributor) commented:

@ashors1 there was a test failure: https://github.com/NVIDIA/NeMo-RL/actions/runs/14841614037/job/41665718613

The functional SFT test fails because the numerics have changed.

ashors1 (Contributor, Author) commented May 5, 2025:

> the functional sft test fails b/c the numerics have changed

Yep, looking into it now.

Signed-off-by: ashors1 <ashors@nvidia.com>
ashors1 (Contributor, Author) commented May 5, 2025:

@terrykong I updated the test and manually verified that the increased loss is not a concern: over time, the curves are comparable.
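
To illustrate why moving from FP32 to BF16 can shift a functional test's expected loss, here is a minimal sketch that rounds a value to bfloat16 precision using only the Python standard library. It is not taken from the PR or from NeMo RL: the helper `to_bf16` and the sample loss value are hypothetical, and the sketch assumes finite inputs. bfloat16 keeps the FP32 exponent range but only 8 bits of significand precision, so a single rounding can introduce a relative error of up to roughly 2^-8.

```python
import struct


def to_bf16(x: float) -> float:
    """Round a finite float to the nearest bfloat16 value (hypothetical helper)."""
    # Reinterpret the value as a 32-bit IEEE 754 single-precision bit pattern.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # bfloat16 keeps the top 16 bits; round to nearest, ties to even.
    bits = (bits + 0x7FFF + ((bits >> 16) & 1)) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


fp32_loss = 2.3456  # hypothetical loss value, for illustration only
bf16_loss = to_bf16(fp32_loss)
rel_err = abs(bf16_loss - fp32_loss) / fp32_loss
print(f"fp32={fp32_loss} bf16={bf16_loss} rel_err={rel_err:.2e}")
```

Differences of this size, accumulated across many operations, are enough to move a test's reference values away from their FP32 baseline, which is consistent with the thread's observation that the test needed updating even though the curves stay comparable over time.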

@terrykong terrykong added this pull request to the merge queue May 5, 2025
Merged via the queue into main with commit 3f0c7d1 May 5, 2025
13 checks passed
@terrykong terrykong deleted the ashors/bf16-default branch May 5, 2025 22:33
YzjiaoNvd pushed a commit to YzjiaoNvd/NeMo-RL that referenced this pull request Jun 10, 2025
Signed-off-by: ashors1 <ashors@nvidia.com>
3 participants