docs: Correcting file names by aschilling-nv · Pull Request #161 · NVIDIA/NeMo-RL · GitHub

docs: Correcting file names #161

Merged
merged 5 commits into from
Apr 15, 2025

Conversation

aschilling-nv
Collaborator

What does this PR do?

This PR renames files and folders whose names contain underscores (_) to use hyphens (-) instead, as part of SEO best practices.
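
A rename like this could be scripted roughly as follows. This is a minimal sketch, not the method used in this PR (the PR does not say how the renames were performed). The function name `hyphenate_names` and the sample paths are hypothetical. A real change would also need to update any links that point at the old paths, and would likely go through `git mv` so that history is preserved.

```python
import os
import tempfile
from pathlib import Path


def hyphenate_names(root: Path, dry_run: bool = True) -> list[tuple[Path, Path]]:
    """Find files and folders under `root` whose names contain `_`.

    Returns the (old, new) path pairs. With dry_run=True nothing is renamed;
    with dry_run=False each entry is renamed in place.
    Hypothetical helper: it does not check for name collisions.
    """
    renames = []
    # Walk bottom-up so children are renamed before their parent folders,
    # which keeps every `dirpath` valid at the moment it is used.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in dirnames + filenames:
            if "_" not in name:
                continue
            old = Path(dirpath) / name
            new = Path(dirpath) / name.replace("_", "-")
            renames.append((old, new))
            if not dry_run:
                old.rename(new)
    return renames


if __name__ == "__main__":
    # Demo on a throwaway tree; the file names here are made up.
    with tempfile.TemporaryDirectory() as tmp:
        docs = Path(tmp)
        (docs / "design_docs").mkdir()
        (docs / "design_docs" / "padding_guide.md").write_text("placeholder")
        for old, new in hyphenate_names(docs, dry_run=True):
            print(old.relative_to(docs), "->", new.relative_to(docs))
```

Running with `dry_run=True` first gives a list of planned renames to review before anything on disk changes.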

Issues

Discussed in Slack.

Usage

Doc structure changes.

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

  • ...

@aschilling-nv aschilling-nv requested a review from terrykong April 10, 2025 17:53
@aschilling-nv aschilling-nv self-assigned this Apr 10, 2025
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Apr 10, 2025
@aschilling-nv aschilling-nv changed the title Correcting file names docs: Correcting file names Apr 10, 2025
@aschilling-nv aschilling-nv force-pushed the afs/file-names branch 2 times, most recently from 363a073 to a36ebe9 Compare April 10, 2025 19:30
@terrykong terrykong enabled auto-merge April 11, 2025 17:45
@terrykong terrykong added the Run CICD Set to run CI (unset + set to rerun) label Apr 11, 2025
terrykong
terrykong previously approved these changes Apr 14, 2025
@terrykong
Collaborator

CI fails because this is a docs-only change; it needs this change: #181

@terrykong terrykong added Run CICD Set to run CI (unset + set to rerun) and removed Run CICD Set to run CI (unset + set to rerun) labels Apr 15, 2025
@terrykong terrykong added Run CICD Set to run CI (unset + set to rerun) and removed Run CICD Set to run CI (unset + set to rerun) labels Apr 15, 2025
Signed-off-by: Andrew Schilling <aschilling@nvidia.com>
Signed-off-by: Andrew Schilling <aschilling@nvidia.com>
Signed-off-by: Andrew Schilling <aschilling@nvidia.com>
Signed-off-by: Andrew Schilling <aschilling@nvidia.com>
Signed-off-by: Andrew Schilling <aschilling@nvidia.com>
@aschilling-nv aschilling-nv dismissed SahilJain314’s stale review April 15, 2025 17:08

Reverted my text changes to allow the technical writer to make the required changes.

@terrykong terrykong added Run CICD Set to run CI (unset + set to rerun) and removed Run CICD Set to run CI (unset + set to rerun) labels Apr 15, 2025
@terrykong terrykong added this pull request to the merge queue Apr 15, 2025
Merged via the queue into main with commit 8213014 Apr 15, 2025
12 checks passed
@terrykong terrykong deleted the afs/file-names branch April 15, 2025 20:40
terrykong added a commit that referenced this pull request May 1, 2025
commit ebb46c3
Author: Anna Shors <ashors@nvidia.com>
Date:   Wed Apr 30 15:03:46 2025 -0700

    fix: fix dtype of empty `token_ids` for consistency (#290)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit cf8f045
Author: Anna Shors <ashors@nvidia.com>
Date:   Wed Apr 30 15:03:19 2025 -0700

    chore: Remove outdated comment in DPO config (#293)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit 04f30bb
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Wed Apr 30 12:19:47 2025 -0700

    fix: Fixed max seqlen not respected correctly (#299)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>

commit daac5d9
Author: Anna Shors <ashors@nvidia.com>
Date:   Tue Apr 29 17:30:05 2025 -0700

    chore: Remove online hf checkpointing (#285)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit 3cd8be8
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Tue Apr 29 15:18:37 2025 -0700

    feat: Remove 'last 100' hack for math verifier (#287)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>
    Signed-off-by: Terry Kong <terryk@nvidia.com>
    Co-authored-by: Terry Kong <terryk@nvidia.com>

commit 506910a
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Tue Apr 29 11:29:22 2025 -0700

    test: add a test that checks if recipes can be merged into the base config (#288)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit af43261
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Tue Apr 29 09:18:14 2025 -0700

    chore: add isort rules and pyflakes in ruff/precommit (#291)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 8b0837c
Author: yuki <48991475+yuki-666@users.noreply.github.com>
Date:   Tue Apr 29 23:57:41 2025 +0800

    ci: add eval functional test (#269)

    Signed-off-by: Yuki Huang <yukih@nvidia.com>

commit 68beb6d
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Mon Apr 28 23:35:01 2025 -0700

    feat: rename ratio_eps_{min/max} to ratio_clip_{min/max} for clarity (#283)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>

commit 2f5d22f
Author: Hemil Desai <hemild@nvidia.com>
Date:   Mon Apr 28 16:09:00 2025 -0700

    feat: Add hydra style overrides to SFT (#208)

    Signed-off-by: Hemil Desai <hemild@nvidia.com>
    Signed-off-by: ashors1 <ashors@nvidia.com>
    Co-authored-by: ashors1 <ashors@nvidia.com>

commit 8a22c44
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Mon Apr 28 15:11:03 2025 -0700

    feat: publish convergence/release runs (#214)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit af94d43
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Mon Apr 28 15:02:19 2025 -0700

    fix: fixes #264 where tied weights check didn't work on fsdp1 (#284)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>
    Signed-off-by: Parth Chadha <parth29@gmail.com>
    Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>

commit 1363dba
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Mon Apr 28 12:44:56 2025 -0700

    fix: improve port selection and exiting early from ray.sub (#272)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 044f385
Author: Andrew Schilling <85314306+aschilling-nv@users.noreply.github.com>
Date:   Mon Apr 28 14:22:55 2025 -0500

    docs: Correcting build issues and CI (#270)

    Signed-off-by: Andrew Schilling <aschilling@nvidia.com>

commit 0fae6bc
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Mon Apr 28 11:08:51 2025 -0700

    feat: Updated Name to NeMo RL (#265)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>

commit 34cae3a
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Mon Apr 28 08:16:51 2025 -0700

    fix: add bibtex entry (#273)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>

commit ee0d2c8
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Sat Apr 26 20:15:38 2025 -0700

    docs: instruct users to git clone before beginning (#257)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 09f5416
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Fri Apr 25 13:46:41 2025 -0700

    feat: E2E multi-turn RL example with a sliding puzzle game (#242)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>
    Signed-off-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>

commit 47e51d3
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Fri Apr 25 10:13:59 2025 -0700

    chore: better logging when insufficient resources (#271)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 98473c6
Author: Anna Shors <ashors@nvidia.com>
Date:   Thu Apr 24 22:28:05 2025 -0700

    fix: Update DPO and SFT configs to use dtensor (#256)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit 2558444
Author: Anna Shors <ashors@nvidia.com>
Date:   Thu Apr 24 11:02:26 2025 -0700

    fix: Fix fsdp1 grad clipping and log grad norm (#251)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit c8f0a01
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Wed Apr 23 17:58:43 2025 -0700

    docs: add qwen 32b instruction and add 0.3 planned features (#255)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 0a5f31d
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Wed Apr 23 17:49:06 2025 -0400

    fix: fix broken eval script (#253)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>

commit 2f8a140
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Wed Apr 23 12:47:18 2025 -0700

    ci: L1 default and increase test time (#252)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 1c7cbd9
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Wed Apr 23 12:52:13 2025 -0400

    fix: use find_tied_parameters api from HF for tied weight keys (#250)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>

commit 1788e4c
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Tue Apr 22 22:05:53 2025 -0400

    fix: raise error if tied weights model is being trained with fsdp1 or… (#229)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>
    Signed-off-by: Sahil Jain <sahilj@nvidia.com>
    Signed-off-by: Terry Kong <terryk@nvidia.com>
    Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>
    Signed-off-by: Nathan McKimpson <nmckimpson@nvidia.com>
    Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
    Co-authored-by: Terry Kong <terrycurtiskong@gmail.com>
    Co-authored-by: Yi-Fu Wu <yifu.wu@gmail.com>
    Co-authored-by: mckimn <nmckimpson@nvidia.com>

commit 1fa4c7a
Author: Anna Shors <ashors@nvidia.com>
Date:   Tue Apr 22 16:38:50 2025 -0700

    fix: Fix indent in dtensor policy (#248)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit ed546ae
Author: yuki <48991475+yuki-666@users.noreply.github.com>
Date:   Wed Apr 23 07:29:47 2025 +0800

    feat: streaming each dtensor in refit (#176)

    Signed-off-by: Yuki Huang <yukih@nvidia.com>
    Signed-off-by: Alex Qiu <alexq@nvidia.com>
    Signed-off-by: Parth Chadha <pchadha@nvidia.com>
    Co-authored-by: Alex Qiu <alexq@nvidia.com>
    Co-authored-by: Parth Chadha <pchadha@nvidia.com>

commit 5c62657
Author: Yi-Fu Wu <yifu.wu@gmail.com>
Date:   Tue Apr 22 14:14:40 2025 -0700

    feat: Importance sampling trick (#174)

    Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>
    Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>

commit deaece6
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Tue Apr 22 12:39:35 2025 -0700

    feat: Add support for multi-turn generations and RL (tools, games, etc) (#218)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>

commit 1245c50
Author: Anna Shors <ashors@nvidia.com>
Date:   Tue Apr 22 12:19:42 2025 -0700

    fix: Speed up DPO functional test (#241)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit af369a3
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Tue Apr 22 12:17:03 2025 -0700

    fix: Move ray worker port range start from 20001 to 53001 (#235)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 756152c
Author: Anna Shors <ashors@nvidia.com>
Date:   Tue Apr 22 10:02:34 2025 -0700

    feat: Support multi-epoch training in SFT (#177)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit bbdd671
Author: Anna Shors <ashors@nvidia.com>
Date:   Mon Apr 21 22:16:15 2025 -0700

    feat: DPO (#180)

    Signed-off-by: ashors1 <ashors@nvidia.com>
    Signed-off-by: Anna Shors <ashors@nvidia.com>
    Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>
    Co-authored-by: Yi-Fu Wu <yifu.wu@gmail.com>

commit 88bc0fd
Author: mckimn <nmckimpson@nvidia.com>
Date:   Mon Apr 21 17:31:23 2025 -0700

    ci: Remove external config from project (#200)

    Signed-off-by: Nathan McKimpson <nmckimpson@nvidia.com>

commit 4a2e126
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Mon Apr 21 17:34:59 2025 -0400

    fix: skip vllm p2p check since its flaky (#238)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>

commit 22af21c
Author: Yi-Fu Wu <yifu.wu@gmail.com>
Date:   Mon Apr 21 12:41:29 2025 -0700

    feat: FSDP2 SFT (#206)

    Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>

commit e36f488
Author: Yi-Fu Wu <yifu.wu@gmail.com>
Date:   Mon Apr 21 12:41:24 2025 -0700

    fix: Fix missing import (#222)

    Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>

commit 98b7a90
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Sun Apr 20 10:06:09 2025 -0700

    docs: update docs everywhere to remove uv pip install which isn't reliable (#217)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit da191b4
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Sun Apr 20 07:56:55 2025 -0700

    feat: introduce a debug API for backoff and retries for RayVirtualCluster (#234)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 8780093
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Fri Apr 18 17:06:54 2025 -0700

    feat: Add total logging of generations in training (#172)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>

commit ce2d121
Author: yuki <48991475+yuki-666@users.noreply.github.com>
Date:   Sat Apr 19 00:22:11 2025 +0800

    fix: fix chat_template in eval (#210)

    Signed-off-by: Yuki Huang <yukih@nvidia.com>

commit f8b6ba9
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Thu Apr 17 12:52:19 2025 -0700

    fix: grpo func test 10 step -> 3 step to speed up CI (#209)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 4a6f62b
Author: Gerald Shen <119401249+gshennvm@users.noreply.github.com>
Date:   Thu Apr 17 11:06:05 2025 -0700

    feat: Add FSDP2, DTensor SP/TP, activation checkpointing support (#131)

    Signed-off-by: Gerald Shen <geshen@nvidia.com>

commit 78a9834
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Thu Apr 17 10:03:34 2025 -0700

    fix: ci uses umask (#211)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 5ff10f6
Author: alexchiu <qiuzhaopeng@foxmail.com>
Date:   Thu Apr 17 08:38:45 2025 +0800

    fix: prevent division by zero in ClippedPGLossFn calculation (#166)

    Signed-off-by: Zhaopeng Qiu <alexq@nvidia.com>
    Signed-off-by: Alex Qiu <alexq@nvidia.com>
    Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>

commit 6db2f7a
Author: Yi-Fu Wu <yifu.wu@gmail.com>
Date:   Wed Apr 16 15:53:12 2025 -0700

    feat: Fix CPU offloading + add options for FSDP offload and activation ckpting (#123)

    Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>
    Co-authored-by: Parth Chadha <pchadha@nvidia.com>
    Co-authored-by: Terry Kong <terrycurtiskong@gmail.com>

commit 62ac8d2
Author: Charlie Truong <chtruong@nvidia.com>
Date:   Wed Apr 16 15:38:53 2025 -0500

    ci: Only include dependencies in test container (#203)

    Signed-off-by: Charlie Truong <chtruong@nvidia.com>
    Co-authored-by: Terry Kong <terrycurtiskong@gmail.com>

commit b00fcc8
Author: Anna Shors <ashors@nvidia.com>
Date:   Wed Apr 16 13:23:40 2025 -0700

    fix: chat template improvements (#148)

    Signed-off-by: ashors1 <ashors@nvidia.com>
    Signed-off-by: Parth Chadha <pchadha@nvidia.com>
    Signed-off-by: Terry Kong <terryk@nvidia.com>
    Signed-off-by: Sahil Jain <sahilj@nvidia.com>
    Co-authored-by: Parth Chadha <pchadha@nvidia.com>
    Co-authored-by: Terry Kong <terrycurtiskong@gmail.com>
    Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>

commit df31f50
Author: Charlie Truong <chtruong@nvidia.com>
Date:   Wed Apr 16 13:13:58 2025 -0500

    ci: Run tests only in merge queue or when labeled (#159)

    Signed-off-by: Charlie Truong <chtruong@nvidia.com>

commit e3af337
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Wed Apr 16 09:23:30 2025 -0700

    feat: Upgrade to vllm v1 runtime (#170)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>
    Signed-off-by: Charlie Truong <chtruong@nvidia.com>
    Signed-off-by: Yuki Huang <yukih@nvidia.com>
    Signed-off-by: ashors1 <ashors@nvidia.com>
    Signed-off-by: Anna Shors <ashors@nvidia.com>
    Signed-off-by: Terry Kong <terryk@nvidia.com>
    Signed-off-by: Sahil Jain <sahilj@nvidia.com>
    Co-authored-by: Charlie Truong <chtruong@nvidia.com>
    Co-authored-by: Terry Kong <terrycurtiskong@gmail.com>
    Co-authored-by: yuki <48991475+yuki-666@users.noreply.github.com>
    Co-authored-by: Anna Shors <ashors@nvidia.com>
    Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>

commit dd7c2d7
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Tue Apr 15 16:00:04 2025 -0700

    fix: unit test script halts on first failure (#189)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 92c3f1d
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Tue Apr 15 15:42:01 2025 -0700

    feat: add a unique seed for each vllm llm engine (#171)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>

commit 2ae8935
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Tue Apr 15 14:41:21 2025 -0700

    docs: remove backticks from uv.md title (#179)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 9ac4e62
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Tue Apr 15 12:37:35 2025 -0700

    fix: convert DCP to HF script works without ray cluster (#185)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 8213014
Author: Andrew Schilling <85314306+aschilling-nv@users.noreply.github.com>
Date:   Tue Apr 15 13:55:54 2025 -0500

    docs: Correcting file names (#161)

    Signed-off-by: Andrew Schilling <aschilling@nvidia.com>

commit 4db3167
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Tue Apr 15 11:07:51 2025 -0700

    fix: default to less verbose logging + uv-venv log once per worker  (#141)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit bda6522
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Mon Apr 14 22:31:56 2025 -0700

    docs: run tests with --group test to avoid missing test deps (#188)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit c1fc972
Author: mckimn <nmckimpson@nvidia.com>
Date:   Mon Apr 14 20:43:51 2025 -0700

    ci: Update to include public/ folder for pages deployment (#182)

    Signed-off-by: Nathan McKimpson <nmckimpson@nvidia.com>

commit e9812f1
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Mon Apr 14 20:05:46 2025 -0700

    fix: don't use cuda-graphs for vllm generation (#187)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>

commit d7d4cd6
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Mon Apr 14 15:46:13 2025 -0700

    ci: labels for docs/L0/L1/L2 and run even if only doc test (#181)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 0637511
Author: yuki <48991475+yuki-666@users.noreply.github.com>
Date:   Tue Apr 15 05:24:07 2025 +0800

    feat: support arbitrary end_strings (#96)

    Signed-off-by: Yuki Huang <yukih@nvidia.com>
    Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>

commit c99585c
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Mon Apr 14 14:44:43 2025 -0700

    fix: allow configuring ray ports in ray.sub in case conflict on cluster (#173)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit a5547f2
Author: mckimn <nmckimpson@nvidia.com>
Date:   Mon Apr 14 09:18:02 2025 -0700

    docs: Fix doc build warnings and add external CI config (#157)

    Signed-off-by: Nathan McKimpson <nmckimpson@nvidia.com>

commit 32953be
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Fri Apr 11 10:18:03 2025 -0700

    fix: always test vllm (#167)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>

commit c00b8bc
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Thu Apr 10 22:38:40 2025 -0700

    test: Add grpo/reinforce/ppo loss tests (prep for incoming vocab parallel changes) (#162)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>

Signed-off-by: Terry Kong <terryk@nvidia.com>
terrykong pushed a commit that referenced this pull request May 1, 2025
Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com>

test: Add grpo/reinforce/ppo loss tests (prep for incoming vocab parallel changes) (#162)

Signed-off-by: Sahil Jain <sahilj@nvidia.com>

Tech pubs updates to file

Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com>

fix typo

Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com>

Incorporated Reviewer Comments in ReadMe

Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com>

Tech Pubs updates to files

Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com>

Tech Pubs updates to files

Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com>

Tech Pubs updates to files

Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com>

Tech Pubs updates to files

Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com>

Tech Pubs updates to files

Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com>

Tech Pubs updates to file

Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com>

Tech pups updates to resolve some threads

Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com>

Tech pubs updates to resolve some threads

Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com>

Tech Pubs minor edits to files

Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com>

Squashed commit of the following:

commit ebb46c3
Author: Anna Shors <ashors@nvidia.com>
Date:   Wed Apr 30 15:03:46 2025 -0700

    fix: fix dtype of empty `token_ids` for consistency (#290)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit cf8f045
Author: Anna Shors <ashors@nvidia.com>
Date:   Wed Apr 30 15:03:19 2025 -0700

    chore: Remove outdated comment in DPO config (#293)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit 04f30bb
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Wed Apr 30 12:19:47 2025 -0700

    fix: Fixed max seqlen not respected correctly (#299)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>

commit daac5d9
Author: Anna Shors <ashors@nvidia.com>
Date:   Tue Apr 29 17:30:05 2025 -0700

    chore: Remove online hf checkpointing (#285)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit 3cd8be8
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Tue Apr 29 15:18:37 2025 -0700

    feat: Remove 'last 100' hack for math verifier (#287)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>
    Signed-off-by: Terry Kong <terryk@nvidia.com>
    Co-authored-by: Terry Kong <terryk@nvidia.com>

commit 506910a
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Tue Apr 29 11:29:22 2025 -0700

    test: add a test that checks if recipes can be merged into the base config (#288)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit af43261
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Tue Apr 29 09:18:14 2025 -0700

    chore: add isort rules and pyflakes in ruff/precommit (#291)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 8b0837c
Author: yuki <48991475+yuki-666@users.noreply.github.com>
Date:   Tue Apr 29 23:57:41 2025 +0800

    ci: add eval functional test (#269)

    Signed-off-by: Yuki Huang <yukih@nvidia.com>

commit 68beb6d
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Mon Apr 28 23:35:01 2025 -0700

    feat: rename ratio_eps_{min/max} to ratio_clip_{min/max} for clarity (#283)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>

commit 2f5d22f
Author: Hemil Desai <hemild@nvidia.com>
Date:   Mon Apr 28 16:09:00 2025 -0700

    feat: Add hydra style overrides to SFT (#208)

    Signed-off-by: Hemil Desai <hemild@nvidia.com>
    Signed-off-by: ashors1 <ashors@nvidia.com>
    Co-authored-by: ashors1 <ashors@nvidia.com>

commit 8a22c44
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Mon Apr 28 15:11:03 2025 -0700

    feat: publish convergence/release runs (#214)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit af94d43
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Mon Apr 28 15:02:19 2025 -0700

    fix: fixes #264 where tied weights check didn't work on fsdp1 (#284)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>
    Signed-off-by: Parth Chadha <parth29@gmail.com>
    Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>

commit 1363dba
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Mon Apr 28 12:44:56 2025 -0700

    fix: improve port selection and exiting early from ray.sub (#272)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 044f385
Author: Andrew Schilling <85314306+aschilling-nv@users.noreply.github.com>
Date:   Mon Apr 28 14:22:55 2025 -0500

    docs: Correcting build issues and CI (#270)

    Signed-off-by: Andrew Schilling <aschilling@nvidia.com>

commit 0fae6bc
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Mon Apr 28 11:08:51 2025 -0700

    feat: Updated Name to NeMo RL (#265)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>

commit 34cae3a
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Mon Apr 28 08:16:51 2025 -0700

    fix: add bibtex entry (#273)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>

commit ee0d2c8
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Sat Apr 26 20:15:38 2025 -0700

    docs: instruct users to git clone before beginning (#257)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 09f5416
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Fri Apr 25 13:46:41 2025 -0700

    feat: E2E multi-turn RL example with a sliding puzzle game (#242)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>
    Signed-off-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>

commit 47e51d3
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Fri Apr 25 10:13:59 2025 -0700

    chore: better logging when insufficient resources (#271)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 98473c6
Author: Anna Shors <ashors@nvidia.com>
Date:   Thu Apr 24 22:28:05 2025 -0700

    fix: Update DPO and SFT configs to use dtensor (#256)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit 2558444
Author: Anna Shors <ashors@nvidia.com>
Date:   Thu Apr 24 11:02:26 2025 -0700

    fix: Fix fsdp1 grad clipping and log grad norm (#251)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit c8f0a01
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Wed Apr 23 17:58:43 2025 -0700

    docs: add qwen 32b instruction and add 0.3 planned features (#255)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 0a5f31d
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Wed Apr 23 17:49:06 2025 -0400

    fix: fix broken eval script (#253)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>

commit 2f8a140
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Wed Apr 23 12:47:18 2025 -0700

    ci: L1 default and increase test time (#252)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 1c7cbd9
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Wed Apr 23 12:52:13 2025 -0400

    fix: use find_tied_parameters api from HF for tied weight keys (#250)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>

commit 1788e4c
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Tue Apr 22 22:05:53 2025 -0400

    fix: raise error if tied weights model is being trained with fsdp1 or… (#229)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>
    Signed-off-by: Sahil Jain <sahilj@nvidia.com>
    Signed-off-by: Terry Kong <terryk@nvidia.com>
    Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>
    Signed-off-by: Nathan McKimpson <nmckimpson@nvidia.com>
    Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
    Co-authored-by: Terry Kong <terrycurtiskong@gmail.com>
    Co-authored-by: Yi-Fu Wu <yifu.wu@gmail.com>
    Co-authored-by: mckimn <nmckimpson@nvidia.com>

commit 1fa4c7a
Author: Anna Shors <ashors@nvidia.com>
Date:   Tue Apr 22 16:38:50 2025 -0700

    fix: Fix indent in dtensor policy (#248)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit ed546ae
Author: yuki <48991475+yuki-666@users.noreply.github.com>
Date:   Wed Apr 23 07:29:47 2025 +0800

    feat: streaming each dtensor in refit (#176)

    Signed-off-by: Yuki Huang <yukih@nvidia.com>
    Signed-off-by: Alex Qiu <alexq@nvidia.com>
    Signed-off-by: Parth Chadha <pchadha@nvidia.com>
    Co-authored-by: Alex Qiu <alexq@nvidia.com>
    Co-authored-by: Parth Chadha <pchadha@nvidia.com>

commit 5c62657
Author: Yi-Fu Wu <yifu.wu@gmail.com>
Date:   Tue Apr 22 14:14:40 2025 -0700

    feat: Importance sampling trick (#174)

    Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>
    Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>

commit deaece6
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Tue Apr 22 12:39:35 2025 -0700

    feat: Add support for multi-turn generations and RL (tools, games, etc) (#218)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>

commit 1245c50
Author: Anna Shors <ashors@nvidia.com>
Date:   Tue Apr 22 12:19:42 2025 -0700

    fix: Speed up DPO functional test (#241)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit af369a3
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Tue Apr 22 12:17:03 2025 -0700

    fix: Move ray worker port range start from 20001 to 53001 (#235)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 756152c
Author: Anna Shors <ashors@nvidia.com>
Date:   Tue Apr 22 10:02:34 2025 -0700

    feat: Support multi-epoch training in SFT (#177)

    Signed-off-by: ashors1 <ashors@nvidia.com>

commit bbdd671
Author: Anna Shors <ashors@nvidia.com>
Date:   Mon Apr 21 22:16:15 2025 -0700

    feat: DPO (#180)

    Signed-off-by: ashors1 <ashors@nvidia.com>
    Signed-off-by: Anna Shors <ashors@nvidia.com>
    Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>
    Co-authored-by: Yi-Fu Wu <yifu.wu@gmail.com>

commit 88bc0fd
Author: mckimn <nmckimpson@nvidia.com>
Date:   Mon Apr 21 17:31:23 2025 -0700

    ci: Remove external config from project (#200)

    Signed-off-by: Nathan McKimpson <nmckimpson@nvidia.com>

commit 4a2e126
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Mon Apr 21 17:34:59 2025 -0400

    fix: skip vllm p2p check since its flaky (#238)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>

commit 22af21c
Author: Yi-Fu Wu <yifu.wu@gmail.com>
Date:   Mon Apr 21 12:41:29 2025 -0700

    feat: FSDP2 SFT (#206)

    Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>

commit e36f488
Author: Yi-Fu Wu <yifu.wu@gmail.com>
Date:   Mon Apr 21 12:41:24 2025 -0700

    fix: Fix missing import (#222)

    Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>

commit 98b7a90
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Sun Apr 20 10:06:09 2025 -0700

    docs: update docs everywhere to remove uv pip install which isn't reliable (#217)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit da191b4
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Sun Apr 20 07:56:55 2025 -0700

    feat: introduce a debug API for backoff and retries for RayVirtualCluster (#234)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 8780093
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Fri Apr 18 17:06:54 2025 -0700

    feat: Add total logging of generations in training (#172)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>

commit ce2d121
Author: yuki <48991475+yuki-666@users.noreply.github.com>
Date:   Sat Apr 19 00:22:11 2025 +0800

    fix: fix chat_template in eval (#210)

    Signed-off-by: Yuki Huang <yukih@nvidia.com>

commit f8b6ba9
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Thu Apr 17 12:52:19 2025 -0700

    fix: grpo func test 10 step -> 3 step to speed up CI (#209)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 4a6f62b
Author: Gerald Shen <119401249+gshennvm@users.noreply.github.com>
Date:   Thu Apr 17 11:06:05 2025 -0700

    feat: Add FSDP2, DTensor SP/TP, activation checkpointing support (#131)

    Signed-off-by: Gerald Shen <geshen@nvidia.com>

commit 78a9834
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Thu Apr 17 10:03:34 2025 -0700

    fix: ci uses umask (#211)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 5ff10f6
Author: alexchiu <qiuzhaopeng@foxmail.com>
Date:   Thu Apr 17 08:38:45 2025 +0800

    fix: prevent division by zero in ClippedPGLossFn calculation (#166)

    Signed-off-by: Zhaopeng Qiu <alexq@nvidia.com>
    Signed-off-by: Alex Qiu <alexq@nvidia.com>
    Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>

commit 6db2f7a
Author: Yi-Fu Wu <yifu.wu@gmail.com>
Date:   Wed Apr 16 15:53:12 2025 -0700

    feat: Fix CPU offloading + add options for FSDP offload and activation ckpting (#123)

    Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>
    Co-authored-by: Parth Chadha <pchadha@nvidia.com>
    Co-authored-by: Terry Kong <terrycurtiskong@gmail.com>

commit 62ac8d2
Author: Charlie Truong <chtruong@nvidia.com>
Date:   Wed Apr 16 15:38:53 2025 -0500

    ci: Only include dependencies in test container (#203)

    Signed-off-by: Charlie Truong <chtruong@nvidia.com>
    Co-authored-by: Terry Kong <terrycurtiskong@gmail.com>

commit b00fcc8
Author: Anna Shors <ashors@nvidia.com>
Date:   Wed Apr 16 13:23:40 2025 -0700

    fix: chat template improvements (#148)

    Signed-off-by: ashors1 <ashors@nvidia.com>
    Signed-off-by: Parth Chadha <pchadha@nvidia.com>
    Signed-off-by: Terry Kong <terryk@nvidia.com>
    Signed-off-by: Sahil Jain <sahilj@nvidia.com>
    Co-authored-by: Parth Chadha <pchadha@nvidia.com>
    Co-authored-by: Terry Kong <terrycurtiskong@gmail.com>
    Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>

commit df31f50
Author: Charlie Truong <chtruong@nvidia.com>
Date:   Wed Apr 16 13:13:58 2025 -0500

    ci: Run tests only in merge queue or when labeled (#159)

    Signed-off-by: Charlie Truong <chtruong@nvidia.com>

commit e3af337
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Wed Apr 16 09:23:30 2025 -0700

    feat: Upgrade to vllm v1 runtime (#170)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>
    Signed-off-by: Charlie Truong <chtruong@nvidia.com>
    Signed-off-by: Yuki Huang <yukih@nvidia.com>
    Signed-off-by: ashors1 <ashors@nvidia.com>
    Signed-off-by: Anna Shors <ashors@nvidia.com>
    Signed-off-by: Terry Kong <terryk@nvidia.com>
    Signed-off-by: Sahil Jain <sahilj@nvidia.com>
    Co-authored-by: Charlie Truong <chtruong@nvidia.com>
    Co-authored-by: Terry Kong <terrycurtiskong@gmail.com>
    Co-authored-by: yuki <48991475+yuki-666@users.noreply.github.com>
    Co-authored-by: Anna Shors <ashors@nvidia.com>
    Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>

commit dd7c2d7
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Tue Apr 15 16:00:04 2025 -0700

    fix: unit test script halts on first failure (#189)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 92c3f1d
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Tue Apr 15 15:42:01 2025 -0700

    feat: add a unique seed for each vllm llm engine (#171)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>

commit 2ae8935
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Tue Apr 15 14:41:21 2025 -0700

    docs: remove backticks from uv.md title (#179)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 9ac4e62
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Tue Apr 15 12:37:35 2025 -0700

    fix: convert DCP to HF script works without ray cluster (#185)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 8213014
Author: Andrew Schilling <85314306+aschilling-nv@users.noreply.github.com>
Date:   Tue Apr 15 13:55:54 2025 -0500

    docs: Correcting file names (#161)

    Signed-off-by: Andrew Schilling <aschilling@nvidia.com>

commit 4db3167
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Tue Apr 15 11:07:51 2025 -0700

    fix: default to less verbose logging + uv-venv log once per worker  (#141)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit bda6522
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Mon Apr 14 22:31:56 2025 -0700

    docs: run tests with --group test to avoid missing test deps (#188)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit c1fc972
Author: mckimn <nmckimpson@nvidia.com>
Date:   Mon Apr 14 20:43:51 2025 -0700

    ci: Update to include public/ folder for pages deployment (#182)

    Signed-off-by: Nathan McKimpson <nmckimpson@nvidia.com>

commit e9812f1
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Mon Apr 14 20:05:46 2025 -0700

    fix: don't use cuda-graphs for vllm generation (#187)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>

commit d7d4cd6
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Mon Apr 14 15:46:13 2025 -0700

    ci: labels for docs/L0/L1/L2 and run even if only doc test (#181)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit 0637511
Author: yuki <48991475+yuki-666@users.noreply.github.com>
Date:   Tue Apr 15 05:24:07 2025 +0800

    feat: support arbitrary end_strings (#96)

    Signed-off-by: Yuki Huang <yukih@nvidia.com>
    Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>

commit c99585c
Author: Terry Kong <terrycurtiskong@gmail.com>
Date:   Mon Apr 14 14:44:43 2025 -0700

    fix: allow configuring ray ports in ray.sub in case conflict on cluster (#173)

    Signed-off-by: Terry Kong <terryk@nvidia.com>

commit a5547f2
Author: mckimn <nmckimpson@nvidia.com>
Date:   Mon Apr 14 09:18:02 2025 -0700

    docs: Fix doc build warnings and add external CI config (#157)

    Signed-off-by: Nathan McKimpson <nmckimpson@nvidia.com>

commit 32953be
Author: Parth Chadha <pchadha@nvidia.com>
Date:   Fri Apr 11 10:18:03 2025 -0700

    fix: always test vllm (#167)

    Signed-off-by: Parth Chadha <pchadha@nvidia.com>

commit c00b8bc
Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com>
Date:   Thu Apr 10 22:38:40 2025 -0700

    test: Add grpo/reinforce/ppo loss tests (prep for incoming vocab parallel changes) (#162)

    Signed-off-by: Sahil Jain <sahilj@nvidia.com>

Signed-off-by: Terry Kong <terryk@nvidia.com>
KiddoZhu pushed a commit that referenced this pull request May 6, 2025
Signed-off-by: Andrew Schilling <aschilling@nvidia.com>
Labels
documentation Improvements or additions to documentation Run CICD Set to run CI (unset + set to rerun)

3 participants