fix: Fixed max seqlen not respected correctly #299
Merged
Signed-off-by: Sahil Jain <sahilj@nvidia.com>
parthchadha reviewed Apr 30, 2025
parthchadha previously approved these changes Apr 30, 2025
Signed-off-by: Sahil Jain <sahilj@nvidia.com>
terrykong approved these changes Apr 30, 2025
terrykong added a commit that referenced this pull request May 1, 2025
commit ebb46c3 Author: Anna Shors <ashors@nvidia.com> Date: Wed Apr 30 15:03:46 2025 -0700 fix: fix dtype of empty `token_ids` for consistency (#290) Signed-off-by: ashors1 <ashors@nvidia.com> commit cf8f045 Author: Anna Shors <ashors@nvidia.com> Date: Wed Apr 30 15:03:19 2025 -0700 chore: Remove outdated comment in DPO config (#293) Signed-off-by: ashors1 <ashors@nvidia.com> commit 04f30bb Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> Date: Wed Apr 30 12:19:47 2025 -0700 fix: Fixed max seqlen not respected correctly (#299) Signed-off-by: Sahil Jain <sahilj@nvidia.com> commit daac5d9 Author: Anna Shors <ashors@nvidia.com> Date: Tue Apr 29 17:30:05 2025 -0700 chore: Remove online hf checkpointing (#285) Signed-off-by: ashors1 <ashors@nvidia.com> commit 3cd8be8 Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> Date: Tue Apr 29 15:18:37 2025 -0700 feat: Remove 'last 100' hack for math verifier (#287) Signed-off-by: Sahil Jain <sahilj@nvidia.com> Signed-off-by: Terry Kong <terryk@nvidia.com> Co-authored-by: Terry Kong <terryk@nvidia.com> commit 506910a Author: Terry Kong <terrycurtiskong@gmail.com> Date: Tue Apr 29 11:29:22 2025 -0700 test: add a test that checks if recipes can be merged into the base config (#288) Signed-off-by: Terry Kong <terryk@nvidia.com> commit af43261 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Tue Apr 29 09:18:14 2025 -0700 chore: add isort rules and pyflakes in ruff/precommit (#291) Signed-off-by: Terry Kong <terryk@nvidia.com> commit 8b0837c Author: yuki <48991475+yuki-666@users.noreply.github.com> Date: Tue Apr 29 23:57:41 2025 +0800 ci: add eval functional test (#269) Signed-off-by: Yuki Huang <yukih@nvidia.com> commit 68beb6d Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> Date: Mon Apr 28 23:35:01 2025 -0700 feat: rename ratio_eps_{min/max} to ratio_clip_{min/max} for clarity (#283) Signed-off-by: Sahil Jain <sahilj@nvidia.com> commit 2f5d22f Author: Hemil Desai 
<hemild@nvidia.com> Date: Mon Apr 28 16:09:00 2025 -0700 feat: Add hydra style overrides to SFT (#208) Signed-off-by: Hemil Desai <hemild@nvidia.com> Signed-off-by: ashors1 <ashors@nvidia.com> Co-authored-by: ashors1 <ashors@nvidia.com> commit 8a22c44 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Mon Apr 28 15:11:03 2025 -0700 feat: publish convergence/release runs (#214) Signed-off-by: Terry Kong <terryk@nvidia.com> commit af94d43 Author: Parth Chadha <pchadha@nvidia.com> Date: Mon Apr 28 15:02:19 2025 -0700 fix: fixes #264 where tied weights check didn't work on fsdp1 (#284) Signed-off-by: Parth Chadha <pchadha@nvidia.com> Signed-off-by: Parth Chadha <parth29@gmail.com> Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> commit 1363dba Author: Terry Kong <terrycurtiskong@gmail.com> Date: Mon Apr 28 12:44:56 2025 -0700 fix: improve port selection and exiting early from ray.sub (#272) Signed-off-by: Terry Kong <terryk@nvidia.com> commit 044f385 Author: Andrew Schilling <85314306+aschilling-nv@users.noreply.github.com> Date: Mon Apr 28 14:22:55 2025 -0500 docs: Correcting build issues and CI (#270) Signed-off-by: Andrew Schilling <aschilling@nvidia.com> commit 0fae6bc Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> Date: Mon Apr 28 11:08:51 2025 -0700 feat: Updated Name to NeMo RL (#265) Signed-off-by: Sahil Jain <sahilj@nvidia.com> commit 34cae3a Author: Parth Chadha <pchadha@nvidia.com> Date: Mon Apr 28 08:16:51 2025 -0700 fix: add bibtex entry (#273) Signed-off-by: Parth Chadha <pchadha@nvidia.com> commit ee0d2c8 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Sat Apr 26 20:15:38 2025 -0700 docs: instruct users to git clone before beginning (#257) Signed-off-by: Terry Kong <terryk@nvidia.com> commit 09f5416 Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> Date: Fri Apr 25 13:46:41 2025 -0700 feat: E2E multi-turn RL example with a sliding puzzle game (#242) Signed-off-by: Sahil Jain 
<sahilj@nvidia.com> Signed-off-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> commit 47e51d3 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Fri Apr 25 10:13:59 2025 -0700 chore: better logging when insufficient resources (#271) Signed-off-by: Terry Kong <terryk@nvidia.com> commit 98473c6 Author: Anna Shors <ashors@nvidia.com> Date: Thu Apr 24 22:28:05 2025 -0700 fix: Update DPO and SFT configs to use dtensor (#256) Signed-off-by: ashors1 <ashors@nvidia.com> commit 2558444 Author: Anna Shors <ashors@nvidia.com> Date: Thu Apr 24 11:02:26 2025 -0700 fix: Fix fsdp1 grad clipping and log grad norm (#251) Signed-off-by: ashors1 <ashors@nvidia.com> commit c8f0a01 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Wed Apr 23 17:58:43 2025 -0700 docs: add qwen 32b instruction and add 0.3 planned features (#255) Signed-off-by: Terry Kong <terryk@nvidia.com> commit 0a5f31d Author: Parth Chadha <pchadha@nvidia.com> Date: Wed Apr 23 17:49:06 2025 -0400 fix: fix broken eval script (#253) Signed-off-by: Parth Chadha <pchadha@nvidia.com> commit 2f8a140 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Wed Apr 23 12:47:18 2025 -0700 ci: L1 default and increase test time (#252) Signed-off-by: Terry Kong <terryk@nvidia.com> commit 1c7cbd9 Author: Parth Chadha <pchadha@nvidia.com> Date: Wed Apr 23 12:52:13 2025 -0400 fix: use find_tied_parameters api from HF for tied weight keys (#250) Signed-off-by: Parth Chadha <pchadha@nvidia.com> commit 1788e4c Author: Parth Chadha <pchadha@nvidia.com> Date: Tue Apr 22 22:05:53 2025 -0400 fix: raise error if tied weights model is being trained with fsdp1 or… (#229) Signed-off-by: Parth Chadha <pchadha@nvidia.com> Signed-off-by: Sahil Jain <sahilj@nvidia.com> Signed-off-by: Terry Kong <terryk@nvidia.com> Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com> Signed-off-by: Nathan McKimpson <nmckimpson@nvidia.com> Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> Co-authored-by: Terry Kong 
<terrycurtiskong@gmail.com> Co-authored-by: Yi-Fu Wu <yifu.wu@gmail.com> Co-authored-by: mckimn <nmckimpson@nvidia.com> commit 1fa4c7a Author: Anna Shors <ashors@nvidia.com> Date: Tue Apr 22 16:38:50 2025 -0700 fix: Fix indent in dtensor policy (#248) Signed-off-by: ashors1 <ashors@nvidia.com> commit ed546ae Author: yuki <48991475+yuki-666@users.noreply.github.com> Date: Wed Apr 23 07:29:47 2025 +0800 feat: streaming each dtensor in refit (#176) Signed-off-by: Yuki Huang <yukih@nvidia.com> Signed-off-by: Alex Qiu <alexq@nvidia.com> Signed-off-by: Parth Chadha <pchadha@nvidia.com> Co-authored-by: Alex Qiu <alexq@nvidia.com> Co-authored-by: Parth Chadha <pchadha@nvidia.com> commit 5c62657 Author: Yi-Fu Wu <yifu.wu@gmail.com> Date: Tue Apr 22 14:14:40 2025 -0700 feat: Importance sampling trick (#174) Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com> Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> commit deaece6 Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> Date: Tue Apr 22 12:39:35 2025 -0700 feat: Add support for multi-turn generations and RL (tools, games, etc) (#218) Signed-off-by: Sahil Jain <sahilj@nvidia.com> commit 1245c50 Author: Anna Shors <ashors@nvidia.com> Date: Tue Apr 22 12:19:42 2025 -0700 fix: Speed up DPO functional test (#241) Signed-off-by: ashors1 <ashors@nvidia.com> commit af369a3 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Tue Apr 22 12:17:03 2025 -0700 fix: Move ray worker port range start from 20001 to 53001 (#235) Signed-off-by: Terry Kong <terryk@nvidia.com> commit 756152c Author: Anna Shors <ashors@nvidia.com> Date: Tue Apr 22 10:02:34 2025 -0700 feat: Support multi-epoch training in SFT (#177) Signed-off-by: ashors1 <ashors@nvidia.com> commit bbdd671 Author: Anna Shors <ashors@nvidia.com> Date: Mon Apr 21 22:16:15 2025 -0700 feat: DPO (#180) Signed-off-by: ashors1 <ashors@nvidia.com> Signed-off-by: Anna Shors <ashors@nvidia.com> Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com> Co-authored-by: 
Yi-Fu Wu <yifu.wu@gmail.com> commit 88bc0fd Author: mckimn <nmckimpson@nvidia.com> Date: Mon Apr 21 17:31:23 2025 -0700 ci: Remove external config from project (#200) Signed-off-by: Nathan McKimpson <nmckimpson@nvidia.com> commit 4a2e126 Author: Parth Chadha <pchadha@nvidia.com> Date: Mon Apr 21 17:34:59 2025 -0400 fix: skip vllm p2p check since its flaky (#238) Signed-off-by: Parth Chadha <pchadha@nvidia.com> commit 22af21c Author: Yi-Fu Wu <yifu.wu@gmail.com> Date: Mon Apr 21 12:41:29 2025 -0700 feat: FSDP2 SFT (#206) Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com> commit e36f488 Author: Yi-Fu Wu <yifu.wu@gmail.com> Date: Mon Apr 21 12:41:24 2025 -0700 fix: Fix missing import (#222) Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com> commit 98b7a90 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Sun Apr 20 10:06:09 2025 -0700 docs: update docs everywhere to remove uv pip install which isn't reliable (#217) Signed-off-by: Terry Kong <terryk@nvidia.com> commit da191b4 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Sun Apr 20 07:56:55 2025 -0700 feat: introduce a debug API for backoff and retries for RayVirtualCluster (#234) Signed-off-by: Terry Kong <terryk@nvidia.com> commit 8780093 Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> Date: Fri Apr 18 17:06:54 2025 -0700 feat: Add total logging of generations in training (#172) Signed-off-by: Sahil Jain <sahilj@nvidia.com> commit ce2d121 Author: yuki <48991475+yuki-666@users.noreply.github.com> Date: Sat Apr 19 00:22:11 2025 +0800 fix: fix chat_template in eval (#210) Signed-off-by: Yuki Huang <yukih@nvidia.com> commit f8b6ba9 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Thu Apr 17 12:52:19 2025 -0700 fix: grpo func test 10 step -> 3 step to speed up CI (#209) Signed-off-by: Terry Kong <terryk@nvidia.com> commit 4a6f62b Author: Gerald Shen <119401249+gshennvm@users.noreply.github.com> Date: Thu Apr 17 11:06:05 2025 -0700 feat: Add FSDP2, DTensor SP/TP, activation checkpointing support (#131) 
Signed-off-by: Gerald Shen <geshen@nvidia.com> commit 78a9834 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Thu Apr 17 10:03:34 2025 -0700 fix: ci uses umask (#211) Signed-off-by: Terry Kong <terryk@nvidia.com> commit 5ff10f6 Author: alexchiu <qiuzhaopeng@foxmail.com> Date: Thu Apr 17 08:38:45 2025 +0800 fix: prevent division by zero in ClippedPGLossFn calculation (#166) Signed-off-by: Zhaopeng Qiu <alexq@nvidia.com> Signed-off-by: Alex Qiu <alexq@nvidia.com> Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> commit 6db2f7a Author: Yi-Fu Wu <yifu.wu@gmail.com> Date: Wed Apr 16 15:53:12 2025 -0700 feat: Fix CPU offloading + add options for FSDP offload and activation ckpting (#123) Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com> Co-authored-by: Parth Chadha <pchadha@nvidia.com> Co-authored-by: Terry Kong <terrycurtiskong@gmail.com> commit 62ac8d2 Author: Charlie Truong <chtruong@nvidia.com> Date: Wed Apr 16 15:38:53 2025 -0500 ci: Only include dependencies in test container (#203) Signed-off-by: Charlie Truong <chtruong@nvidia.com> Co-authored-by: Terry Kong <terrycurtiskong@gmail.com> commit b00fcc8 Author: Anna Shors <ashors@nvidia.com> Date: Wed Apr 16 13:23:40 2025 -0700 fix: chat template improvements (#148) Signed-off-by: ashors1 <ashors@nvidia.com> Signed-off-by: Parth Chadha <pchadha@nvidia.com> Signed-off-by: Terry Kong <terryk@nvidia.com> Signed-off-by: Sahil Jain <sahilj@nvidia.com> Co-authored-by: Parth Chadha <pchadha@nvidia.com> Co-authored-by: Terry Kong <terrycurtiskong@gmail.com> Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> commit df31f50 Author: Charlie Truong <chtruong@nvidia.com> Date: Wed Apr 16 13:13:58 2025 -0500 ci: Run tests only in merge queue or when labeled (#159) Signed-off-by: Charlie Truong <chtruong@nvidia.com> commit e3af337 Author: Parth Chadha <pchadha@nvidia.com> Date: Wed Apr 16 09:23:30 2025 -0700 feat: Upgrade to vllm v1 runtime (#170) Signed-off-by: Parth Chadha 
<pchadha@nvidia.com> Signed-off-by: Charlie Truong <chtruong@nvidia.com> Signed-off-by: Yuki Huang <yukih@nvidia.com> Signed-off-by: ashors1 <ashors@nvidia.com> Signed-off-by: Anna Shors <ashors@nvidia.com> Signed-off-by: Terry Kong <terryk@nvidia.com> Signed-off-by: Sahil Jain <sahilj@nvidia.com> Co-authored-by: Charlie Truong <chtruong@nvidia.com> Co-authored-by: Terry Kong <terrycurtiskong@gmail.com> Co-authored-by: yuki <48991475+yuki-666@users.noreply.github.com> Co-authored-by: Anna Shors <ashors@nvidia.com> Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> commit dd7c2d7 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Tue Apr 15 16:00:04 2025 -0700 fix: unit test script halts on first failure (#189) Signed-off-by: Terry Kong <terryk@nvidia.com> commit 92c3f1d Author: Parth Chadha <pchadha@nvidia.com> Date: Tue Apr 15 15:42:01 2025 -0700 feat: add a unique seed for each vllm llm engine (#171) Signed-off-by: Parth Chadha <pchadha@nvidia.com> commit 2ae8935 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Tue Apr 15 14:41:21 2025 -0700 docs: remove backticks from uv.md title (#179) Signed-off-by: Terry Kong <terryk@nvidia.com> commit 9ac4e62 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Tue Apr 15 12:37:35 2025 -0700 fix: convert DCP to HF script works without ray cluster (#185) Signed-off-by: Terry Kong <terryk@nvidia.com> commit 8213014 Author: Andrew Schilling <85314306+aschilling-nv@users.noreply.github.com> Date: Tue Apr 15 13:55:54 2025 -0500 docs: Correcting file names (#161) Signed-off-by: Andrew Schilling <aschilling@nvidia.com> commit 4db3167 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Tue Apr 15 11:07:51 2025 -0700 fix: default to less verbose logging + uv-venv log once per worker (#141) Signed-off-by: Terry Kong <terryk@nvidia.com> commit bda6522 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Mon Apr 14 22:31:56 2025 -0700 docs: run tests with --group test to avoid missing test deps (#188) 
Signed-off-by: Terry Kong <terryk@nvidia.com> commit c1fc972 Author: mckimn <nmckimpson@nvidia.com> Date: Mon Apr 14 20:43:51 2025 -0700 ci: Update to include public/ folder for pages deployment (#182) Signed-off-by: Nathan McKimpson <nmckimpson@nvidia.com> commit e9812f1 Author: Parth Chadha <pchadha@nvidia.com> Date: Mon Apr 14 20:05:46 2025 -0700 fix: don't use cuda-graphs for vllm generation (#187) Signed-off-by: Parth Chadha <pchadha@nvidia.com> commit d7d4cd6 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Mon Apr 14 15:46:13 2025 -0700 ci: labels for docs/L0/L1/L2 and run even if only doc test (#181) Signed-off-by: Terry Kong <terryk@nvidia.com> commit 0637511 Author: yuki <48991475+yuki-666@users.noreply.github.com> Date: Tue Apr 15 05:24:07 2025 +0800 feat: support arbitrary end_strings (#96) Signed-off-by: Yuki Huang <yukih@nvidia.com> Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> commit c99585c Author: Terry Kong <terrycurtiskong@gmail.com> Date: Mon Apr 14 14:44:43 2025 -0700 fix: allow configuring ray ports in ray.sub in case conflict on cluster (#173) Signed-off-by: Terry Kong <terryk@nvidia.com> commit a5547f2 Author: mckimn <nmckimpson@nvidia.com> Date: Mon Apr 14 09:18:02 2025 -0700 docs: Fix doc build warnings and add external CI config (#157) Signed-off-by: Nathan McKimpson <nmckimpson@nvidia.com> commit 32953be Author: Parth Chadha <pchadha@nvidia.com> Date: Fri Apr 11 10:18:03 2025 -0700 fix: always test vllm (#167) Signed-off-by: Parth Chadha <pchadha@nvidia.com> commit c00b8bc Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> Date: Thu Apr 10 22:38:40 2025 -0700 test: Add grpo/reinforce/ppo loss tests (prep for incoming vocab parallel changes) (#162) Signed-off-by: Sahil Jain <sahilj@nvidia.com> Signed-off-by: Terry Kong <terryk@nvidia.com>
terrykong pushed a commit that referenced this pull request May 1, 2025
Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com> test: Add grpo/reinforce/ppo loss tests (prep for incoming vocab parallel changes) (#162) Signed-off-by: Sahil Jain <sahilj@nvidia.com> Tech pubs updates to file Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com> fix typo Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com> Incorporated Reviewer Comments in ReadMe Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com> Tech Pubs updates to files Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com> Tech Pubs updates to files Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com> Tech Pubs updates to files Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com> Tech Pubs updates to files Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com> Tech Pubs updates to files Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com> Tech Pubs updates to file Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com> Tech pups updates to resolve some threads Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com> Tech pubs updates to resolve some threads Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com> Tech Pubs minor edits to files Signed-off-by: Jennifer Gerhold <163925524+jgerh@users.noreply.github.com> Squashed commit of the following: commit ebb46c3 Author: Anna Shors <ashors@nvidia.com> Date: Wed Apr 30 15:03:46 2025 -0700 fix: fix dtype of empty `token_ids` for consistency (#290) Signed-off-by: ashors1 <ashors@nvidia.com> commit cf8f045 Author: Anna Shors <ashors@nvidia.com> Date: Wed Apr 30 15:03:19 2025 -0700 chore: Remove outdated comment in DPO config (#293) Signed-off-by: ashors1 <ashors@nvidia.com> commit 04f30bb Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> Date: Wed Apr 30 12:19:47 2025 -0700 
fix: Fixed max seqlen not respected correctly (#299) Signed-off-by: Sahil Jain <sahilj@nvidia.com> commit daac5d9 Author: Anna Shors <ashors@nvidia.com> Date: Tue Apr 29 17:30:05 2025 -0700 chore: Remove online hf checkpointing (#285) Signed-off-by: ashors1 <ashors@nvidia.com> commit 3cd8be8 Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> Date: Tue Apr 29 15:18:37 2025 -0700 feat: Remove 'last 100' hack for math verifier (#287) Signed-off-by: Sahil Jain <sahilj@nvidia.com> Signed-off-by: Terry Kong <terryk@nvidia.com> Co-authored-by: Terry Kong <terryk@nvidia.com> commit 506910a Author: Terry Kong <terrycurtiskong@gmail.com> Date: Tue Apr 29 11:29:22 2025 -0700 test: add a test that checks if recipes can be merged into the base config (#288) Signed-off-by: Terry Kong <terryk@nvidia.com> commit af43261 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Tue Apr 29 09:18:14 2025 -0700 chore: add isort rules and pyflakes in ruff/precommit (#291) Signed-off-by: Terry Kong <terryk@nvidia.com> commit 8b0837c Author: yuki <48991475+yuki-666@users.noreply.github.com> Date: Tue Apr 29 23:57:41 2025 +0800 ci: add eval functional test (#269) Signed-off-by: Yuki Huang <yukih@nvidia.com> commit 68beb6d Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> Date: Mon Apr 28 23:35:01 2025 -0700 feat: rename ratio_eps_{min/max} to ratio_clip_{min/max} for clarity (#283) Signed-off-by: Sahil Jain <sahilj@nvidia.com> commit 2f5d22f Author: Hemil Desai <hemild@nvidia.com> Date: Mon Apr 28 16:09:00 2025 -0700 feat: Add hydra style overrides to SFT (#208) Signed-off-by: Hemil Desai <hemild@nvidia.com> Signed-off-by: ashors1 <ashors@nvidia.com> Co-authored-by: ashors1 <ashors@nvidia.com> commit 8a22c44 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Mon Apr 28 15:11:03 2025 -0700 feat: publish convergence/release runs (#214) Signed-off-by: Terry Kong <terryk@nvidia.com> commit af94d43 Author: Parth Chadha <pchadha@nvidia.com> Date: Mon Apr 28 
15:02:19 2025 -0700 fix: fixes #264 where tied weights check didn't work on fsdp1 (#284) Signed-off-by: Parth Chadha <pchadha@nvidia.com> Signed-off-by: Parth Chadha <parth29@gmail.com> Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> commit 1363dba Author: Terry Kong <terrycurtiskong@gmail.com> Date: Mon Apr 28 12:44:56 2025 -0700 fix: improve port selection and exiting early from ray.sub (#272) Signed-off-by: Terry Kong <terryk@nvidia.com> commit 044f385 Author: Andrew Schilling <85314306+aschilling-nv@users.noreply.github.com> Date: Mon Apr 28 14:22:55 2025 -0500 docs: Correcting build issues and CI (#270) Signed-off-by: Andrew Schilling <aschilling@nvidia.com> commit 0fae6bc Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> Date: Mon Apr 28 11:08:51 2025 -0700 feat: Updated Name to NeMo RL (#265) Signed-off-by: Sahil Jain <sahilj@nvidia.com> commit 34cae3a Author: Parth Chadha <pchadha@nvidia.com> Date: Mon Apr 28 08:16:51 2025 -0700 fix: add bibtex entry (#273) Signed-off-by: Parth Chadha <pchadha@nvidia.com> commit ee0d2c8 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Sat Apr 26 20:15:38 2025 -0700 docs: instruct users to git clone before beginning (#257) Signed-off-by: Terry Kong <terryk@nvidia.com> commit 09f5416 Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> Date: Fri Apr 25 13:46:41 2025 -0700 feat: E2E multi-turn RL example with a sliding puzzle game (#242) Signed-off-by: Sahil Jain <sahilj@nvidia.com> Signed-off-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> commit 47e51d3 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Fri Apr 25 10:13:59 2025 -0700 chore: better logging when insufficient resources (#271) Signed-off-by: Terry Kong <terryk@nvidia.com> commit 98473c6 Author: Anna Shors <ashors@nvidia.com> Date: Thu Apr 24 22:28:05 2025 -0700 fix: Update DPO and SFT configs to use dtensor (#256) Signed-off-by: ashors1 <ashors@nvidia.com> commit 2558444 Author: 
Anna Shors <ashors@nvidia.com> Date: Thu Apr 24 11:02:26 2025 -0700 fix: Fix fsdp1 grad clipping and log grad norm (#251) Signed-off-by: ashors1 <ashors@nvidia.com> commit c8f0a01 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Wed Apr 23 17:58:43 2025 -0700 docs: add qwen 32b instruction and add 0.3 planned features (#255) Signed-off-by: Terry Kong <terryk@nvidia.com> commit 0a5f31d Author: Parth Chadha <pchadha@nvidia.com> Date: Wed Apr 23 17:49:06 2025 -0400 fix: fix broken eval script (#253) Signed-off-by: Parth Chadha <pchadha@nvidia.com> commit 2f8a140 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Wed Apr 23 12:47:18 2025 -0700 ci: L1 default and increase test time (#252) Signed-off-by: Terry Kong <terryk@nvidia.com> commit 1c7cbd9 Author: Parth Chadha <pchadha@nvidia.com> Date: Wed Apr 23 12:52:13 2025 -0400 fix: use find_tied_parameters api from HF for tied weight keys (#250) Signed-off-by: Parth Chadha <pchadha@nvidia.com> commit 1788e4c Author: Parth Chadha <pchadha@nvidia.com> Date: Tue Apr 22 22:05:53 2025 -0400 fix: raise error if tied weights model is being trained with fsdp1 or… (#229) Signed-off-by: Parth Chadha <pchadha@nvidia.com> Signed-off-by: Sahil Jain <sahilj@nvidia.com> Signed-off-by: Terry Kong <terryk@nvidia.com> Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com> Signed-off-by: Nathan McKimpson <nmckimpson@nvidia.com> Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> Co-authored-by: Terry Kong <terrycurtiskong@gmail.com> Co-authored-by: Yi-Fu Wu <yifu.wu@gmail.com> Co-authored-by: mckimn <nmckimpson@nvidia.com> commit 1fa4c7a Author: Anna Shors <ashors@nvidia.com> Date: Tue Apr 22 16:38:50 2025 -0700 fix: Fix indent in dtensor policy (#248) Signed-off-by: ashors1 <ashors@nvidia.com> commit ed546ae Author: yuki <48991475+yuki-666@users.noreply.github.com> Date: Wed Apr 23 07:29:47 2025 +0800 feat: streaming each dtensor in refit (#176) Signed-off-by: Yuki Huang <yukih@nvidia.com> Signed-off-by: Alex Qiu 
<alexq@nvidia.com> Signed-off-by: Parth Chadha <pchadha@nvidia.com> Co-authored-by: Alex Qiu <alexq@nvidia.com> Co-authored-by: Parth Chadha <pchadha@nvidia.com> commit 5c62657 Author: Yi-Fu Wu <yifu.wu@gmail.com> Date: Tue Apr 22 14:14:40 2025 -0700 feat: Importance sampling trick (#174) Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com> Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> commit deaece6 Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> Date: Tue Apr 22 12:39:35 2025 -0700 feat: Add support for multi-turn generations and RL (tools, games, etc) (#218) Signed-off-by: Sahil Jain <sahilj@nvidia.com> commit 1245c50 Author: Anna Shors <ashors@nvidia.com> Date: Tue Apr 22 12:19:42 2025 -0700 fix: Speed up DPO functional test (#241) Signed-off-by: ashors1 <ashors@nvidia.com> commit af369a3 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Tue Apr 22 12:17:03 2025 -0700 fix: Move ray worker port range start from 20001 to 53001 (#235) Signed-off-by: Terry Kong <terryk@nvidia.com> commit 756152c Author: Anna Shors <ashors@nvidia.com> Date: Tue Apr 22 10:02:34 2025 -0700 feat: Support multi-epoch training in SFT (#177) Signed-off-by: ashors1 <ashors@nvidia.com> commit bbdd671 Author: Anna Shors <ashors@nvidia.com> Date: Mon Apr 21 22:16:15 2025 -0700 feat: DPO (#180) Signed-off-by: ashors1 <ashors@nvidia.com> Signed-off-by: Anna Shors <ashors@nvidia.com> Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com> Co-authored-by: Yi-Fu Wu <yifu.wu@gmail.com> commit 88bc0fd Author: mckimn <nmckimpson@nvidia.com> Date: Mon Apr 21 17:31:23 2025 -0700 ci: Remove external config from project (#200) Signed-off-by: Nathan McKimpson <nmckimpson@nvidia.com> commit 4a2e126 Author: Parth Chadha <pchadha@nvidia.com> Date: Mon Apr 21 17:34:59 2025 -0400 fix: skip vllm p2p check since its flaky (#238) Signed-off-by: Parth Chadha <pchadha@nvidia.com> commit 22af21c Author: Yi-Fu Wu <yifu.wu@gmail.com> Date: Mon Apr 21 12:41:29 2025 -0700 feat: FSDP2 SFT 
(#206) Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com> commit e36f488 Author: Yi-Fu Wu <yifu.wu@gmail.com> Date: Mon Apr 21 12:41:24 2025 -0700 fix: Fix missing import (#222) Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com> commit 98b7a90 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Sun Apr 20 10:06:09 2025 -0700 docs: update docs everywhere to remove uv pip install which isn't reliable (#217) Signed-off-by: Terry Kong <terryk@nvidia.com> commit da191b4 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Sun Apr 20 07:56:55 2025 -0700 feat: introduce a debug API for backoff and retries for RayVirtualCluster (#234) Signed-off-by: Terry Kong <terryk@nvidia.com> commit 8780093 Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> Date: Fri Apr 18 17:06:54 2025 -0700 feat: Add total logging of generations in training (#172) Signed-off-by: Sahil Jain <sahilj@nvidia.com> commit ce2d121 Author: yuki <48991475+yuki-666@users.noreply.github.com> Date: Sat Apr 19 00:22:11 2025 +0800 fix: fix chat_template in eval (#210) Signed-off-by: Yuki Huang <yukih@nvidia.com> commit f8b6ba9 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Thu Apr 17 12:52:19 2025 -0700 fix: grpo func test 10 step -> 3 step to speed up CI (#209) Signed-off-by: Terry Kong <terryk@nvidia.com> commit 4a6f62b Author: Gerald Shen <119401249+gshennvm@users.noreply.github.com> Date: Thu Apr 17 11:06:05 2025 -0700 feat: Add FSDP2, DTensor SP/TP, activation checkpointing support (#131) Signed-off-by: Gerald Shen <geshen@nvidia.com> commit 78a9834 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Thu Apr 17 10:03:34 2025 -0700 fix: ci uses umask (#211) Signed-off-by: Terry Kong <terryk@nvidia.com> commit 5ff10f6 Author: alexchiu <qiuzhaopeng@foxmail.com> Date: Thu Apr 17 08:38:45 2025 +0800 fix: prevent division by zero in ClippedPGLossFn calculation (#166) Signed-off-by: Zhaopeng Qiu <alexq@nvidia.com> Signed-off-by: Alex Qiu <alexq@nvidia.com> Co-authored-by: Sahil Jain 
<48468750+SahilJain314@users.noreply.github.com> commit 6db2f7a Author: Yi-Fu Wu <yifu.wu@gmail.com> Date: Wed Apr 16 15:53:12 2025 -0700 feat: Fix CPU offloading + add options for FSDP offload and activation ckpting (#123) Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com> Co-authored-by: Parth Chadha <pchadha@nvidia.com> Co-authored-by: Terry Kong <terrycurtiskong@gmail.com> commit 62ac8d2 Author: Charlie Truong <chtruong@nvidia.com> Date: Wed Apr 16 15:38:53 2025 -0500 ci: Only include dependencies in test container (#203) Signed-off-by: Charlie Truong <chtruong@nvidia.com> Co-authored-by: Terry Kong <terrycurtiskong@gmail.com> commit b00fcc8 Author: Anna Shors <ashors@nvidia.com> Date: Wed Apr 16 13:23:40 2025 -0700 fix: chat template improvements (#148) Signed-off-by: ashors1 <ashors@nvidia.com> Signed-off-by: Parth Chadha <pchadha@nvidia.com> Signed-off-by: Terry Kong <terryk@nvidia.com> Signed-off-by: Sahil Jain <sahilj@nvidia.com> Co-authored-by: Parth Chadha <pchadha@nvidia.com> Co-authored-by: Terry Kong <terrycurtiskong@gmail.com> Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> commit df31f50 Author: Charlie Truong <chtruong@nvidia.com> Date: Wed Apr 16 13:13:58 2025 -0500 ci: Run tests only in merge queue or when labeled (#159) Signed-off-by: Charlie Truong <chtruong@nvidia.com> commit e3af337 Author: Parth Chadha <pchadha@nvidia.com> Date: Wed Apr 16 09:23:30 2025 -0700 feat: Upgrade to vllm v1 runtime (#170) Signed-off-by: Parth Chadha <pchadha@nvidia.com> Signed-off-by: Charlie Truong <chtruong@nvidia.com> Signed-off-by: Yuki Huang <yukih@nvidia.com> Signed-off-by: ashors1 <ashors@nvidia.com> Signed-off-by: Anna Shors <ashors@nvidia.com> Signed-off-by: Terry Kong <terryk@nvidia.com> Signed-off-by: Sahil Jain <sahilj@nvidia.com> Co-authored-by: Charlie Truong <chtruong@nvidia.com> Co-authored-by: Terry Kong <terrycurtiskong@gmail.com> Co-authored-by: yuki <48991475+yuki-666@users.noreply.github.com> Co-authored-by: Anna Shors 
<ashors@nvidia.com> Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> commit dd7c2d7 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Tue Apr 15 16:00:04 2025 -0700 fix: unit test script halts on first failure (#189) Signed-off-by: Terry Kong <terryk@nvidia.com> commit 92c3f1d Author: Parth Chadha <pchadha@nvidia.com> Date: Tue Apr 15 15:42:01 2025 -0700 feat: add a unique seed for each vllm llm engine (#171) Signed-off-by: Parth Chadha <pchadha@nvidia.com> commit 2ae8935 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Tue Apr 15 14:41:21 2025 -0700 docs: remove backticks from uv.md title (#179) Signed-off-by: Terry Kong <terryk@nvidia.com> commit 9ac4e62 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Tue Apr 15 12:37:35 2025 -0700 fix: convert DCP to HF script works without ray cluster (#185) Signed-off-by: Terry Kong <terryk@nvidia.com> commit 8213014 Author: Andrew Schilling <85314306+aschilling-nv@users.noreply.github.com> Date: Tue Apr 15 13:55:54 2025 -0500 docs: Correcting file names (#161) Signed-off-by: Andrew Schilling <aschilling@nvidia.com> commit 4db3167 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Tue Apr 15 11:07:51 2025 -0700 fix: default to less verbose logging + uv-venv log once per worker (#141) Signed-off-by: Terry Kong <terryk@nvidia.com> commit bda6522 Author: Terry Kong <terrycurtiskong@gmail.com> Date: Mon Apr 14 22:31:56 2025 -0700 docs: run tests with --group test to avoid missing test deps (#188) Signed-off-by: Terry Kong <terryk@nvidia.com> commit c1fc972 Author: mckimn <nmckimpson@nvidia.com> Date: Mon Apr 14 20:43:51 2025 -0700 ci: Update to include public/ folder for pages deployment (#182) Signed-off-by: Nathan McKimpson <nmckimpson@nvidia.com> commit e9812f1 Author: Parth Chadha <pchadha@nvidia.com> Date: Mon Apr 14 20:05:46 2025 -0700 fix: don't use cuda-graphs for vllm generation (#187) Signed-off-by: Parth Chadha <pchadha@nvidia.com> commit d7d4cd6 Author: Terry Kong 
<terrycurtiskong@gmail.com> Date: Mon Apr 14 15:46:13 2025 -0700 ci: labels for docs/L0/L1/L2 and run even if only doc test (#181) Signed-off-by: Terry Kong <terryk@nvidia.com> commit 0637511 Author: yuki <48991475+yuki-666@users.noreply.github.com> Date: Tue Apr 15 05:24:07 2025 +0800 feat: support arbitrary end_strings (#96) Signed-off-by: Yuki Huang <yukih@nvidia.com> Co-authored-by: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> commit c99585c Author: Terry Kong <terrycurtiskong@gmail.com> Date: Mon Apr 14 14:44:43 2025 -0700 fix: allow configuring ray ports in ray.sub in case conflict on cluster (#173) Signed-off-by: Terry Kong <terryk@nvidia.com> commit a5547f2 Author: mckimn <nmckimpson@nvidia.com> Date: Mon Apr 14 09:18:02 2025 -0700 docs: Fix doc build warnings and add external CI config (#157) Signed-off-by: Nathan McKimpson <nmckimpson@nvidia.com> commit 32953be Author: Parth Chadha <pchadha@nvidia.com> Date: Fri Apr 11 10:18:03 2025 -0700 fix: always test vllm (#167) Signed-off-by: Parth Chadha <pchadha@nvidia.com> commit c00b8bc Author: Sahil Jain <48468750+SahilJain314@users.noreply.github.com> Date: Thu Apr 10 22:38:40 2025 -0700 test: Add grpo/reinforce/ppo loss tests (prep for incoming vocab parallel changes) (#162) Signed-off-by: Sahil Jain <sahilj@nvidia.com> Signed-off-by: Terry Kong <terryk@nvidia.com>
No description provided.