Release v0.1.0 · NVIDIA/NeMo-RL · GitHub

Release v0.1.0

@terrykong released this 22 Mar 00:07 · commit 9ee564e · 162 commits to main since this release


  • Fast Generation - vLLM backend for optimized inference
  • HuggingFace Integration - Works with 1B to 8B parameter models (Qwen1.5, Llama)
  • Distributed Training - FSDP support and Ray-based infrastructure
  • Environment Support - Multi-environment training
  • Learning Algorithms - GRPO (Group Relative Policy Optimization) and SFT (Supervised Fine-Tuning)
  • Worker Isolation - Process isolation between RL Actors (no worries about global state)
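
To make the "group relative" part of GRPO concrete, here is a minimal sketch of the advantage computation as it is usually described: several responses are sampled for the same prompt, and each response's reward is normalized against the mean and standard deviation of its own group. The function name, the `eps` constant, and the choice of population standard deviation are assumptions made for illustration; this is not taken from the NeMo-RL code.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of responses to the same prompt.

    Illustrative only: the name, eps, and the use of the population
    standard deviation are assumptions, not NeMo-RL's implementation.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # Responses that beat the group average get a positive advantage,
    # those below it a negative one. eps avoids division by zero when
    # every response in the group received the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled responses to one prompt, scored 1.0 (correct) or 0.0 (wrong).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Because the baseline comes from the group itself, no separate value network is needed to estimate it. When all responses in a group receive the same reward, every advantage is zero and that group contributes no learning signal.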

Known Issues

  • SFT checkpointing has a known bug: the full model must be gathered on GPU before a checkpoint is saved, which causes out-of-memory (OOM) errors for larger models. If you hit OOM while checkpointing, disable checkpointing by adding checkpointing.enabled=False to your run command.
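
The override above is a dotted key path followed by a value. As a rough illustration of how such a `key.path=value` argument can map onto a nested configuration, here is a small sketch. The `apply_override` helper and the `save_period` key are hypothetical; NeMo-RL's actual override parsing may work differently.

```python
import ast


def apply_override(config, override):
    """Apply one 'a.b.c=value' override to a nested dict, in place.

    Hypothetical helper for illustration; not NeMo-RL's parser.
    """
    key_path, _, raw_value = override.partition("=")
    try:
        # Turn "False", "3", "0.5" and similar into Python literals.
        value = ast.literal_eval(raw_value)
    except (ValueError, SyntaxError):
        # Anything that is not a literal is kept as a plain string.
        value = raw_value
    *parents, leaf = key_path.split(".")
    node = config
    for name in parents:
        node = node.setdefault(name, {})
    node[leaf] = value
    return config


# "save_period" is a made-up key, used only to show that sibling
# settings are left untouched by the override.
config = {"checkpointing": {"enabled": True, "save_period": 10}}
apply_override(config, "checkpointing.enabled=False")
```

After the call, `config["checkpointing"]["enabled"]` is `False` while the other keys in the same section keep their values.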

Full Changelog: https://github.com/NVIDIA/NeMo-RL/commits/v0.1.0
