Stars
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning
Sky-T1: Train your own O1 preview model for under $450
Awesome RL Reasoning Recipes ("Triple R")
CPPO: Accelerating the Training of Group Relative Policy Optimization-Based Reasoning Models
Scalable RL solution for advanced reasoning of language models
Search-R1: An efficient, scalable RL training framework for LLMs that interleave reasoning and search-engine calling, based on veRL
An Easy-to-use, Scalable and High-performance RLHF Framework designed for Multimodal Models.
Fully open data curation for reasoning models
A series of technical reports on Slow Thinking with LLMs
Official Repo for Open-Reasoner-Zero
Solve Visual Understanding with Reinforced VLMs
Fully open reproduction of DeepSeek-R1
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
DeepEP: an efficient expert-parallel communication library
A curated list of reinforcement learning with human feedback resources (continually updated)
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
Making large AI models cheaper, faster and more accessible
My learning notes and code for ML systems.
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.
SGLang is a fast serving framework for large language models and vision language models.
Democratizing Reinforcement Learning for LLMs
verl: Volcano Engine Reinforcement Learning for LLMs
[ICLR 2025] Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing. Your efficient and high-quality synthetic data generation pipeline!