AIBionics / Starred · GitHub

AIBionics

1 follower · 1 following

Stars

openai / evals

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

Python 16,171 2,715 Updated Dec 18, 2024

0russwest0 / Agent-R1

Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning

Python 479 32 Updated May 13, 2025

NovaSky-AI / SkyThought

Sky-T1: Train your own O1 preview model within $450

Python 3,244 322 Updated May 18, 2025

TsinghuaC3I / Awesome-RL-Reasoning-Recipes

Awesome RL Reasoning Recipes ("Triple R")

547 31 Updated May 19, 2025

bytedance / SandboxFusion

Python 322 19 Updated Feb 7, 2025

Qihoo360 / Light-R1

Python 696 47 Updated Apr 15, 2025

lzhxmu / CPPO

CPPO: Accelerating the Training of Group Relative Policy Optimization-Based Reasoning Models

Python 127 9 Updated May 3, 2025

inclusionAI / AReaL

Distributed RL System for LLM Reasoning

Python 1,260 57 Updated May 16, 2025

hkust-nlp / simpleRL-reason

Simple RL training for reasoning

Python 3,562 265 Updated Apr 10, 2025

PRIME-RL / PRIME

Scalable RL solution for advanced reasoning of language models

Python 1,561 92 Updated Mar 18, 2025

PeterGriffinJin / Search-R1

Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL

Python 2,282 161 Updated May 16, 2025

OpenRLHF / OpenRLHF-M

An Easy-to-use, Scalable and High-performance RLHF Framework designed for Multimodal Models.

Python 120 6 Updated Apr 7, 2025

open-thoughts / open-thoughts

Fully open data curation for reasoning models

Python 1,778 148 Updated May 9, 2025

RUCAIBox / Slow_Thinking_with_LLMs

A series of technical reports on Slow Thinking with LLMs

Python 675 36 Updated Apr 13, 2025

Open-Reasoner-Zero / Open-Reasoner-Zero

Official Repo for Open-Reasoner-Zero

Python 1,920 98 Updated Apr 8, 2025

om-ai-lab / VLM-R1

Solve Visual Understanding with Reinforced VLMs

Python 4,956 305 Updated May 11, 2025

huggingface / open-r1

Fully open reproduction of DeepSeek-R1

Python 24,459 2,252 Updated May 19, 2025

deepseek-ai / open-infra-index

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,773 276 Updated May 15, 2025

deepseek-ai / DeepEP

DeepEP: an efficient expert-parallel communication library

Cuda 7,668 769 Updated May 19, 2025

opendilab / awesome-RLHF

A curated list of reinforcement learning with human feedback resources (continually updated)

3,942 239 Updated Apr 30, 2025

hijkzzz / Awesome-LLM-Strawberry

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,726 375 Updated May 13, 2025

hpcaitech / ColossalAI

Making large AI models cheaper, faster and more accessible

Python 40,884 4,509 Updated May 17, 2025

zhaochenyang20 / Awesome-ML-SYS-Tutorial

My learning notes/codes for ML SYS.

Python 2,205 133 Updated May 17, 2025

kvcache-ai / ktransformers

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

Python 14,074 988 Updated May 18, 2025

RAGEN-AI / RAGEN

RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.

Python 1,832 129 Updated May 16, 2025

sgl-project / sglang

SGLang is a fast serving framework for large language models and vision language models.

Python 14,446 1,778 Updated May 19, 2025

agentica-project / rllm

Democratizing Reinforcement Learning for LLMs

Jupyter Notebook 3,256 303 Updated May 13, 2025

volcengine / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python 8,173 967 Updated May 19, 2025

Unakar / Logic-RL

Reproduce R1 Zero on Logic Puzzle

Python 2,341 155 Updated Mar 20, 2025

magpie-align / magpie

[ICLR 2025] Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing. Your efficient and high-quality synthetic data generation pipeline!

Python 699 62 Updated Mar 17, 2025
