Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, Ovis2, InternVL3, Llava, GLM4…

Python 7,861 667 Updated May 31, 2025

BytedTsinghua-SIA / DAPO

An Open-source RL System from ByteDance Seed and Tsinghua AIR

Python 1,280 52 Updated May 11, 2025

RUCAIBox / Slow_Thinking_with_LLMs

A series of technical reports on Slow Thinking with LLMs

Python 682 39 Updated May 27, 2025

eddycmu / demystify-long-cot

Python 293 18 Updated May 31, 2025

bruno686 / Awesome-RL-based-LLM-Reasoning

Awesome RL-based LLM Reasoning

506 27 Updated May 4, 2025

All-Hands-AI / OpenHands

🙌 OpenHands: Code Less, Make More

Python 56,973 6,415 Updated May 31, 2025

huggingface / Math-Verify

Python 731 32 Updated Apr 28, 2025

simplescaling / s1

s1: Simple test-time scaling

Python 6,418 748 Updated May 19, 2025

agentica-project / rllm

Democratizing Reinforcement Learning for LLMs

Jupyter Notebook 3,306 305 Updated May 13, 2025

deepseek-ai / open-infra-index

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,795 277 Updated May 15, 2025

Open-Reasoner-Zero / Open-Reasoner-Zero

Official Repo for Open-Reasoner-Zero

Python 1,938 101 Updated Apr 8, 2025

zhaochenyang20 / Awesome-ML-SYS-Tutorial

My learning notes and code for ML SYS.

Python 2,325 144 Updated May 31, 2025

AccumulateMore / CV

✔ (Completed) The most comprehensive deep learning notes [Tudui PyTorch] [Li Mu's Dive into Deep Learning] [Andrew Ng's Deep Learning]

Jupyter Notebook 10,486 1,274 Updated May 29, 2025

wangshusen / DRL

Deep Reinforcement Learning

3,916 627 Updated Dec 10, 2022

volcengine / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python 8,805 1,098 Updated May 31, 2025

GAIR-NLP / LIMO

LIMO: Less is More for Reasoning

Python 954 47 Updated Apr 6, 2025

Unakar / Logic-RL

Reproduce R1 Zero on Logic Puzzle

Python 2,347 155 Updated Mar 20, 2025

philschmid / deep-learning-pytorch-huggingface

Jupyter Notebook 1,199 243 Updated Feb 27, 2025

hkust-nlp / simpleRL-reason

Simple RL training for reasoning

Python 3,601 267 Updated Apr 10, 2025

huggingface / open-r1

Fully open reproduction of DeepSeek-R1

Python 24,625 2,276 Updated May 28, 2025

vietnh1009 / ASCII-generator

ASCII generator (image to text, image to image, video to video)

Python 7,872 607 Updated Nov 22, 2024

waterwaterrr

Stars

safety-research / circuit-tracer

NJU-RL / GLIDER

hkust-nlp / Laser

Zeyi-Lin / HivisionIDPhotos

malody2014 / llm_benchmark

KCORES / kcores-llm-arena

ByteDance-Seed / VeOmni

ByteDance-Seed / ByteCheckpoint

Eclipsess / Awesome-Efficient-Reasoning-LLMs

modelscope / ms-swift