Stars
MCP-Zero: Active Tool Discovery for Autonomous LLM Agents
slime is an LLM post-training framework aimed at scaling RL.
LeanRL is a fork of CleanRL, where selected PyTorch scripts are optimized for performance using compile and cudagraphs.
TreeRL: LLM Reinforcement Learning with On-Policy Tree Search (ACL'25)
The official repo of SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond
The official repo of One RL to See Them All: Visual Triple Unified Reinforcement Learning
Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.
A Python implementation of the Hidden Topic Markov Model
c/ua is the Docker Container for Computer-Use AI Agents.
Official Repository of Absolute Zero Reasoner
Official implementation of AppAgentX: Evolving GUI Agents as Proficient Smartphone Users
MM-EUREKA: Exploring the Frontiers of Multimodal Reasoning with Rule-based Reinforcement Learning
A unified suite for generating elite reasoning problems and training high-performance LLMs, including pioneering attention-free architectures
Latest Advances on System-2 Reasoning
This project is a **proof of concept** that aims to replicate the reasoning capabilities of OpenAI's newly released O1 model.
A package for sampling from Gibbs distributions during inference with LLMs.
Example models using DeepSpeed
Clustering for arbitrary data and dissimilarity functions
Human ChatGPT Comparison Corpus (HC3), Detectors, and more! 🔥
Accessible large language models via k-bit quantization for PyTorch.
The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.
A Unified Semi-Supervised Learning Codebase (NeurIPS'22)