AIBionics / Starred · GitHub

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

Python 16,171 2,715 Updated Dec 18, 2024

Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning

Python 479 32 Updated May 13, 2025

Sky-T1: Train your own O1 preview model within $450

Python 3,244 322 Updated May 18, 2025

Awesome RL Reasoning Recipes ("Triple R")

547 31 Updated May 19, 2025
Python 322 19 Updated Feb 7, 2025
Python 696 47 Updated Apr 15, 2025

CPPO: Accelerating the Training of Group Relative Policy Optimization-Based Reasoning Models

Python 127 9 Updated May 3, 2025

Distributed RL System for LLM Reasoning

Python 1,260 57 Updated May 16, 2025

Simple RL training for reasoning

Python 3,562 265 Updated Apr 10, 2025

Scalable RL solution for advanced reasoning of language models

Python 1,561 92 Updated Mar 18, 2025

Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL

Python 2,282 161 Updated May 16, 2025

An Easy-to-use, Scalable and High-performance RLHF Framework designed for Multimodal Models.

Python 120 6 Updated Apr 7, 2025

Fully open data curation for reasoning models

Python 1,778 148 Updated May 9, 2025

A series of technical reports on Slow Thinking with LLM

Python 675 36 Updated Apr 13, 2025

Official Repo for Open-Reasoner-Zero

Python 1,920 98 Updated Apr 8, 2025

Solve Visual Understanding with Reinforced VLMs

Python 4,956 305 Updated May 11, 2025

Fully open reproduction of DeepSeek-R1

Python 24,459 2,252 Updated May 19, 2025

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,773 276 Updated May 15, 2025

DeepEP: an efficient expert-parallel communication library

Cuda 7,668 769 Updated May 19, 2025

A curated list of reinforcement learning with human feedback resources (continually updated)

3,942 239 Updated Apr 30, 2025

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,726 375 Updated May 13, 2025

Making large AI models cheaper, faster and more accessible

Python 40,884 4,509 Updated May 17, 2025

My learning notes/codes for ML SYS.

Python 2,205 133 Updated May 17, 2025

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

Python 14,074 988 Updated May 18, 2025

RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.

Python 1,832 129 Updated May 16, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 14,446 1,778 Updated May 19, 2025

Democratizing Reinforcement Learning for LLMs

Jupyter Notebook 3,256 303 Updated May 13, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 8,173 967 Updated May 19, 2025

Reproduce R1 Zero on Logic Puzzle

Python 2,341 155 Updated Mar 20, 2025

[ICLR 2025] Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing. Your efficient and high-quality synthetic data generation pipeline!

Python 699 62 Updated Mar 17, 2025