WentseChen / Starred · GitHub

verl: Volcano Engine Reinforcement Learning for LLMs

Python 9,639 1,495 Updated Jun 18, 2025

verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for the paper "Group-in-Group Policy Optimization for LLM Agent Training".

Python 373 22 Updated Jun 16, 2025
JavaScript 3,525 499 Updated Jun 16, 2025

Distributed RL System for LLM Reasoning

Python 1,786 89 Updated Jun 18, 2025

[ICLR 2023] SQA3D for embodied scene understanding and reasoning

Python 133 5 Updated Oct 13, 2023
JavaScript 1 Updated Mar 29, 2025

The homepage of SilongYong

HTML 1 Updated Mar 29, 2025

[ICLR 2025] OMG for material modeling in Gaussian Splatting

C++ 8 1 Updated Mar 29, 2025

[NeurIPS 2024] GL-NeRF for training-free ANY NeRF acceleration

Python 4 Updated Mar 29, 2025

An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Async Agent RL)

Python 7,117 691 Updated Jun 17, 2025

Benchmark and research code for the paper "SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks"

Python 219 9 Updated May 5, 2025

The official implementation of Self-Play Fine-Tuning (SPIN)

Python 1,164 102 Updated May 8, 2024

Build your own visual reasoning model

Jupyter Notebook 381 21 Updated Jun 18, 2025

Paper list for Efficient Reasoning.

500 19 Updated Jun 18, 2025

RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.

Python 1,994 147 Updated Jun 3, 2025

Official Repo for Open-Reasoner-Zero

Python 1,967 104 Updated Jun 2, 2025

Robust recipes to align language models with human and AI preferences

Python 5,228 448 Updated Apr 30, 2025

Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL

Python 2,614 191 Updated Jun 18, 2025

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 2,704 199 Updated Jun 18, 2025

Simple RL training for reasoning

Python 3,628 271 Updated Apr 10, 2025

Benchmarking Agentic LLM and VLM Reasoning On Games

Python 151 28 Updated May 7, 2025

Minimal reproduction of DeepSeek R1-Zero

Python 11,913 1,489 Updated Apr 24, 2025

Minimal hackable GRPO implementation

Python 243 35 Updated Jan 31, 2025
Jupyter Notebook 663 77 Updated Apr 30, 2025

Soft-QMIX: Integrating Maximum Entropy For Monotonic Value Function Factorization

Python 13 1 Updated Jul 3, 2024

Easily fine-tune, evaluate and deploy Qwen3, DeepSeek-R1, Llama 4 or any open source LLM / VLM!

Python 8,190 612 Updated Jun 17, 2025

RL starter files in order to immediately train, visualize and evaluate an agent without writing any line of code

Python 687 187 Updated May 12, 2024

TensorZero is an open-source stack for industrial-grade LLM applications. It unifies an LLM gateway, observability, optimization, evaluation, and experimentation.

Rust 7,554 452 Updated Jun 18, 2025

Anthropic's educational courses

Jupyter Notebook 15,542 1,326 Updated Nov 26, 2024

This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. RAG systems combine information retrieval with generative models to provide accurate and cont…

Jupyter Notebook 17,750 1,776 Updated Jun 17, 2025