Stars
T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
Train transformer language models with reinforcement learning.
Fully open reproduction of DeepSeek-R1
The simplest, fastest repository for training/finetuning small-sized VLMs.
SGLang is a fast serving framework for large language models and vision language models.
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
Qlib is an AI-oriented Quant investment platform that aims to use AI tech to empower Quant Research, from exploring ideas to implementing productions. Qlib supports diverse ML modeling paradigms, i…
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
[CVPR2025] SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories
State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!
Official code implementation of Perception R1: Pioneering Perception Policy with Reinforcement Learning
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, Ovis2, InternVL3, Llava, GLM4…
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
verl: Volcano Engine Reinforcement Learning for LLMs
Project Page For "Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement"
A high-throughput and memory-efficient inference and serving engine for LLMs
Minimal reproduction of DeepSeek R1-Zero
🧑🚀 Summary of the world's best LLM resources (video generation, agents, coding assistance, data processing, model training, model inference, o1 models, MCP, small language models, vision language models).
Solve Visual Understanding with Reinforced VLMs
Witness the aha moment of VLM with less than $3.
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & RFT & Dynamic Sampling & Async Agent RL)
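LLM knowledge sharing that everyone can understand; a must-read before LLM interviews in the spring/autumn recruiting seasons, so you can talk confidently with your interviewer.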
A collection of AIGC-interview/CV-interview/LLMs-interview questions and answers, along with new ideas, new problems, new resources, and new projects from work and research.
A fork to add multimodal model training to open-r1