Stars
Qwen2.5-Omni is an end-to-end multimodal model by the Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, and video, and performing real-time speech generation.
Qwen2.5-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.
Official Repository of "Learning to Reason under Off-Policy Guidance"
An open source implementation of CLIP.
A high-throughput and memory-efficient inference and serving engine for LLMs
Writing AI Conference Papers: A Handbook for Beginners
Ongoing research training transformer models at scale
csuking / ms-swift
Forked from modelscope/ms-swift
Use PEFT or Full-parameter to finetune 450+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, Baichuan2, DeepSeek-R1, ...) and 150+ MLLMs (Qwen2.5-VL, Qwen2-Audio, Llama3.2-Vision, Llava, I…
Skywork-R1V2: Multimodal Hybrid Reinforcement Learning for Reasoning
Official Repo for Open-Reasoner-Zero
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
Official repository of 'Visual-RFT: Visual Reinforcement Fine-Tuning' & 'Visual-ARFT: Visual Agentic Reinforcement Fine-Tuning'
A fork to add multimodal model training to open-r1
verl: Volcano Engine Reinforcement Learning for LLMs
This is the official repository for The Hundred-Page Language Models Book by Andriy Burkov
FlashMLA: Efficient MLA decoding kernels
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, Ovis2, InternVL3, Llava, GLM4…
Democratizing Reinforcement Learning for LLMs
A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior.
Align Anything: Training All-modality Model with Feedback
A curated list of reinforcement learning with human feedback resources (continually updated)