Stars
Laser: Learn to Reason Efficiently with Adaptive Length-based Reward Shaping
Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (LLMs).
This is the first paper to explore how to effectively use RL for MLLMs and introduce Vision-R1, a reasoning MLLM that leverages cold-start initialization and RL training to incentivize reasoning capability.
ReasonFlux Series - Open-Source LLM Family for Reasoning, Coding, Reward Modeling and Data Selection
A flexible and efficient training framework for large-scale alignment tasks
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
verl: Volcano Engine Reinforcement Learning for LLMs
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Zhejiang University Graduation Thesis LaTeX Template
Code for Learning the Unlearned: Mitigating Feature Suppression in Contrastive Learning
Code for Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models
Production-ready data processing made easy and shareable
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)
[NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs
Accelerating the development of large multimodal models (LMMs) with a one-click evaluation module, lmms-eval.
LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Diffusion Model-Based Image Editing: A Survey (TPAMI 2025)
《代码随想录》 LeetCode study guide: a recommended solving order for 200 classic problems, 600k+ characters of detailed illustrated explanations, video breakdowns of tricky points, 50+ mind maps, and solutions in C++, Java, Python, Go, JavaScript, and more. No more getting lost while learning algorithms! 🔥🔥 Take a look, you'll wish you had found it sooner! 🚀
MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning