Stars
SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]
End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning
TradingAgents: Multi-Agents LLM Financial Trading Framework
[ICLR 2025 Spotlight] OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
[ICML 2025 Oral] CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction
🚀ReVisual-R1 is a 7B open-source multimodal language model that follows a three-stage curriculum—cold-start pre-training, multimodal reinforcement learning, and text-only reinforcement learning—to …
This repo contains the code for "MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks" [ICLR 2025]
The official repo of One RL to See Them All: Visual Triple Unified Reinforcement Learning
Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual information for complex reasoning, planning, and generation.
PhyX: Does Your Model Have the "Wits" for Physical Reasoning?
OpenThinkIMG is an end-to-end open-source framework that empowers LVLMs to think with images.
Kimi-VL: Mixture-of-Experts Vision-Language Model for Multimodal Reasoning, Long-Context Understanding, and Strong Agent Capabilities
MM-EUREKA: Exploring the Frontiers of Multimodal Reasoning with Rule-based Reinforcement Learning
This is the first paper to explore how to effectively use RL for MLLMs and introduce Vision-R1, a reasoning MLLM that leverages cold-start initialization and RL training to incentivize reasoning ca…
Paper list for Efficient Reasoning.
Reasoning in LLMs: Papers and Resources, including Chain-of-Thought, OpenAI o1, and DeepSeek-R1 🍓
[ICLR 2025🔥] D2O: Dynamic Discriminative Operations for Efficient Long-Context Inference of Large Language Models
Align Anything: Training All-modality Model with Feedback
[TMLR 2025🔥] A survey for the autoregressive models in vision.
Model Stock: All we need is just a few fine-tuned models
Tools for merging pretrained large language models.
Interpretable Contrastive Monte Carlo Tree Search Reasoning