Stars
A toolkit for knowledge distillation of large language models
GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation
Synthetic data curation for post-training and structured data extraction
Open-source Multi-agent Poster Generation from Papers
GitHub Action to build and push Docker images with Buildx
RM-R1: Unleashing the Reasoning Potential of Reward Models
verl: Volcano Engine Reinforcement Learning for LLMs
A Vue component for adding annotations to pictures (Vue image annotation component)
A project page template for academic papers. Demo at https://eliahuhorwitz.github.io/Academic-project-page-template/
Official code repo for our work "Native Visual Understanding: Resolving Resolution Dilemmas in Vision-Language Models"
An automated pipeline for evaluating LLMs for role-playing.
Arena-Hard-Auto: An automatic LLM benchmark.
Collection of extracted System Prompts from popular chatbots like ChatGPT, Claude & Gemini
SGLang is a fast serving framework for large language models and vision language models.
Do Large Language Models Know What They Don’t Know?
LVBench: An Extreme Long Video Understanding Benchmark
Frontier Multimodal Foundation Models for Image and Video Understanding
Reference implementation for DPO (Direct Preference Optimization)
DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception
StreamingBench: Assessing the Gap for MLLMs to Achieve Streaming Video Understanding
🍒 Cherry Studio is a desktop client that supports multiple LLM providers.
[CVPR 2025] LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads