Stars
Label Studio is a multi-type data labeling and annotation tool with a standardized output format
Open source annotation tool for machine learning practitioners.
Python-based software to unpack KindleGen-generated ebooks
SGLang is a fast serving framework for large language models and vision language models.
GitHub repository for "Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging" (ICML 2025)
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
Examples and guides for using the Gemini API
🌐 Make websites accessible for AI agents. Automate tasks online with ease.
A fast, effective data attribution method for neural networks in PyTorch
SkyReels-V2: Infinite-length Film Generative model
MAGI-1: Autoregressive Video Generation at Scale
Official implementation of Inductive Moment Matching
Agno is a lightweight, high-performance library for building Agents with memory, knowledge, and reasoning.
verl: Volcano Engine Reinforcement Learning for LLMs
Leaderboard Comparing LLM Performance at Producing Hallucinations when Summarizing Short Documents
Train transformer language models with reinforcement learning.
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
[EMNLP 2024] LongAlign: A Recipe for Long Context Alignment of LLMs
PyTorch building blocks for the OLMo ecosystem
Measuring Massive Multitask Language Understanding | ICLR 2021
QwQ is the reasoning model series developed by Qwen team, Alibaba Cloud.
Google TPU optimizations for transformers models
This repo contains the dataset and code for the paper "SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?"
SWE-bench [Multimodal]: Can Language Models Resolve Real-world GitHub Issues?
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
FlashMLA: Efficient MLA decoding kernels