Stars
SGLang is a fast serving framework for large language models and vision language models.
Minimalistic 3D-parallelism training for large language models
A high-throughput and memory-efficient inference and serving engine for LLMs
Recipes for reproducing training and serving benchmarks for large machine learning models using GPUs on Google Cloud.
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
DeepEP: an efficient expert-parallel communication library
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
Large Language Model Text Generation Inference
verl: Volcano Engine Reinforcement Learning for LLMs
RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.
Fully open reproduction of DeepSeek-R1
Qwen3 is the large language model series developed by the Qwen team at Alibaba Cloud.
Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models
[ACL'24 Outstanding] Data and code for L-Eval, a comprehensive benchmark for evaluating long-context language models
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Low-code framework for building custom LLMs, neural networks, and other AI models
Your shell history: synced, queryable, and in context