Stars
- Ongoing research on training transformer models at scale
- This package contains the original 2012 AlexNet code.
- verl: Volcano Engine Reinforcement Learning for LLMs
- An open-source RL system from ByteDance Seed and Tsinghua AIR
- Official repo for the paper "DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning"
- EliteKV: Scalable KV Cache Compression via RoPE Frequency Selection and Joint Low-Rank Projection
- Source code for the paper "A Stronger Mixture of Low-Rank Experts for Fine-Tuning Foundation Models" (ICML 2025)
- Democratizing Reinforcement Learning for LLMs
- Train transformer language models with reinforcement learning.
- Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch
- Code for the paper "Joint Localization and Activation Editing for Low-Resource Fine-Tuning"
- EvaByte: Efficient Byte-level Language Models at Scale
- 🚀 Efficient implementations of state-of-the-art linear attention models in Torch and Triton
- The official GitHub page for the survey paper "A Survey of RWKV"
- [NeurIPS 2023] MeZO: Fine-Tuning Language Models with Just Forward Passes. https://arxiv.org/abs/2305.17333
- FlashInfer: Kernel Library for LLM Serving
- Initialization using Update Approximation is a Silver Bullet for Extremely Efficient Low-Rank Fine-Tuning
- This is the official implementation of our paper "Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension"
- VPTQ: a flexible and extreme low-bit quantization algorithm
- Agent S: an open agentic framework that uses computers like a human
- PyTorch implementation of Hyperbolic Fine-Tuning for LLMs
- [ICLR 2025] TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention
- [ICML 2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed Low-Rank Adaptation (a minimal low-rank adapter sketch follows this list)
- Official repository for the ICML 2024 paper "MoRe Fine-Tuning with 10x Fewer Parameters"
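A recurring theme across several of the fine-tuning entries above (DoRA, MoRe, EliteKV, the low-rank initialization paper) is the low-rank adapter pattern: freeze the pretrained weights and train a small rank-r correction. The sketch below is a minimal, hypothetical PyTorch illustration of that generic pattern, not code from any of the listed repositories; the `LoRALinear` name and the hyperparameters are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Hypothetical sketch of a low-rank adapter: the frozen base layer
    computes W x, and a trainable rank-r update adds (alpha/r) * B A x.
    Not taken from any repository listed above."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # only the low-rank factors are trained
        # A: (r, in_features), small random init; B: (out_features, r), zero init
        # so the adapter contributes nothing before training begins.
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

# Usage: wrap a projection inside a transformer block.
layer = LoRALinear(nn.Linear(768, 768))
out = layer(torch.randn(2, 768))
print(out.shape)  # torch.Size([2, 768])
```

Training only A and B is what lets these methods adapt large models with a small fraction of the parameters; methods such as DoRA and MoRe build variations on this same decomposition.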