Stars
Fast and memory-efficient exact attention
SGLang is a fast serving framework for large language models and vision language models.
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
ML4ML: Automated Invariance Testing for Machine Learning Models
A RocksDB-compatible KV storage engine with better performance
Jittor is a high-performance deep learning framework based on JIT compiling and meta-operators.
Repo for counting stars and contributing. Press F to pay respect to glorious developers.
A library for efficient similarity search and clustering of dense vectors.
A framework for distributed systems verification, with fault injection
Diff algorithm that understands HTML, in the browser.