Stars
Tile primitives for speedy kernels
A domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
VPTQ: a flexible, extreme low-bit quantization algorithm
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
GameStream client for PCs (Windows, Mac, Linux, and Steam Link)
A tool to modify ONNX models visually, based on Netron and Flask.
DeepGNN is a framework for training machine learning models on large-scale graph data.
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
Tutel MoE: an optimized Mixture-of-Experts library with support for DeepSeek FP8/FP4