Stars
Suna - Open Source Generalist AI Agent
Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3.
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
A high-throughput and memory-efficient inference and serving engine for LLMs
FlashMLA: Efficient MLA decoding kernels
FlashInfer: Kernel Library for LLM Serving
SGLang is a fast serving framework for large language models and vision language models.
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬
Quantized Attention achieves speedups of 2-5x and 3-11x compared to FlashAttention and xformers, without losing end-to-end metrics across language, image, and video models.
LLMPerf is a library for validating and benchmarking LLMs
[EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".
📚 A curated list of Awesome LLM Inference Papers with Code.
Awesome LLM compression research papers and tools.
📱 Collaborative List of Open-Source iOS Apps
A Comprehensive Toolkit for High-Quality PDF Content Extraction
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…
Fast and memory-efficient exact attention
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train Qwen3, Llama 4, DeepSeek-R1, Gemma 3, TTS 2x faster with 70% less VRAM.
SwiftUI-DesignCode is a collection of examples written while learning SwiftUI 2.0
LLM based autonomous agent that conducts deep local and web research on any topic and generates a long report with citations.
Universal LLM Deployment Engine with ML Compilation
openpilot is an operating system for robotics. Currently, it upgrades the driver assistance system on 300+ supported cars.
You like pytorch? You like micrograd? You love tinygrad! ❤️
Langchain-Chatchat (formerly Langchain-ChatGLM): RAG and Agent application built on Langchain and language models such as ChatGLM, Qwen, and Llama | Langchain-Chatchat (formerly langchain-ChatGLM), local knowledge based LLM (like ChatGLM, Qwen and…
Open-source simulator for autonomous driving research.
Open source simulator for autonomous vehicles built on Unreal Engine / Unity, from Microsoft AI & Research
tonylt / tvm
Forked from apache/tvm
Open deep learning compiler stack for CPU, GPU, and specialized accelerators