Stars
Collection of Docker images with headless VNC environments
DeepEP: an efficient expert-parallel communication library
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology
A Datacenter Scale Distributed Inference Serving Framework
NVIDIA Isaac GR00T N1 is the world's first open foundation model for generalized humanoid robot reasoning and skills.
TripoSR: Fast 3D Object Reconstruction from a Single Image
No fortress, purely open ground. OpenManus is Coming.
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
ChatLaw: A powerful LLM tailored for the Chinese legal domain (Chinese legal large language model)
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Collection of awesome LLM apps with AI Agents and RAG using OpenAI, Anthropic, Gemini, and open-source models.
Awesome-LLM: a curated list of Large Language Models
A general fine-tuning kit geared toward diffusion models.
Optimized primitives for collective multi-GPU communication
A programming framework for agentic AI 🤖 PyPI: autogen-agentchat · Discord: https://aka.ms/autogen-discord · Office Hour: https://aka.ms/autogen-officehour
A generative speech model for daily dialogue.
Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We also show you how to solve end-to-end problems using Llama models…
Inference and training library for high-quality TTS models.
🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy-to-use hardware optimization tools
🏋️ A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of Optimum's hardware optimizations & quantization schemes.
SGLang is a fast serving framework for large language models and vision language models.
llama3 implementation one matrix multiplication at a time
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
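For context on the last entry: Byte Pair Encoding trains a tokenizer by repeatedly merging the most frequent adjacent token pair into a new token. Below is a minimal sketch of that training loop, assuming byte-level input; the helper names (get_pair_counts, merge, train) are illustrative and not taken from the repo's actual API.

```python
# Hypothetical minimal sketch of byte-level BPE training (not the starred repo's API).
from collections import Counter

def get_pair_counts(ids):
    """Count occurrences of each adjacent token-id pair."""
    return Counter(zip(ids, ids[1:]))

def merge(ids, pair, new_id):
    """Replace every occurrence of `pair` in `ids` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i < len(ids) - 1 and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out

def train(text, num_merges):
    """Learn `num_merges` BPE merges over the UTF-8 bytes of `text`."""
    ids = list(text.encode("utf-8"))        # start from raw bytes (ids 0..255)
    merges = {}                              # (pair) -> new token id
    for step in range(num_merges):
        counts = get_pair_counts(ids)
        if not counts:
            break
        pair = counts.most_common(1)[0][0]   # most frequent adjacent pair
        new_id = 256 + step                  # new ids sit above the byte range
        ids = merge(ids, pair, new_id)
        merges[pair] = new_id
    return merges
```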