-
torchtitan Public
Forked from pytorch/torchtitan
A native PyTorch library for large model training
Python BSD 3-Clause "New" or "Revised" License Updated May 9, 2025 -
llm-analysis Public
Latency and Memory Analysis of Transformer Models for Training and Inference
-
Diffusion-Models-pytorch Public
Forked from tcapelle/Diffusion-Models-pytorch
PyTorch implementation of Diffusion Models (https://arxiv.org/pdf/2006.11239.pdf)
Jupyter Notebook Apache License 2.0 Updated Nov 25, 2024 -
vllm Public
Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Python Apache License 2.0 Updated Nov 5, 2024 -
VILA Public
Forked from NVlabs/VILA
VILA - a multi-image visual language model with training, inference, and evaluation recipes, deployable from cloud to edge (Jetson Orin and laptops)
Python Apache License 2.0 Updated Oct 31, 2024 -
attorch Public
Forked from BobMcDear/attorch
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
Python MIT License Updated Oct 25, 2024 -
cutlass Public
Forked from NVIDIA/cutlass
CUDA Templates for Linear Algebra Subroutines
C++ Other Updated Oct 14, 2024 -
llm-compressor Public
Forked from vllm-project/llm-compressor
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
Python Apache License 2.0 Updated Aug 23, 2024 -
marlin Public
Forked from IST-DASLab/marlin
FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.
Python Apache License 2.0 Updated Aug 15, 2024 -
sampleproject Public template
Forked from pypa/sampleproject
A sample project that exists for PyPUG's "Tutorial on Packaging and Distributing Projects"
Python MIT License Updated Aug 6, 2024 -
transformers Public
Forked from huggingface/transformers
🤗 Transformers: State-of-the-art Natural Language Processing for PyTorch and TensorFlow 2.0.
Python Apache License 2.0 Updated Jul 23, 2024 -
minRF Public
Forked from cloneofsimo/minRF
Minimal implementation of scalable rectified flow transformers, based on SD3's approach
Jupyter Notebook Apache License 2.0 Updated Jul 1, 2024 -
theme-academic-cv Public template
Forked from HugoBlox/theme-academic-cv
🎓 Easily create a beautiful academic résumé or educational website using Hugo and GitHub. No code.
TeX MIT License Updated Jun 2, 2024 -
llm-foundry Public
Forked from mosaicml/llm-foundry
LLM training code for MosaicML foundation models
Python Apache License 2.0 Updated May 28, 2024 -
composer Public
Forked from mosaicml/composer
Supercharge Your Model Training
-
KVQuant Public
Forked from SqueezeAILab/KVQuant
KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
Python Updated Mar 13, 2024 -
AutoGPTQ Public
Forked from AutoGPTQ/AutoGPTQ
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
Python MIT License Updated Mar 13, 2024 -
TransformerEngine Public
Forked from NVIDIA/TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilizatio…
Python Apache License 2.0 Updated Mar 12, 2024 -
TensorRT-LLM Public
Forked from NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
C++ Apache License 2.0 Updated Feb 26, 2024 -
nccl-tests Public
Forked from NVIDIA/nccl-tests
NCCL Tests
Cuda BSD 3-Clause "New" or "Revised" License Updated Feb 26, 2024 -
DeepSpeed Public
Forked from deepspeedai/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.
Python Apache License 2.0 Updated Nov 28, 2023 -
transformer_framework Public
Forked from lessw2020/transformer_framework
Framework for plug-and-play use of various transformers (vision and NLP) with FSDP
Python Apache License 2.0 Updated Sep 27, 2023 -
superbenchmark Public
Forked from microsoft/superbenchmark
A validation and profiling tool for AI infrastructure
Python MIT License Updated Aug 22, 2023 -
llm-awq Public
Forked from mit-han-lab/llm-awq
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Python MIT License Updated Aug 2, 2023