pplx-kernels Public
Forked from ppl-ai/pplx-kernels
Perplexity GPU Kernels
C++ MIT License Updated May 1, 2025
vllm Public
Forked from vllm-project/vllm
To learn vllm
Python Apache License 2.0 Updated Apr 30, 2025
sglang Public
Forked from sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
Python Apache License 2.0 Updated Apr 30, 2025
MoBA Public
Forked from MoonshotAI/MoBA
MoBA: Mixture of Block Attention for Long-Context LLMs
Python MIT License Updated Mar 31, 2025
vattention Public
Forked from microsoft/vattention
Dynamic Memory Management for Serving LLMs without PagedAttention
C MIT License Updated Mar 20, 2025
accel-sim-framework Public
Forked from accel-sim/accel-sim-framework
This is the top-level repository for the Accel-Sim framework.
Python Other Updated Feb 25, 2025
TinyZero Public
Forked from Jiayi-Pan/TinyZero
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
Python Apache License 2.0 Updated Feb 1, 2025
yalm Public
Forked from andrewkchan/yalm
Yet Another Language Model: LLM inference in C++/CUDA, no libraries except for I/O
C++ Updated Dec 24, 2024
ThunderKittens Public
Forked from HazyResearch/ThunderKittens
Tile primitives for speedy kernels
Cuda MIT License Updated Dec 13, 2024
tutorials Public
Forked from triton-inference-server/tutorials
This repository contains tutorials and examples for Triton Inference Server
Python BSD 3-Clause "New" or "Revised" License Updated Dec 7, 2024
lectures Public
Forked from gpu-mode/lectures
Material for cuda-mode lectures
Jupyter Notebook Apache License 2.0 Updated Aug 11, 2024
interview_internal_reference Public
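Forked from 0voice/interview_internal_reference
Latest 2023 roundup of technical interview questions from Alibaba, Tencent, Baidu, Meituan, Toutiao, and others, with answers and a summary of analyses from expert question setters.
Python Updated May 20, 2024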
any-precision-llm Public
Forked from SNU-ARC/any-precision-llm
[ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs
Python MIT License Updated May 16, 2024
triton Public
Forked from triton-lang/triton
Development repository for the Triton language and compiler
C++ MIT License Updated May 15, 2024
flash-attention Public
Forked from Dao-AILab/flash-attention
Fast and memory-efficient exact attention
Python BSD 3-Clause "New" or "Revised" License Updated May 14, 2024
HumanSystemOptimization Public
Forked from zijie0/HumanSystemOptimization
Learn healthily to age 150: an incomplete guide to tuning the human system
Updated May 9, 2024
llm.c Public
Forked from karpathy/llm.c
LLM training in simple, raw C/CUDA
Cuda MIT License Updated Apr 13, 2024
cppbestpractices Public
Forked from cpp-best-practices/cppbestpractices
Collaborative Collection of C++ Best Practices. This online resource is part of Jason Turner's collection of C++ Best Practices resources. See README.md for more information.
Other Updated Feb 8, 2024
docs Public
Forked from PaddlePaddle/docs
Documentation for PaddlePaddle
Python Apache License 2.0 Updated Dec 1, 2023
Paddle Public
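Forked from PaddlePaddle/Paddle
PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (core framework of PaddlePaddle (飞桨): high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)
C++ Updated Nov 29, 2023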
jit-exportable-models Public
Forked from PaddleJitLab/jit-exportable-models
Shell Updated Nov 20, 2023
Data-Structures-and-Algorithms-in-cpp Public
Forked from amritansh22/Data-Structures-and-Algorithms-in-cpp
To learn cpp
C++ MIT License Updated Oct 24, 2023
TensorRT Public
Forked from NVIDIA/TensorRT
NVIDIA® TensorRT™, an SDK for high-performance deep learning inference, includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for inference applicat…
C++ Apache License 2.0 Updated Oct 20, 2023
tvm Public
Forked from apache/tvm
Open deep learning compiler stack for cpu, gpu and specialized accelerators
Python Apache License 2.0 Updated Oct 10, 2023
FasterTransformer Public
Forked from NVIDIA/FasterTransformer
Transformer related optimization, including BERT, GPT
C++ Apache License 2.0 Updated Oct 2, 2023
SysML Public
Forked from Jack47/hack-SysML
The road to hack SysML and become a system expert
Emacs Lisp Updated Sep 26, 2023