pokerfaceSad (XinYuan) / Starred · GitHub
😶
Talk Is Cheap

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 51,058 8,409 Updated Jun 30, 2025

A Datacenter Scale Distributed Inference Serving Framework

Rust 4,379 455 Updated Jun 30, 2025

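AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks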

Jupyter Notebook 14,158 2,043 Updated Jun 17, 2025

FlashMLA: Efficient MLA decoding kernels

Cuda 11,634 872 Updated Apr 29, 2025

A PyTorch Native LLM Training Framework

Python 824 49 Updated Dec 27, 2024

📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA/Tensor Cores Kernels, HGEMM, FA-2 MMA.🎉

Cuda 5,180 551 Updated Jun 29, 2025

Step-by-step optimization of CUDA SGEMM

Cuda 346 46 Updated Mar 30, 2022

learning how CUDA works

Cuda 273 38 Updated Mar 3, 2025

Simple tutorials on PyTorch DDP training

Python 281 49 Updated Aug 19, 2022

CUDA checkpoint and restore utility

C 344 19 Updated Jan 27, 2025

collection of benchmarks to measure basic GPU capabilities

C++ 386 55 Updated Feb 11, 2025

HAMi-core compiles libvgpu.so, which enforces a hard limit on GPU usage within a container

C 173 87 Updated Jun 27, 2025

The road to hack SysML and become a system expert

Emacs Lisp 489 61 Updated Sep 25, 2024

LLM Inference benchmark

Python 421 39 Updated Jul 23, 2024

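This project aims to share the technical principles and hands-on experience of large models (large-model engineering and deploying large-model applications)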

HTML 18,920 2,255 Updated Jun 22, 2025

GLake: optimizing GPU memory management and IO transmission.

Python 469 41 Updated Mar 24, 2025

Practical GPU Sharing Without Memory Size Constraints

C 272 29 Updated Mar 28, 2025

Hooked CUDA-related dynamic libraries by using automated code generation tools.

C 158 43 Updated Dec 12, 2023

K8s-club for learning, sharing, and exploring the K8s world :)

497 97 Updated May 7, 2025

Open, Multi-Cloud, Multi-Cluster Kubernetes Orchestration

Go 4,891 957 Updated Jun 30, 2025

State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

Jupyter Notebook 14,357 3,353 Updated Aug 12, 2024

Heterogeneous AI Computing Virtualization Middleware (Project under CNCF)

Go 1,836 322 Updated Jun 30, 2025

NVIDIA Linux open GPU kernel module source

C 15,933 1,424 Updated Jun 17, 2025

Awesome resources for GPUs

572 54 Updated Jul 1, 2023

An awesome & curated list of best LLMOps tools for developers

Shell 5,035 486 Updated Jun 24, 2025

A QoS-based scheduling system that brings optimal layout and status to workloads such as microservices, web services, big data jobs, AI jobs, etc.

Go 1,535 370 Updated Jun 30, 2025

Programmer's guide to cooking at home (Simplified Chinese only).

Dockerfile 90,408 10,307 Updated Jun 30, 2025

Questions to ask the interviewer at the end of a technical interview

18,164 1,383 Updated Mar 4, 2024

A minimalist operations handbook for team management

598 31 Updated Apr 8, 2023

A Kubernetes plugin that enables dynamically adding or removing GPU resources for a running Pod

Go 126 29 Updated Feb 23, 2022