gty111 (Tianyu Guo) · GitHub

Tianyu Guo gty111

🎯 Focusing

Ph.D. student at Sun Yat-sen University, prior intern @Tencent. Simulators, GPU architecture, AI Infra, MLSys

93 followers · 92 following

Sun Yat-sen University
Guangzhou
https://gty111.github.io/info/

gty111/README.md

Ph.D. student at Sun Yat-sen University
AI Infra, MLSys, Simulators, GPU architecture
Visit my personal website

News

[2025/4/27] We have released gLLM, an efficient pipeline-parallel inference engine for LLMs.

PRs for Projects

sglang: Fix port number overflow
xDiT: Enable warm up for VAE
xDiT: Fix parallel vae
DistVAE: Fix batch dimension
vLLM: [Benchmark] Refactor sample_requests in benchmark_throughput
vLLM: [Bugfix] fix automatic prefix args and add log info
vLLM: [Minor Fix] Fix comments in benchmark_serving
vLLM: [Minor Fix] Remove unused code in benchmark_prefix_caching.py
TVM: [Doc] Fix minor error in "Expressions in Relay"
TVM: [Doc] Fix minor error in doc (Add an operator to Relay)

Pinned

gLLM Public

gLLM: Global Balanced Pipeline Parallelism System for Distributed LLM Serving with Token Throttling

Python · 11 stars · 1 fork
PTX-EMU Public

PTX-EMU is a simple emulator for CUDA programs.

C++ · 31 stars · 5 forks
GEMM_MMA Public

Optimize GEMM with Tensor Cores step by step

26 stars · 5 forks
SimpleUseGpgpuSim Public

GPGPU-SIM usage guide

Shell · 14 stars · 1 fork
GEMM_WMMA Public

GEMM by WMMA (Tensor Cores)

Cuda · 12 stars · 7 forks
ConvNN Public

A simple CNN training framework with CPU and GPU (cuDNN) support

C++ · 3 stars