hebiao064
- Sunnyvale, CA
- hebiao064.github.io
- LinkedIn: in/biao-he
- @hebiao064
Starred repositories
slime is an LLM post-training framework aimed at scaling RL.
hebiao064 / sglang
Forked from sgl-project/sglang. SGLang is a fast serving framework for large language models and vision language models.
hebiao064 / verl
Forked from volcengine/verl. verl: Volcano Engine Reinforcement Learning for LLMs
Allow torch tensor memory to be released and resumed later
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
📄 Awesome CV is a LaTeX template for your outstanding job application
cudnn_frontend provides a C++ wrapper for the cuDNN backend API and samples showing how to use it
sgl-project / sgl-attn
Forked from Dao-AILab/flash-attention. Fast and memory-efficient exact attention
A toolkit to run Ray applications on Kubernetes
An NVIDIA-curated collection of educational resources on general-purpose GPU programming.
verl: Volcano Engine Reinforcement Learning for LLMs
A next generation Python CMake adaptor and Python API for plugins
Seamless operability between C++11 and Python
ademeure / DeeperGEMM
Forked from deepseek-ai/DeepGEMM. DeeperGEMM: crazy optimized version
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA/Tensor Cores Kernels, HGEMM, FA-2 MMA.
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations for efficient inference on NVIDIA GPUs.
A domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
How to optimize common algorithms in CUDA.
FlashInfer: Kernel Library for LLM Serving
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
A curated collection of resources, tutorials, and best practices for learning and mastering NVIDIA CUTLASS
[ACL 2025] CoT-ICL Lab: A Synthetic Framework for Studying Chain-of-Thought Learning from In-Context Demonstrations