hebiao064
- Sunnyvale, CA
- hebiao064.github.io
- LinkedIn: in/biao-he
- @hebiao064
Starred repositories
slime is an LLM post-training framework aimed at scaling RL.
hebiao064 / sglang
Forked from sgl-project/sglang. SGLang is a fast serving framework for large language models and vision language models.
hebiao064 / verl
Forked from volcengine/verl. verl: Volcano Engine Reinforcement Learning for LLMs
Allow torch tensor memory to be released and resumed later
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
📄 Awesome CV is a LaTeX template for your outstanding job application
cudnn_frontend provides a C++ wrapper for the cuDNN backend API and samples showing how to use it
sgl-project / sgl-attn
Forked from Dao-AILab/flash-attention. Fast and memory-efficient exact attention
A toolkit to run Ray applications on Kubernetes
An NVIDIA-curated collection of educational resources on general-purpose GPU programming.
verl: Volcano Engine Reinforcement Learning for LLMs
A next generation Python CMake adaptor and Python API for plugins
Seamless operability between C++11 and Python
ademeure / DeeperGEMM
Forked from deepseek-ai/DeepGEMM. DeeperGEMM: crazy optimized version
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA/Tensor Cores Kernels, HGEMM, FA-2 MMA.
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations for efficient inference on NVIDIA GPUs.
A domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
How to optimize common algorithms in CUDA.
FlashInfer: Kernel Library for LLM Serving
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
A curated collection of resources, tutorials, and best practices for learning and mastering NVIDIA CUTLASS
[ACL 2025] CoT-ICL Lab: A Synthetic Framework for Studying Chain-of-Thought Learning from In-Context Demonstrations