Xiao lambda7xx

Build Systems. A journey of a thousand miles begins with a single step.

272 followers · 1.3k following

Shanghai Jiao Tong University
Shanghai

Achievements

x2 x2

Lists (1)

LLM Serving

5 repositories

Stars

HazyResearch / minions

Big & Small LLMs working together

Python 1,001 113 Updated Jun 25, 2025

FasterDecoding / SnapKV

Python 257 17 Updated May 1, 2024

princeton-pli / PruLong

Code for the preprint "Cache Me If You Can: How Many KVs Do You Need for Effective Long-Context LMs?"

13 Updated Jun 19, 2025

sarchlab / triosim

Go 22 3 Updated Apr 28, 2025

google / rago

Python 8 2 Updated Jun 22, 2025

S4AI-CornellTech / Hermes

Python 7 1 Updated Jun 25, 2025

hwnam831 / meshslice

Python 6 1 Updated Jun 18, 2025

Hyungyo1 / LIA_AMXGPU

[ISCA'25] LIA: A Single-GPU LLM Inference Acceleration with Cooperative AMX-Enabled CPU-GPU Computation and CXL Offloading

Python 5 1 Updated Jun 15, 2025

redbird-arch / isca2025-chimera-artifact

Artifact of Chimera

Python 8 1 Updated May 6, 2025

microsoft / tokenweave

Efficient Compute-Communication Overlap for Distributed LLM Inference

Python 13 Updated Jun 25, 2025

MooreThreads / TurboRAG

Python 80 11 Updated Nov 25, 2024

THUDM / slime

slime is an LLM post-training framework aimed at scaling RL.

Python 453 20 Updated Jun 25, 2025

ByteDance-Seed / SeedVR

Repo for SeedVR2 & SeedVR (CVPR2025 Highlight)

Python 238 15 Updated Jun 22, 2025

THUDM / TreeRL

TreeRL: LLM Reinforcement Learning with On-Policy Tree Search in ACL'25

Python 32 2 Updated Jun 16, 2025

MiniMax-AI / MiniMax-M1

MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model.

Python 2,316 163 Updated Jun 19, 2025

aquaml / aqua

Python 5 1 Updated Apr 5, 2025

aWangami / HouseHeatMap

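Housing price heat maps for various cities: Hangzhou, Beijing, Shanghai, Suzhou, Tianjin, Chengdu, Nanjing, Changsha, Wuxi, Nanning, Taiyuan, Qingdao, Nanchang, Zhengzhou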

HTML 18 7 Updated Jan 5, 2020

furiosa-ai / draft-based-approx-llm

6 Updated Jun 19, 2025

GeeeekExplorer / nano-vllm

Nano vLLM

Python 4,057 434 Updated Jun 24, 2025

wu-kan / wuk_cupti_wrapper

A simple API for using CUPTI

C++ 10 1 Updated Dec 16, 2024

zhusq20 / Awesome-SWE-tutorial

1 Updated Jun 12, 2025

leewaiho / Clean-Architecture-zh

Chinese translation of "Clean Architecture"

Shell 742 307 Updated Jan 15, 2025

ziqihuangg / Awesome-From-Video-Generation-to-World-Model

A list of works on video generation toward world models

153 2 Updated Jun 22, 2025

horseee / dKV-Cache

Python 84 4 Updated May 22, 2025

HanGuo97 / log-linear-attention

Python 218 12 Updated Jun 6, 2025

gau-nernst / learn-cuda

Learn CUDA with PyTorch

Cuda 27 3 Updated Jun 23, 2025

snu-mllab / KVzip

Query-agnostic KV cache eviction: 3–4× reduction in memory and 2× decrease in latency (Qwen3/2.5, Gemma3, LLaMA3)

Python 86 2 Updated Jun 11, 2025

wutianyuan1 / Greyhound

The repository for ATC'25 paper "Greyhound: Hunting Fail-Slows in Hybrid-Parallel Training at Scale"

Python 6 1 Updated May 4, 2025

HKUNLP / diffusion-of-thoughts

[NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models"

Python 163 11 Updated Mar 4, 2025

TIGER-AI-Lab / verl-tool

A version of verl to support tool use

Python 260 15 Updated Jun 26, 2025