lambda7xx (Xiao) / Starred · GitHub
  • Shanghai Jiao Tong University
  • Shanghai

Organizations

@cs61


Big & Small LLMs working together

Python 1,001 113 Updated Jun 25, 2025
Python 257 17 Updated May 1, 2024

Code for the preprint "Cache Me If You Can: How Many KVs Do You Need for Effective Long-Context LMs?"

13 Updated Jun 19, 2025
Go 22 3 Updated Apr 28, 2025
Python 8 2 Updated Jun 22, 2025
Python 7 1 Updated Jun 25, 2025
Python 6 1 Updated Jun 18, 2025

[ISCA'25] LIA: A Single-GPU LLM Inference Acceleration with Cooperative AMX-Enabled CPU-GPU Computation and CXL Offloading

Python 5 1 Updated Jun 15, 2025

Artifact of Chimera

Python 8 1 Updated May 6, 2025

Efficient Compute-Communication Overlap for Distributed LLM Inference

Python 13 Updated Jun 25, 2025
Python 80 11 Updated Nov 25, 2024

slime is an LLM post-training framework aimed at scaling RL.

Python 453 20 Updated Jun 25, 2025

Repo for SeedVR2 & SeedVR (CVPR2025 Highlight)

Python 238 15 Updated Jun 22, 2025

TreeRL: LLM Reinforcement Learning with On-Policy Tree Search in ACL'25

Python 32 2 Updated Jun 16, 2025

MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model.

Python 2,316 163 Updated Jun 19, 2025
Python 5 1 Updated Apr 5, 2025

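Housing price heat maps for various cities: Hangzhou, Beijing, Shanghai, Suzhou, Tianjin, Chengdu, Nanjing, Changsha, Wuxi, Nanning, Taiyuan, Qingdao, Nanchang, Zhengzhou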

HTML 18 7 Updated Jan 5, 2020

Nano vLLM

Python 4,057 434 Updated Jun 24, 2025

a simple API to use CUPTI

C++ 10 1 Updated Dec 16, 2024

Chinese translation of "Clean Architecture"

Shell 742 307 Updated Jan 15, 2025

A list of works on video generation towards world model

153 2 Updated Jun 22, 2025
Python 84 4 Updated May 22, 2025

Learn CUDA with PyTorch

Cuda 27 3 Updated Jun 23, 2025

Query-agnostic KV cache eviction: 3–4× reduction in memory and 2× decrease in latency (Qwen3/2.5, Gemma3, LLaMA3)

Python 86 2 Updated Jun 11, 2025

The repository for ATC'25 paper "Greyhound: Hunting Fail-Slows in Hybrid-Parallel Training at Scale"

Python 6 1 Updated May 4, 2025

[NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models"

Python 163 11 Updated Mar 4, 2025

A version of verl to support tool use

Python 260 15 Updated Jun 26, 2025