shuxiaobo (Lu Junhao) / Starred · GitHub

Starred repositories

MiniCPM4: Ultra-Efficient LLMs on End Devices, achieving 5+ speedup on typical end-side chips

Jupyter Notebook 7,942 491 Updated Jun 12, 2025

A PyTorch native platform for training generative AI models

Python 3,918 392 Updated Jun 15, 2025

LLM Transparency Tool (LLM-TT), an open-source interactive toolkit for analyzing internal workings of Transformer-based language models. *Check out demo at* https://huggingface.co/spaces/facebook/l…

Python 822 63 Updated Dec 3, 2024

Pretraining code for a large-scale depth-recurrent language model

Python 781 65 Updated Jun 12, 2025
Python 2 Updated May 11, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 15,142 2,007 Updated Jun 15, 2025

A PyTorch quantization backend for Optimum

Python 950 73 Updated May 22, 2025

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

Python 1,491 148 Updated Jun 15, 2025

An interpreter for RASP as described in the ICML 2021 paper "Thinking Like Transformers"

Python 311 30 Updated Sep 16, 2024

Efficient 2:4 sparse training algorithms and implementations

Python 54 Updated Dec 8, 2024

Code for the ICML 2023 paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot".

Python 806 105 Updated Aug 20, 2024

[NeurIPS 24 Spotlight] MaskLLM: Learnable Semi-structured Sparsity for Large Language Models

Python 167 12 Updated Jan 1, 2025

A simple and effective LLM pruning approach.

Python 760 107 Updated Aug 9, 2024
Python 233 31 Updated Nov 9, 2022

Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs

Python 174 21 Updated Jun 14, 2025

A fast MoE impl for PyTorch

Python 1,744 196 Updated Feb 10, 2025

The official GitHub page for the survey paper "A Survey on Mixture of Experts in Large Language Models".

370 20 Updated Mar 12, 2025
Jupyter Notebook 134 14 Updated Apr 29, 2025

A series of technical reports on Slow Thinking with LLM

Python 695 39 Updated Jun 9, 2025

🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention"

Python 696 31 Updated Mar 19, 2025

A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.

Python 2,804 297 Updated Mar 10, 2025

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Python 5,444 617 Updated Jun 11, 2025

Efficient Triton implementation of Native Sparse Attention.

Python 166 8 Updated May 23, 2025

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,818 277 Updated May 15, 2025

DeepEP: an efficient expert-parallel communication library

Cuda 7,769 791 Updated Jun 13, 2025

maximal update parametrization (µP)

Jupyter Notebook 1,541 102 Updated Jul 17, 2024

some common Huggingface transformers in maximal update parametrization (µP)

Jupyter Notebook 80 11 Updated Mar 14, 2022

Foundation Architecture for (M)LLMs

Python 3,083 219 Updated Apr 11, 2024

Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models

Python 298 22 Updated Feb 23, 2025

[ACM CoNEXT22 Best Paper Award] NTSocks: An ultra-low latency and compatible PCIe interconnect for rack-scale disaggregation.

C 38 3 Updated Jul 11, 2024