- Peking University
- https://light-of-hers.github.io
- https://www.zhihu.com/people/yi-guang-99-48
Starred repositories
This repository serves as a comprehensive survey of LLM development, featuring numerous research papers along with their corresponding code links.
Universal battlefield-adaptive Operator Evaluation Protocol for Arknights
[DAC'25] Official implementation of "HybriMoE: Hybrid CPU-GPU Scheduling and Cache Management for Efficient MoE Inference"
Processed / Cleaned Data for Paper Copilot
PDF2zh for Zotero | a Zotero plugin for Chinese translation of PDFs
PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents with the original layout fully preserved; supports Google/DeepL/Ollama/OpenAI and other services; provides CLI/GUI/MCP/Docker/Zotero interfaces
A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.
[ICML 2025 Spotlight] ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference
H2-LLM: Hardware-Dataflow Co-Exploration for Heterogeneous Hybrid-Bonding-based Low-Batch LLM Inference
Open deep learning compiler stack for Kendryte AI accelerators ✨
MAGI-1: Autoregressive Video Generation at Scale
A Distributed Attention Towards Linear Scalability for Ultra-Long Context, Heterogeneous Data Training
This is an experimental library that has evolved into P2688
match(it): A lightweight single-header pattern-matching library for C++17 with macro-free APIs.
Python interface for MLIR - the Multi-Level Intermediate Representation
Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance.
Distributed Compiler Based on Triton for Parallel Systems
Python Fire is a library for automatically generating command line interfaces (CLIs) from absolutely any Python object.
SpInfer: Leveraging Low-Level Sparsity for Efficient Large Language Model Inference on GPUs
Simple, Elegant, Typed Argument Parsing with argparse
XAttention: Block Sparse Attention with Antidiagonal Scoring
ademeure / DeeperGEMM
Forked from deepseek-ai/DeepGEMM. DeeperGEMM: crazy optimized version
High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.
Democratizing Reinforcement Learning for LLMs