- University of Electronic Science and Technology of China
- Chengdu
- space.keter.host
Starred repositories
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
thunlp / Seq1F1B
Forked from NVIDIA/Megatron-LM. Sequence-level 1F1B schedule for LLMs.
Lightweight coding agent that runs in your terminal
Code for the ICML 2023 paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot".
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
This project shares the technical principles of large language models along with hands-on experience (LLM engineering and real-world application deployment).
Robust Speech Recognition via Large-Scale Weak Supervision
DeepEP: an efficient expert-parallel communication library
🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports Multi AI Providers( OpenAI / Claude 3 / Gemini / Ollama / DeepSeek / Qwen), Knowledge Base (file upload / knowledge managemen…
Fully open reproduction of DeepSeek-R1
Puzzles for learning Triton; play with minimal environment configuration!
🚀🚀 Train a 26M-parameter GPT completely from scratch in just 2 hours! 🌏
LLM interview notes and answers: this repository records interview questions and reference answers for large language model (LLM) algorithm engineers.
A smarter cd command. Supports all major shells.
Unified KV Cache Compression Methods for Auto-Regressive Models
[EMNLP 2024 Findings🔥] Official implementation of "LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context Inference"
📚A curated list of Awesome LLM/VLM Inference Papers with codes: WINT8/4, FlashAttention, PagedAttention, Parallelism, MLA, etc.
AirLLM: 70B model inference on a single 4GB GPU
SGLang is a fast serving framework for large language models and vision language models.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
A high-throughput and memory-efficient inference and serving engine for LLMs