Stars
APOLLO: SGD-like Memory, AdamW-level Performance
ICLR 2025 - official implementation for "I-Con: A Unifying Framework for Representation Learning"
Finetune Qwen3, Llama 4, TTS, DeepSeek-R1 & Gemma 3 LLMs 2x faster with 70% less memory! 🦥
The official implementation of ICLR 2025 Workshop paper "DLO: Dynamic Layer Operation for Efficient Vertical Scaling of LLMs".
DFloat11: Lossless LLM Compression for Efficient GPU Inference
High-Resolution Image Synthesis with Latent Diffusion Models
zzhbrr / mlsys-arxiv-daily
Forked from Vincentqyw/cv-arxiv-daily
🎓 Automatically Update MLSys Papers Daily using GitHub Actions (Update Every 12 Hours)
arXiv.org non-official badge implementation for Markdown files
Turn your README(s) into a basic static site
Master programming by recreating your favorite technologies from scratch.
Analyze computation-communication overlap in V3/R1.
A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
A high-throughput and memory-efficient inference and serving engine for LLMs
Using GitHub Action to collect paper list with publicly available source code in the daily arxiv
This is a plugin of nonebot2, subscribing arxiv RSS and pushing new papers daily.
ChatGPT/Gemini/DeepSeek based personalized arXiv paper assistant bot for automatic paper filtering. Powerful, free, and easy-to-use.
Remote Access your GitHub Actions via Browser Based VS Code