Stars
MMaDA - Open-Sourced Multimodal Large Diffusion Language Models
[CVPR 2025] Magma: A Foundation Model for Multimodal AI Agents
My learning notes and code for ML systems (MLSys).
Implementation of the sparse attention pattern proposed by the DeepSeek team in their "Native Sparse Attention" paper.
Codebase for the paper "Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs".
No fortress, purely open ground. OpenManus is Coming.
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
MoBA: Mixture of Block Attention for Long-Context LLMs
🦜🔗 Build context-aware reasoning applications
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
[Lumina Embodied AI Community] Embodied-AI-Guide: a technical guide to embodied AI.
Integrate the DeepSeek API into popular software.
Official implementation for the EMNLP 2024 (main) paper "AgentReview: Exploring Academic Peer Review with LLM Agent."
RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.
The official GitHub page for the survey paper "A Survey of RWKV".
PyTorch implementation of Simplified Structured State-Spaces for Sequence Modeling (S5)
Implementation of GateLoop Transformer in PyTorch and JAX
Possibly futile attempt at grounding hype with theory and fundamentals
This project covers the knowledge points and code implementations frequently tested in machine learning, deep learning, and NLP interviews, as well as the foundational theory every algorithm engineer should know.
Train transformer language models with reinforcement learning.