- KAIST AI
- Seoul, Republic of Korea
- http://www.raymin0223.com
Stars
(ICLR 2025) TabM: Advancing Tabular Deep Learning With Parameter-Efficient Ensembling
Hackable and optimized Transformers building blocks, supporting a composable construction.
Causal depthwise conv1d in CUDA, with a PyTorch interface (a minimal sketch of the underlying operation follows after this list)
[ICML'25 Oral] Multi-agent Architecture Search via Agentic Supernet
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
(SLT 2024) Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition
Official Repository of the paper "Automated Filtering of Human Feedback Data for Aligning Text-to-Image Diffusion Models"
PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model"
Ongoing research training transformer models at scale
[NeurIPS 2024] Official Repository of The Mamba in the Llama: Distilling and Accelerating Hybrid Models
[ICLR 2025] LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation
(ICLR 2025) Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation
PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)
A PyTorch native platform for training generative AI models
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models
AutoMoE: Neural Architecture Search for Efficient Sparsely Activated Transformers
Paper list for Efficient Reasoning.
verl: Volcano Engine Reinforcement Learning for LLMs
Unofficial implementation for the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"
A MAD laboratory to improve AI architecture designs 🧪
Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
Democratizing Reinforcement Learning for LLMs
Janus-Series: Unified Multimodal Understanding and Generation Models
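Several of the starred projects above build on causal sequence convolutions (e.g., the causal-conv1d kernel used by Mamba-style and Jamba-style hybrids), so here is a minimal plain-PyTorch sketch of causal depthwise conv1d, the operation that repo implements as a fused CUDA kernel. The function name and signature below are illustrative assumptions for this sketch, not the repo's actual API.

```python
# Minimal sketch of causal depthwise conv1d in plain PyTorch.
# The real causal-conv1d repo fuses this into a CUDA kernel; the
# function name and signature here are assumptions for illustration.
import torch
import torch.nn.functional as F

def causal_depthwise_conv1d(x: torch.Tensor, weight: torch.Tensor) -> torch.Tensor:
    """x: (batch, channels, seqlen); weight: (channels, kernel_size)."""
    channels, kernel_size = weight.shape
    # Left-pad by kernel_size - 1 so each output position only sees
    # the current and earlier timesteps (causality).
    x = F.pad(x, (kernel_size - 1, 0))
    # groups=channels makes the convolution depthwise: one filter per channel.
    return F.conv1d(x, weight.unsqueeze(1), groups=channels)

# Usage: batch=2, channels=4, seqlen=8, kernel_size=3
x = torch.randn(2, 4, 8)
w = torch.randn(4, 3)
y = causal_depthwise_conv1d(x, w)
print(y.shape)  # torch.Size([2, 4, 8])
```

The left padding is what makes the convolution causal: output length equals input length, but each position depends only on past and current inputs, which is why this primitive shows up inside state-space and hybrid Transformer-Mamba blocks.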