- KAIST AI
- Seoul, Republic of Korea
- http://www.raymin0223.com
Stars
(ICLR 2025) TabM: Advancing Tabular Deep Learning With Parameter-Efficient Ensembling
Hackable and optimized Transformers building blocks, supporting a composable construction.
Causal depthwise conv1d in CUDA, with a PyTorch interface (a minimal sketch of the underlying operation follows after this list)
[ICML'25 Oral] Multi-agent Architecture Search via Agentic Supernet
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
(SLT 2024) Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition
Official Repository of the paper "Automated Filtering of Human Feedback Data for Aligning Text-to-Image Diffusion Models"
PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model"
Ongoing research training transformer models at scale
[NeurIPS 2024] Official Repository of The Mamba in the Llama: Distilling and Accelerating Hybrid Models
[ICLR 2025] LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation
(ICLR 2025) Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation
PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)
A PyTorch native platform for training generative AI models
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models
AutoMoE: Neural Architecture Search for Efficient Sparsely Activated Transformers
Paper list for Efficient Reasoning.
verl: Volcano Engine Reinforcement Learning for LLMs
Unofficial implementation for the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"
A MAD laboratory to improve AI architecture designs 🧪
Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
Democratizing Reinforcement Learning for LLMs
Janus-Series: Unified Multimodal Understanding and Generation Models
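Several of the starred projects above build on causal sequence convolutions (e.g., the causal-conv1d kernel used by Mamba-style and Jamba-style hybrids), so here is a minimal plain-PyTorch sketch of causal depthwise conv1d, the operation that repo implements as a fused CUDA kernel. The function name and signature below are illustrative assumptions for this sketch, not the repo's actual API.

```python
# Minimal sketch of causal depthwise conv1d in plain PyTorch.
# The real causal-conv1d repo fuses this into a CUDA kernel; the
# function name and signature here are assumptions for illustration.
import torch
import torch.nn.functional as F

def causal_depthwise_conv1d(x: torch.Tensor, weight: torch.Tensor) -> torch.Tensor:
    """x: (batch, channels, seqlen); weight: (channels, kernel_size)."""
    channels, kernel_size = weight.shape
    # Left-pad by kernel_size - 1 so each output position only sees
    # the current and earlier timesteps (causality).
    x = F.pad(x, (kernel_size - 1, 0))
    # groups=channels makes the convolution depthwise: one filter per channel.
    return F.conv1d(x, weight.unsqueeze(1), groups=channels)

# Usage: batch=2, channels=4, seqlen=8, kernel_size=3
x = torch.randn(2, 4, 8)
w = torch.randn(4, 3)
y = causal_depthwise_conv1d(x, w)
print(y.shape)  # torch.Size([2, 4, 8])
```

The left padding is what makes the convolution causal: output length equals input length, but each position depends only on past and current inputs, which is why this primitive shows up inside state-space and hybrid Transformer-Mamba blocks.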