raymin0223 (Sangmin Bae) / Starred · GitHub

(ICLR 2025) TabM: Advancing Tabular Deep Learning With Parameter-Efficient Ensembling

Python 397 34 Updated Jun 13, 2025

Hackable and optimized Transformers building blocks, supporting a composable construction.

Python 9,601 679 Updated Jun 18, 2025

Causal depthwise conv1d in CUDA, with a PyTorch interface

Cuda 494 106 Updated May 26, 2025

Efficient optimizers

Python 214 18 Updated Jun 15, 2025

[ICML'25 Oral] Multi-agent Architecture Search via Agentic Supernet

Python 85 11 Updated Jun 10, 2025

Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.

Python 4,588 256 Updated Jun 18, 2025

(SLT 2024) Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition

Python 12 Updated Oct 22, 2024

Official Repository of the paper "Automated Filtering of Human Feedback Data for Aligning Text-to-Image Diffusion Models"

Python 2 Updated Apr 2, 2025

PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model"

Python 171 12 Updated Apr 4, 2025

Ongoing research training transformer models at scale

Python 12,602 2,854 Updated Jun 19, 2025

[NeurIPS 2024] Official Repository of The Mamba in the Llama: Distilling and Accelerating Hybrid Models

Python 221 19 Updated May 3, 2025

[ICLR 2025] LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation

Python 172 11 Updated Mar 31, 2025

(ICLR 2025) Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation

Python 11 Updated Apr 29, 2025

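PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the PaddlePaddle (飞桨) core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)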

C++ 22,875 5,749 Updated Jun 19, 2025

A PyTorch native platform for training generative AI models

Python 3,936 399 Updated Jun 19, 2025

Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

451 13 Updated Jun 16, 2025

AutoMoE: Neural Architecture Search for Efficient Sparsely Activated Transformers

Python 47 4 Updated Oct 21, 2022

Mamba SSM architecture

Python 15,122 1,338 Updated May 25, 2025

Paper list for Efficient Reasoning.

500 19 Updated Jun 18, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 9,654 1,532 Updated Jun 19, 2025

Development of a guide chatbot for pet-friendly facilities

Python 4 4 Updated Apr 19, 2025

Unofficial implementation for the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"

Python 163 7 Updated Jun 20, 2024

s1: Simple test-time scaling

Python 6,448 749 Updated May 19, 2025

A MAD laboratory to improve AI architecture designs 🧪

Python 120 13 Updated Dec 17, 2024

Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends

Python 1,629 290 Updated Jun 18, 2025

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,824 277 Updated May 15, 2025

Democratizing Reinforcement Learning for LLMs

Jupyter Notebook 3,381 311 Updated May 13, 2025

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 17,373 2,237 Updated Feb 1, 2025