Seoul National University
- Seoul, Korea
- jjihwan.github.io
- in/jjihwan
Stars
The devkit of the nuScenes dataset.
Interactive visualizations of the geometric intuition behind diffusion models.
[TMLR 2025🔥] A survey of autoregressive models in vision.
Implementation of the paper "MaskBit: Embedding-free Image Generation from Bit Tokens"
Implementation of the proposed MaskBit from Bytedance AI
A high-throughput and memory-efficient inference and serving engine for LLMs
A TTS model capable of generating ultra-realistic dialogue in one pass.
Official Repository of Absolute Zero Reasoner
Official repository for LegoGPT, the first approach for generating physically stable LEGO brick models from text prompts.
[ICLR 2025 Oral] Seer: Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
F Lite is a 10B parameter diffusion model created by Freepik and Fal, trained exclusively on copyright-safe and SFW content.
MAGI-1: Autoregressive Video Generation at Scale
(CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
Memory-Guided Diffusion for Expressive Talking Video Generation
Official Implementation of Video-T1: Test-Time Scaling for Video Generation
EDM2 and Autoguidance -- Official PyTorch implementation
Code for the paper 'Redefining Temporal Modeling in Video Diffusion: The Vectorized Timestep Approach'
Pusa: Thousands Timesteps Video Diffusion Model
Official PyTorch implementation of One-Minute Video Generation with Test-Time Training
Official implementation of HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance
DCT (discrete cosine transform) functions for PyTorch
Official repository for EXAONE 3.5 built by LG AI Research
Official repository for EXAONE Deep built by LG AI Research
Official PyTorch Implementation of "Optimal Stepsize for Diffusion Sampling".
Wan: Open and Advanced Large-Scale Video Generative Models
(CVPR 2024) MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding