Stars
[CVPRW 2025] Code for SVAD: From Single Image to 3D Avatar via Synthetic Data Generation with Video Diffusion and Data Augmentation
Qwen3 is the large language model series developed by the Qwen team, Alibaba Cloud.
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
High-Resolution 3D Assets Generation with Large Scale Hunyuan3D Diffusion Models.
[ECCV 2024] RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models
[SIGGRAPH 2025] LAM: Large Avatar Model for One-shot Animatable Gaussian Head
An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL
[ICLR 2025 Spotlight] Official implementation of "Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts"
✨✨R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning
Democratizing Reinforcement Learning for LLMs
IDOL: Instant Photorealistic 3D Human Creation from a Single Image. An open-source project for fast, high-fidelity, and generalizable 3D human reconstruction from a single image.
Code and dataset for photorealistic Codec Avatars driven from audio
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
🚀🚀 [Large Model] Train a small 26M-parameter GPT completely from scratch in just 2 hours!
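Calling large models is now routine work in AI projects, but their output is often uncontrollable, while downstream tasks still need output that is controllable and parseable. We explore a development approach that works closely with Python development.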
[CVPR 2025] A Large-Scale High-Quality Dataset for Enhancing Human-Centric Video Generation
OpenMMLab 3D Human Parametric Model Toolbox and Benchmark
FineDance: A Fine-grained Choreography Dataset for 3D Full Body Dance Generation. (ICCV2023)
SkyReels-V2: Infinite-length Film Generative model
[ICML 2025] VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models
State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!
🎨 IMAGGarment-1: Fine-Grained Garment Generation with Controllable Structure, Color, and Logo. It supports precise and customizable garment synthesis guided by multi-conditions (e.g., sketch, colo…
An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & RFT & Dynamic Sampling & Async Agent RL)
UniAnimate-DiT: Human Image Animation with Large-Scale Video Diffusion Transformer
🚀 [Large Model] Train a 26M-parameter vision-language model (VLM) from scratch in just 1 hour!
FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis
Official inference framework for 1-bit LLMs