Stars
[CVPR 2025] EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation
[ICLR 2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
Command-line program to download image galleries and collections from several image hosting sites
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
harisreedhar / DeOldify
Forked from jantic/DeOldify
A Deep Learning based project for colorizing and restoring old images (and video!)
Scrapes Instagram across multiple profiles and creates a folder for each person whose face is recognized, so the images can be used for training models
Civilization 6 mod - UI enhancements, reduce clicks and manage your empire faster!
Civilization VI Detailed Map Tacks mod.
Bring portraits to life in real time! ONNX/TensorRT support! Real-time portrait animation!
[AAAI 2025] EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
Implementation of GigaGAN, new SOTA GAN out of Adobe. Culmination of nearly a decade of research into GANs
This is the official implementation of "Blind Image Restoration via Fast Diffusion Inversion"
[ICLR 2025] Official implementation of MotionClone: Training-Free Motion Cloning for Controllable Video Generation
[ECCV 2024] MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model.
[CSUR] A Survey on Video Diffusion Models
[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
LayerDiffuse in pure diffusers without any GUI
Code for SCIS-2025 Paper "UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation".
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
ONNX-Powered Inference for State-of-the-Art Face Upscalers
This is the code for Deformable Neural Radiance Fields, a.k.a. Nerfies.
Zero-Shot Audio-Visual Compound Expression Recognition Method based on Emotion Probability Fusion
The Data and Code of Prompt2Sign: A Comprehensive Multilingual Sign Language Dataset.