Stars
Updated AssetStudio, supports GI 5.5+, HSR 3.2+, ZZZ 1.6+, with improvements and new features (*゚∀゚*)
aelurum / AssetStudio
Forked from Perfare/AssetStudio
AssetStudioMod - modified version of Perfare's AssetStudio, mainly focused on UI optimization and some functionality enhancements.
MoviiGen 1.1: Towards Cinematic-Quality Video Generative Models
(CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
Ultralytics YOLO with Additional Knowledge Distillation Capability
Image editing is worth a single LoRA! 0.1% training data for fantastic image editing! Training released! Surpasses GPT-4o in ID persistence! Official ComfyUI workflow release! Only 4GB VRAM is enou…
UniAnimate-DiT: Human Image Animation with Large-Scale Video Diffusion Transformer
A TTS model capable of generating ultra-realistic dialogue in one pass.
A set of nodes to edit videos using the Hunyuan Video model
OmniSVG is the first family of end-to-end multimodal SVG generators that leverage pre-trained Vision-Language Models (VLMs), capable of generating complex and detailed SVGs, from simple icons to in…
🔥🔥 UNO: A Universal Customization Method for Both Single and Multi-Subject Conditioning
[SIGGRAPH 2025] LAM: Large Avatar Model for One-shot Animatable Gaussian Head
SkyReels-A2: Compose anything in video diffusion transformers
Lumina-mGPT 2.0: Stand-alone Autoregressive Image Modeling
TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models
Implementation of "EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer"
[ICLR 2025] CatVTON is a simple and efficient virtual try-on diffusion model with 1) Lightweight Network (899.06M parameters in total), 2) Parameter-Efficient Training (49.57M trainable parameters) …
Real-time face swap and one-click video deepfake with only a single image
Official implementation of "Sonic: Shifting Focus to Global Audio Perception in Portrait Animation"
One minute of voice data is enough to train a good TTS model! (few-shot voice cloning)