Stars
Awesome Knowledge Distillation
Awesome curated collection of images and prompts generated by GPT-4o and gpt-image-1. Explore AI generated visuals created with ChatGPT and Sora, showcasing OpenAI’s advanced image generation capab…
HunyuanVideo-I2V: A Customizable Image-to-Video Model based on HunyuanVideo
Wan: Open and Advanced Large-Scale Video Generative Models
Translate PDFs, EPUBs, webpages, metadata, annotations, and notes into the target language. Supports 20+ translation services.
[ICLR 2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
[ICLR 2025] Rectified Diffusion: Straightness Is Not Your Need
[CVPR 2023] DepGraph: Towards Any Structural Pruning; LLMs, Vision Foundation Models, etc.
A latent text-to-image diffusion model
Official repository of In-Context LoRA for Diffusion Transformers
The official repo for [IJCV'23] "Rethinking Portrait Matting with Privacy Preserving"
Official Implementation and Dataset of "PPR10K: A Large-Scale Portrait Photo Retouching Dataset with Human-Region Mask and Group-Level Consistency", CVPR 2021
Official implementation of Magic Clothing: Controllable Garment-Driven Image Synthesis
Official Code for Stable Cascade
StoryMaker: Towards consistent characters in text-to-image generation
(SIGGRAPH Asia 2024) Official PyTorch implementation of the paper "DrawingSpinUp: 3D Animation from Single Character Drawings"
InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation 🔥
CSGO: Content-Style Composition in Text-to-Image Generation 🔥
[AAAI 2025]👔IMAGDressing👔: Interactive Modular Apparel Generation for Virtual Dressing. It enables customizable human image generation with flexible garment, pose, and scene control, ensuring high …
The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images from an image prompt.
OneTrainer is a one-stop solution for all your stable diffusion training needs.
[ICLR 2025] CatVTON is a simple and efficient virtual try-on diffusion model with 1) Lightweight Network (899.06M parameters in total), 2) Parameter-Efficient Training (49.57M trainable parameters) …
Official implementation of the paper "AnyText: Multilingual Visual Text Generation And Editing"
Using low-rank adaptation (LoRA) to quickly fine-tune diffusion models.
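A curated collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and trained at low cost, including base models, domain-specific fine-tunes and applications, datasets, and tutorials.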
Open source implementation of "A Self-Supervised Descriptor for Image Copy Detection" (SSCD).
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"