Stars
Official inference repo for FLUX.1 models
Refine high-quality datasets and visual AI models
Quantized training of Stable Diffusion 3 Medium to significantly reduce memory usage.
OneTrainer is a one-stop solution for all your Stable Diffusion training needs.
SD3 DreamBooth LoRA training book, adapted from the diffusers doc
Minimal implementation of scalable rectified flow transformers, based on SD3's approach
Separate stems (vocals, bass, drums, other) from audio. Recombine, tempo match, slice/crop audio
A playbook for systematically maximizing the performance of deep learning models.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
[AAAI 2024] Follow-Your-Pose: This repo is the official implementation of "Follow-Your-Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos"
[TPAMI 2025] MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
Official PyTorch implementation of "TokenFlow: Consistent Diffusion Features for Consistent Video Editing" (ICLR 2024)
[ICCV 2023 Oral] Text-to-Image Diffusion Models are Zero-Shot Video Generators
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
[CVPR 2024] BIVDiff: A Training-free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models
Object detection, 3D detection, and pose estimation using center point detection
Code for the paper "Hyperbolic Image-Text Representations", Desai et al, ICML 2023
Web-based image segmentation tool for object detection, localization, and keypoints
Food detector using YOLOv3 and custom ResNet-50 written in MXNet/Python
Food analysis baseline with Theseus. Integrates object detection, image classification, and multi-class semantic segmentation
[NeurIPS 2024] Empirical Lessons Toward Memory-Efficient and Fast Diffusion Models for Text-to-Image Synthesis
QualityScaler - image/video AI upscaler app
Select one or more images from a batch
Create videos with Stable Diffusion by exploring the latent space and morphing between text prompts
ComfyUI-Manager is an extension designed to enhance the usability of ComfyUI. It offers management functions to install, remove, disable, and enable various custom nodes of ComfyUI. Furthermore, th…