This project aims to replicate the Stable Diffusion architecture along with a few of the improvements that stabilize the generation process. Summary of what is implemented in this repository:
- Stable Diffusion Architecture
- VAE (Variational Auto Encoder)
- U-Net
- Text Encoder from CLIP ViT-L/14 for SD 1.5, OpenCLIP ViT-H for SD 2.1
- DDPM (Denoising Diffusion Probabilistic Models)
- DDIM (Denoising Diffusion Implicit Models)
- V-Prediction
- One-Step Diffusion using SwiftBrush
- Cosine-based beta scheduler
- CFG (Classifier-Free Guidance)
- LoRA (Low Rank Adaptation)
- Dreambooth
- EMA (Exponential Moving Average)
- Gradient Checkpointing
- Gradient Accumulation
- Flash Attention
- Demo App using Gradio
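
For orientation, here is a minimal sketch of how the listed components typically fit together at inference time in a latent-diffusion pipeline (text encoder → U-Net denoising loop → VAE decoder). The class and method names below are illustrative placeholders, not the actual module names in this repository:

```python
import torch

@torch.no_grad()
def text_to_image(prompt_tokens, text_encoder, unet, vae_decoder, sampler,
                  num_steps=50, latent_shape=(1, 4, 64, 64), device="cuda"):
    """Illustrative latent-diffusion loop; all component interfaces are placeholders."""
    # 1. The CLIP text encoder turns the tokenized prompt into a conditioning context.
    context = text_encoder(prompt_tokens.to(device))

    # 2. Sampling starts from Gaussian noise in the VAE latent space, not pixel space.
    latents = torch.randn(latent_shape, device=device)

    # 3. The U-Net iteratively denoises the latents, following the sampler's
    #    timestep schedule (DDPM or DDIM in this repository).
    for t in sampler.timesteps(num_steps):
        noise_pred = unet(latents, t, context)
        latents = sampler.step(noise_pred, t, latents)

    # 4. The VAE decoder maps the final latents back to an image.
    return vae_decoder(latents)
```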
- Create virtual environment from `env.yaml`:
  `conda env create --prefix ./.env --file env.yaml`
- Activate virtual environment:
  `conda activate ./.env`
- Generate image
- 3.1. Using Gradio app
  Launch the app: `python3 app.py`
  Setting options:
  - CFG Scale: Set the classifier-free guidance scale (larger values push generation toward the conditional prompt, smaller values stay closer to the unconditional prediction)
  - Strength: Set how strongly an input image is altered in image-to-image generation (smaller values produce an image closer to the original)
  - Generation Steps: Number of denoising steps used to generate the image
  - Sampling method: Two options are available, DDPM and DDIM
  - Use cosine-based beta scheduler: Use a cosine function to generate the beta values used for adding and removing noise from the image (see the sketch after this list for the cosine schedule and the CFG combination)
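
The sketch below shows one common formulation of the cosine-based beta schedule (following Nichol & Dhariwal, 2021) and of the classifier-free guidance combination, plus how the strength parameter typically maps to a starting step for image-to-image generation. Function names are illustrative and may not match this repository's code:

```python
import torch

def cosine_beta_schedule(num_train_steps: int = 1000, s: float = 0.008) -> torch.Tensor:
    # Cosine schedule (Nichol & Dhariwal, 2021): betas are derived from a
    # cosine-shaped cumulative-alpha curve rather than a linear ramp.
    t = torch.linspace(0, num_train_steps, num_train_steps + 1)
    alphas_cumprod = torch.cos((t / num_train_steps + s) / (1 + s) * torch.pi / 2) ** 2
    alphas_cumprod = alphas_cumprod / alphas_cumprod[0]
    betas = 1.0 - alphas_cumprod[1:] / alphas_cumprod[:-1]
    return betas.clamp(0.0, 0.999)

def apply_cfg(noise_uncond: torch.Tensor, noise_cond: torch.Tensor, cfg_scale: float) -> torch.Tensor:
    # Classifier-free guidance: larger cfg_scale pushes the prediction toward the
    # conditional prompt; smaller values stay closer to the unconditional prediction.
    return noise_uncond + cfg_scale * (noise_cond - noise_uncond)

def img2img_start_step(strength: float, num_inference_steps: int) -> int:
    # One common convention: lower strength skips more of the denoising trajectory,
    # so the output stays closer to the user-provided input image.
    return num_inference_steps - int(strength * num_inference_steps)
```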
- 3.2. Using command line
  `python3 inference.py -h` shows the available options:
  - `-h, --help`: Show this help message and exit
  - `--model_path`: Model path
  - `--tokenizer_dir`: Tokenizer directory
  - `--device`: Device to run on
  - `--img_size`: Image size
  - `--img_path`: Image path
  - `--prompt`: Input prompt
  - `--uncond_prompt`: Unconditional prompt
  - `--n_samples`: Number of generated images
  - `--lora_ckpt`: Option to use a LoRA checkpoint
  - `--do_cfg, --no-do_cfg`: Activate CFG
  - `--cfg_scale`: Classifier-free guidance scale (larger values push toward the conditional prompt, smaller values toward the unconditional prompt)
  - `--strength`: Strength for image-to-image generation (smaller values produce an image closer to the input image)
  - `--num_inference_step`: Number of generation steps, in [0-999]
  - `--sampler`: Sampling method, DDPM or DDIM
  - `--use_cosine_schedule, --no-use_cosine_schedule`: Use a cosine function to generate the beta values used for adding and removing noise
  - `--batch_size`: Number of images generated at the same time
  - `--seed`: Seed for reproducibility
  - `--one_step`: Use one-step generation
  - `--sd_version`: Stable Diffusion model version
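
  For example, a typical text-to-image invocation might look like the following (paths, prompt, and option values are illustrative):

  ```bash
  python3 inference.py \
      --model_path ./checkpoints/model.ckpt \
      --tokenizer_dir ./tokenizer \
      --device cuda \
      --prompt "a photograph of an astronaut riding a horse" \
      --sampler DDIM \
      --num_inference_step 50 \
      --do_cfg --cfg_scale 7.5 \
      --seed 42
  ```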
- Jonathan Ho et al. "Denoising Diffusion Probabilistic Models." arXiv preprint arXiv:2006.11239 (2020).
- Jiaming Song et al. "Denoising Diffusion Implicit Models." arXiv preprint arXiv:2010.02502 (2020).
- Alex Nichol & Prafulla Dhariwal. "Improved Denoising Diffusion Probabilistic Models." arXiv preprint arXiv:2102.09672 (2021).
- Robin Rombach et al. "High-Resolution Image Synthesis with Latent Diffusion Models." arXiv preprint arXiv:2112.10752 (2021).
- Jonathan Ho & Tim Salimans. "Classifier-Free Diffusion Guidance." NeurIPS 2021 Workshop on Deep Generative Models and Downstream Applications (2022).
- Tim Salimans & Jonathan Ho. "Progressive Distillation for Fast Sampling of Diffusion Models." (2022).
- Thuan Hoang Nguyen & Anh Tran. "SwiftBrush: One-Step Text-to-Image Diffusion Model with Variational Score Distillation." arXiv preprint arXiv:2312.05239 (2024).