DermDiT: Prompting Medical Vision-Language Models to Mitigate Diagnosis Bias by Generating Realistic Dermoscopic Images (ISBI 2025)


Acknowledgments

This work builds upon the DiT repository (https://github.com/facebookresearch/DiT) by Facebook Research. We extend their implementation for our DermDiT model. The setup and training workflow are also adapted from the original repository.

Setup

First, download and set up the repo:

git clone https://github.com/Munia03/DermDiT.git
cd DermDiT

We provide an environment.yml file that can be used to create a Conda environment.

conda env create -f environment.yml
conda activate DiT

Training DermDiT

We provide a training script for DiT in train_text_to_image.py. This script can be used to train a text-conditional DermDiT model.

To launch DiT-L/4 (256x256) training with N GPUs on one node:

torchrun --nnodes=1 --nproc_per_node=N train_text_to_image.py --model DiT-L/4 --data-path /path/to/imagenet/train
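
When the GPU count varies between machines, it can be convenient to assemble the launch command programmatically. The following is a minimal sketch, not part of this repository: the helper name `build_train_command` is hypothetical, and it only uses the flags shown in the command above.

```python
import shlex


def build_train_command(num_gpus, data_path, model="DiT-L/4"):
    """Assemble the single-node torchrun command shown above as an argument list.

    Hypothetical helper for illustration; it only reuses the flags from the
    README command and does not launch anything itself.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return [
        "torchrun",
        "--nnodes=1",
        f"--nproc_per_node={num_gpus}",
        "train_text_to_image.py",
        "--model", model,
        "--data-path", data_path,
    ]


if __name__ == "__main__":
    # Print a shell-ready command line for 4 GPUs on one node.
    print(shlex.join(build_train_command(4, "/path/to/imagenet/train")))
```

The returned list can be passed to `subprocess.run` from the repository root, or printed with `shlex.join` and pasted into a shell.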

Generating Dermoscopic images

We include a sample_text2img.py script which samples a large number of images from a DiT model in parallel. This script generates a folder of samples as well as a .npz file which can be directly used with ADM's TensorFlow evaluation suite to compute FID, Inception Score and other metrics. For example, to sample 50K images from a trained model over N GPUs, run:

torchrun --nnodes=1 --nproc_per_node=N sample_text2img.py --model DiT-L/4 --image-size 256 --num-fid-samples 50000 --ckpt /path/to/model.pt
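
Before handing the .npz file to the evaluation suite, it can help to sanity-check its contents. The sketch below assumes that sample_text2img.py follows the convention of the original DiT sampling script: a single uint8 array of shape (N, H, W, 3) stored under NumPy's default key `arr_0`. That layout is an assumption and is not confirmed by this README, and the function name `check_sample_npz` is hypothetical.

```python
import numpy as np


def check_sample_npz(path, expected_count=None, image_size=256):
    """Load a sample archive and verify the assumed layout.

    Assumption: images are stored under the key "arr_0" as a uint8 array
    of shape (N, image_size, image_size, 3).
    """
    with np.load(path) as data:
        if "arr_0" not in data.files:
            raise KeyError(f"'arr_0' not found; keys present: {data.files}")
        samples = data["arr_0"]

    if samples.ndim != 4 or samples.shape[1:] != (image_size, image_size, 3):
        raise ValueError(f"unexpected sample shape: {samples.shape}")
    if samples.dtype != np.uint8:
        raise ValueError(f"expected uint8 pixels, got {samples.dtype}")
    if expected_count is not None and samples.shape[0] != expected_count:
        raise ValueError(
            f"expected {expected_count} samples, found {samples.shape[0]}"
        )
    return samples
```

For the command above, a call such as `check_sample_npz("/path/to/samples.npz", expected_count=50000)` would confirm that all 50K images were written at 256x256; the path here is a placeholder.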

BibTeX

@inproceedings{munia2025prompting,
  title={Prompting Medical Vision-Language Models to Mitigate Diagnosis Bias by Generating Realistic Dermoscopic Images},
  author={Munia, Nusrat and Imran, Abdullah Al Zubaer},
  booktitle={2025 IEEE 22nd International Symposium on Biomedical Imaging (ISBI)},
  pages={1--4},
  year={2025},
  organization={IEEE}
}
