Zooming In on Fakes: A Novel Dataset for Localized AI-Generated Image Detection with Forgery Amplification Approach

The Broader Region Generated (BR-Gen) dataset was proposed in the arXiv preprint "Zooming In on Fakes: A Novel Dataset for Localized AI-Generated Image Detection with Forgery Amplification Approach".

Dataset (BR-Gen)

This dataset contains 150k localized generated images, forged with traditional inpainting methods (MAT, LaMa) and text-guided inpainting methods (SDXL, BrushNet, PowerPaint). We provide the Region Masks and the Localized Generated Images.

Visual Cases

Dataset specifications

We created the 150k localized generated images using various open-source models, with 2 types of masks and 5 inpainting methods. Not shown in the diagram: each real image corresponds to 2 masks and 10 localized generated images.

Generated types
# masks	2 (Stuff, Background)
# Inpainting Methods	5 (LaMa, MAT, SDXL, BrushNet, PowerPaint)
Total # generated images per real image	2 * 5 = 10

Dataset sizes	Training	Testing	Validation	Total
# real images	12,000	1,500	1,500	15,000
# localized generated images	120,000	15,000	15,000	150,000
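
The counts in the two tables above follow from simple multiplication. As a minimal sketch (the variable names are illustrative and not taken from the repository), the following enumerates the mask and method combinations and checks the per-split totals:

```python
from itertools import product

# Mask types and inpainting methods as listed in the tables above.
MASK_TYPES = ["Stuff", "Background"]
INPAINTING_METHODS = ["LaMa", "MAT", "SDXL", "BrushNet", "PowerPaint"]

# Real-image counts per split, copied from the dataset size table.
REAL_IMAGES = {"Training": 12_000, "Testing": 1_500, "Validation": 1_500}

# Every real image is paired with each (mask, method) combination.
combinations = list(product(MASK_TYPES, INPAINTING_METHODS))
per_real = len(combinations)

# Generated images per split, plus overall totals.
generated = {split: n * per_real for split, n in REAL_IMAGES.items()}
total_real = sum(REAL_IMAGES.values())
total_generated = sum(generated.values())

print(per_real, generated, total_real, total_generated)
```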

Note: during training and testing, to prevent the impact of category imbalance, we sample the generated images so that their number matches the number of real samples.
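
The README does not state how this sampling is done. One plausible reading is uniform random subsampling of the generated images down to the number of real images; the sketch below illustrates that assumption. The function name `balance_generated` and the `seed` parameter are hypothetical and do not come from the repository.

```python
import random


def balance_generated(real_paths, generated_paths, seed=0):
    """Subsample generated images so both classes have the same size.

    Assumption: uniform random sampling without replacement. If there are
    already no more generated images than real ones, all are kept.
    """
    generated_paths = list(generated_paths)
    if len(generated_paths) <= len(real_paths):
        return generated_paths
    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    return rng.sample(generated_paths, k=len(real_paths))


# Toy example: 4 real images, each with 10 generated counterparts.
real = [f"real_{i}.jpg" for i in range(4)]
generated = [f"gen_{i}.png" for i in range(40)]
balanced = balance_generated(real, generated)
print(len(real), len(balanced))
```

With 10 generated images per real image, this keeps roughly one in ten generated samples per pass; whether the subset is redrawn each epoch is not specified in the source.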

Download

The BR-Gen dataset can be downloaded from Google Drive or Baidu Netdisk (password: cclp). For details on the stuff and thing categories, see COCO_stuff. If you have any questions, please email lvpancai@stu.xmu.edu.cn.

Because of copyright considerations, the BR-Gen dataset provides only the Region Masks and Forged Images. The original images were collected from datasets such as COCO, ImageNet, and Places, as detailed in Section 3.1 (Real Image Collection) of the paper.

Dataset	Download URL
COCO2017_train	http://images.cocodataset.org/zips/train2017.zip
ImageNet	https://image-net.org/data/ILSVRC/2012/ILSVRC2012_img_train.tar
Places	Places2: A Large-Scale Database for Scene Understanding

However, we provide the file names of the real images used in the dataset. You can extract these real images from the original datasets according to the "RealImage/xxxxx/xxxxx_image_list.txt" files.
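
As a rough sketch of that extraction step, the helper below copies the listed images out of a downloaded source dataset. It assumes each image list contains one file name per line and that names can be matched by base name anywhere under the source directory; neither detail is confirmed by the README, and `extract_real_images` is a hypothetical name, not part of the repository.

```python
import shutil
from pathlib import Path


def extract_real_images(list_file, source_root, output_dir):
    """Copy the images named in list_file from source_root into output_dir.

    Assumes one file name per line (blank lines are ignored).
    Returns the sorted names that were listed but not found.
    """
    wanted = {
        Path(line.strip()).name
        for line in Path(list_file).read_text().splitlines()
        if line.strip()
    }
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    found = set()
    # Walk the source dataset recursively and copy the first match per name.
    for path in Path(source_root).rglob("*"):
        if path.is_file() and path.name in wanted and path.name not in found:
            shutil.copy2(path, output_dir / path.name)
            found.add(path.name)
    return sorted(wanted - found)
```

Pass the path of one image list, the root of the corresponding unpacked source dataset, and a destination directory. A non-empty return value means some listed files were not located under the given root.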

License

The BR-Gen dataset is released only for academic research. Researchers from educational institutes are allowed to use this database freely for noncommercial purposes.

Noise-guided Forgery Amplification Vision Transformer (NFA-ViT)

To address the BR-Gen challenge and improve localized AIGC detection, we introduce NFA-ViT, a noise-guided forgery amplification transformer. It uses a dual-branch architecture to diffuse forgery cues into real regions through modulated self-attention, which significantly improves the detectability of small or spatially subtle forgeries.

For working with the dataset and models, we recommend IMDLBenCo, which offers many methods. You can also use this codebase to load the data and test the model.

Installation

conda create -n nfa_vit python=3.9 -y
conda activate nfa_vit
pip install -r requirements.txt

Train

train.sh

Test

test.sh

Citation

If you find BR-Gen and NFA-ViT useful for your research and applications, please cite using this BibTeX:

@article{cai2025zooming,
  title={Zooming In on Fakes: A Novel Dataset for Localized AI-Generated Image Detection with Forgery Amplification Approach},
  author={Cai, Lvpan and Wang, Haowei and Ji, Jiayi and ZhouMen, YanShu and Ma, Yiwei and Sun, Xiaoshuai and Cao, Liujuan and Ji, Rongrong},
  journal={arXiv preprint arXiv:2504.11922},
  year={2025}
}

References & Acknowledgements

We sincerely thank IMDLBenCo for their exploration and support.
