[2026.02] GAPL has been accepted to CVPR 2026!
[2025.12] Paper released on arXiv
Figure 1: Overview of our proposed Generator-Aware Prototype Learning (GAPL) framework.
As AI-generated image (AIGI) detection scales up, generated images become highly heterogeneous, causing previous AIGI detectors to fail to scale. We learn a small set of forgery concepts as Generator-Aware Prototypes, mapping the outputs of diverse generators onto this compact set of prototypes.
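The prototype-matching idea can be sketched as nearest-prototype assignment: an image feature is mapped to the forgery concept it is most similar to. The snippet below is an illustration of that idea only, not the repository's implementation.

```python
# Illustrative sketch (NOT the repo code): each prototype is a feature vector
# capturing one forgery concept; an image feature is assigned to the prototype
# with the highest cosine similarity.
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def assign_prototype(feature, prototypes):
    """Return the index of the prototype most similar to `feature`."""
    sims = [cosine(feature, p) for p in prototypes]
    return max(range(len(sims)), key=sims.__getitem__)

# Toy example: the feature is closer to the second prototype.
prototypes = [[1.0, 0.0], [0.0, 1.0]]
print(assign_prototype([0.1, 0.9], prototypes))  # -> 1
```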
We provide the minimum required packages in requirements.txt. You can check whether your environment satisfies them, or set up an environment with the following command:

```bash
pip install -r requirements.txt
```

To evaluate the performance of the proposed GAPL, you need to download the checkpoint from:
- Pretrained: Hugging Face
To reproduce the results reported in our paper across various benchmarks:
- Modify the dataset paths in `benchmarks.py` to point to your local data.
- Run the evaluation script:

```bash
bash scripts/val_bench.sh
```

You can also run inference on a single image to detect whether it is Real or Fake:
```bash
python inference.py \
    --model_path pretrained/checkpoint.pt \
    --image_path assets/test_image.jpg \
    --device cuda
```

Output Example:
```
[INFO] Loading model from pretrained/checkpoint.pt...
[RESULT] Image: assets/test_image.jpg
  -> Prediction: Fake (AI-Generated)
  -> Confidence: 99.8%
```
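For reference, a Real/Fake label with a confidence like the one printed above can be derived from a single raw detector logit. The sketch below assumes a sigmoid-probability-of-Fake convention; the actual `inference.py` may compute it differently.

```python
# Illustrative post-processing of a binary detector logit (assumed convention:
# sigmoid gives the probability of "Fake"; not necessarily what inference.py does).
import math

def to_prediction(logit, threshold=0.5):
    prob_fake = 1.0 / (1.0 + math.exp(-logit))
    label = "Fake (AI-Generated)" if prob_fake >= threshold else "Real"
    # Report confidence in the predicted class, whichever it is.
    confidence = prob_fake if prob_fake >= threshold else 1.0 - prob_fake
    return label, confidence

label, conf = to_prediction(6.4)
print(f"Prediction: {label}, Confidence: {conf:.1%}")
# -> Prediction: Fake (AI-Generated), Confidence: 99.8%
```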
Before starting, please ensure you have prepared the required datasets:
- Stage 1 Data:
- Stage 2 Data:
- Community Forensics (Small Training Set): Please download it from Hugging Face.
- Download Link: OwensLab/CommunityForensics-Small
In this stage, we train the backbone and learn the initial generator-aware prototypes.
Step 1: Configure Paths
Please open prototype_dataset.py and modify the dataset paths to match your local environment.
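The edit typically amounts to pointing a few root-path variables at your local data. The variable names below are hypothetical, for illustration only; check `prototype_dataset.py` for the actual names.

```python
# Hypothetical example of the kind of paths to update in prototype_dataset.py
# (variable names are assumptions, not the repo's actual identifiers).
STAGE1_DATA_ROOT = "/path/to/your/stage1_data"    # real/fake training images
PROTOTYPE_SAVE_DIR = "/path/to/save/prototypes"   # where extracted vectors go
```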
Step 2: Train Backbone

Run the following script to start training:

```bash
bash scripts/stage1.sh
```

Step 3: Extract Prototypes

After the backbone training converges, run the extraction script to generate the prototype vectors:

```bash
python prototype/dream_prototype.py
```
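Conceptually, a generator-aware prototype can be as simple as an aggregate of one generator's backbone features. The sketch below averages features per generator purely to illustrate the idea; see `prototype/dream_prototype.py` for the actual extraction procedure.

```python
# Minimal illustration (NOT the repo's method): build one prototype per
# generator by averaging that generator's feature vectors dimension-wise.
from collections import defaultdict

def extract_prototypes(features, generator_ids):
    """features: list of vectors; generator_ids: parallel list of labels."""
    buckets = defaultdict(list)
    for feat, gid in zip(features, generator_ids):
        buckets[gid].append(feat)
    return {
        gid: [sum(col) / len(feats) for col in zip(*feats)]  # per-dim mean
        for gid, feats in buckets.items()
    }

feats = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]]
gids = ["sdxl", "sdxl", "gan"]
print(extract_prototypes(feats, gids))
# -> {'sdxl': [2.0, 3.0], 'gan': [0.0, 0.0]}
```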
Fast Track: We provide the pre-trained Stage 1 checkpoint and pre-extracted prototype vectors. You can skip this stage by downloading them from pretrained.
In the second stage, we fine-tune the model using the Community Forensics dataset to enhance robustness against diverse generators.
Run Training:

```bash
bash scripts/stage2.sh
```
If you find our work useful in your research, please consider citing:
```bibtex
@article{qin2025Scaling,
  title={Scaling Up AI-Generated Image Detection with Generator-Aware Prototypes},
  author={Qin, Ziheng and Ji, Yuheng and Tao, Renshuai and Tian, Yuxuan and Liu, Yuyang and Wang, Yipu and Zheng, Xiaolong},
  journal={arXiv preprint arXiv:2512.12982},
  year={2025}
}
```

Our code builds on the following excellent open-source repositories. We appreciate their work and contributions to the community:
Community Forensics: We use the dataset and borrow some code from this codebase.
