The rapid advancement of generative image technology has introduced significant security concerns, particularly around the detection of AI-generated faces. This paper investigates the vulnerabilities of current AI-generated face detection systems. Our study reveals that while existing detection methods often achieve high accuracy under standard conditions, they exhibit limited robustness against adversarial attacks. To address this weakness, we propose an approach that integrates adversarial training to mitigate the impact of adversarial examples, and we use diffusion inversion and reconstruction to further enhance detection robustness. Experimental results demonstrate that minor adversarial perturbations can easily bypass existing detection systems, whereas our method significantly improves their robustness. Additionally, we provide an in-depth analysis of adversarial and benign examples, offering insights into the intrinsic characteristics of AI-generated content. All associated code will be made publicly available in a dedicated repository to facilitate further research and verification.
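To illustrate how a small perturbation can flip a detector's decision, the following is a minimal sketch using the fast gradient sign method (FGSM) against a toy logistic "detector". The abstract does not name the attack or the detector architecture, so FGSM, the linear model, and all names and values here (`detector_score`, `fgsm_perturb`, the weights `w`, and `eps`) are assumptions chosen for illustration only, not the authors' method.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def detector_score(x, w, b):
    """Probability that x is AI-generated under a toy linear detector (hypothetical)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def fgsm_perturb(x, y, w, b, eps):
    """One FGSM step: shift every feature by eps in the direction that raises the loss."""
    p = detector_score(x, w, b)
    # For a logistic model with binary cross-entropy loss, dLoss/dx = (p - y) * w.
    grad = [(p - y) * wi for wi in w]
    signs = [(g > 0) - (g < 0) for g in grad]
    return [xi + eps * s for xi, s in zip(x, signs)]


# Illustrative values only: three "features" standing in for image pixels.
w = [2.0, -1.0, 0.5]
b = 0.0
x = [0.3, 0.1, 0.2]
y = 1  # ground truth: the sample is AI-generated

clean = detector_score(x, w, b)
x_adv = fgsm_perturb(x, y, w, b, eps=0.2)
adv = detector_score(x_adv, w, b)

print(f"clean score: {clean:.3f}, adversarial score: {adv:.3f}")
```

In this toy setting the clean sample scores above the 0.5 decision threshold and the perturbed sample scores below it, even though no feature moves by more than `eps`. Adversarial training, in its generic form, would add samples such as `x_adv` with their true label `y` back into the training set so that the detector learns to resist them.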
@article{haoxuan2025_2505.03435,
  title={Robustness in AI-Generated Detection: Enhancing Resistance to Adversarial Attacks},
  author={Sun Haoxuan and Hong Yan and Zhan Jiahui and Chen Haoxing and Lan Jun and Zhu Huijia and Wang Weiqiang and Zhang Liqing and Zhang Jianfu},
  journal={arXiv preprint arXiv:2505.03435},
  year={2025}
}