Imperceptible but Forgeable: Practical Invisible Watermark Forgery via Diffusion Models

28 March 2025

Abstract

Invisible watermarking is critical for content provenance and accountability in Generative AI. Although commercial companies have increasingly committed to using watermarks, the robustness of existing watermarking schemes against forgery attacks is understudied. This paper proposes DiffForge, the first watermark forgery framework capable of forging imperceptible watermarks under a no-box setting. We estimate the watermark distribution using an unconditional diffusion model and introduce shallow inversion to inject the watermark into a non-watermarked image seamlessly. This approach facilitates watermark injection while preserving image quality by adaptively selecting the depth of inversion steps, leveraging our key insight that watermarks degrade with added noise during the early diffusion phases. Comprehensive evaluations show that DiffForge deceives open-source watermark detectors with a 96.38% success rate and misleads a commercial watermark system with over 97% success rate, achieving high confidence. This work reveals fundamental security limitations in current watermarking paradigms.
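The insight that watermarks degrade with added noise in the early diffusion phases can be illustrated with the standard DDPM forward process, x_t = sqrt(abar_t) x_0 + sqrt(1 - abar_t) eps. The sketch below is not the authors' implementation: the linear beta schedule, the step count, the Gaussian stand-in for a watermark residual (`delta`), and all function names are assumptions chosen for illustration. It only shows how the signal-to-noise ratio of a small fixed perturbation falls as the noising depth t grows, which is one way to see why a shallow depth might already suffice to wash out a faint watermark.

```python
import numpy as np

def alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal retention abar_t for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def forward_noise(x0, t, abar, rng):
    """Sample x_t from q(x_t | x_0) = N(sqrt(abar_t) * x0, (1 - abar_t) * I)."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(abar[t]) * x0 + np.sqrt(1.0 - abar[t]) * eps

def perturbation_snr(delta, t, abar):
    """Power of the scaled perturbation divided by the injected noise power at step t."""
    signal_power = abar[t] * np.mean(delta ** 2)
    noise_power = 1.0 - abar[t]
    return signal_power / noise_power

rng = np.random.default_rng(0)
abar = alpha_bar()

# Hypothetical stand-in for an imperceptible watermark residual (not a real watermark).
delta = 0.02 * rng.standard_normal((64, 64))

depths = (10, 50, 100, 300)
snrs = [perturbation_snr(delta, t, abar) for t in depths]
for t, snr in zip(depths, snrs):
    print(f"t={t:4d}  perturbation SNR={snr:.4f}")
```

Because abar_t decreases monotonically with t, the printed SNR values shrink as the depth increases. How the paper actually selects the inversion depth is not specified in the abstract, so any selection rule built on this quantity would be a further assumption.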


@article{dong2025_2503.22330,
  title={Imperceptible but Forgeable: Practical Invisible Watermark Forgery via Diffusion Models},
  author={Ziping Dong and Chao Shuai and Zhongjie Ba and Peng Cheng and Zhan Qin and Qinglong Wang and Kui Ren},
  journal={arXiv preprint arXiv:2503.22330},
  year={2025}
}
