
TriDF: Evaluating Perception, Detection, and Hallucination for Interpretable DeepFake Detection

11 December 2025

Jian-Yu Jiang-Lin, Kang-Yang Huang, Ling Zou, Ling Lo, Sheng-Ping Yang, Yu-Wen Tseng, Kun-Hsiang Lin, Chia-Ling Chen, Yu-Ting Ta, Yan-Tsung Wang, Po-Ching Chen, Hongxia Xie, Hong-Han Shuai, Wen-Huang Cheng



Main: 8 Pages

15 Figures

Bibliography: 5 Pages

9 Tables

Appendix: 13 Pages

Abstract

Advances in generative modeling have made it increasingly easy to fabricate realistic portrayals of individuals, creating serious risks for security, communication, and public trust. Detecting such person-driven manipulations requires systems that not only distinguish altered content from authentic media but also provide clear and reliable reasoning. In this paper, we introduce TriDF, a comprehensive benchmark for interpretable DeepFake detection. TriDF contains high-quality forgeries from advanced synthesis models, covering 16 DeepFake types across image, video, and audio modalities. The benchmark evaluates three key aspects: Perception, which measures the ability of a model to identify fine-grained manipulation artifacts using human-annotated evidence; Detection, which assesses classification performance across diverse forgery families and generators; and Hallucination, which quantifies the reliability of model-generated explanations. Experiments on state-of-the-art multimodal large language models show that accurate perception is essential for reliable detection, but hallucination can severely disrupt decision-making, revealing the interdependence of these three aspects. TriDF provides a unified framework for understanding the interaction between detection accuracy, evidence identification, and explanation reliability, offering a foundation for building trustworthy systems that address real-world synthetic media threats.
