While fact verification remains fundamental, explanation generation is a critical enabler of trustworthy fact-checking systems: it produces interpretable rationales and supports more comprehensive verification. However, current benchmarks have critical limitations in three dimensions: (1) absence of explanatory annotations, (2) English-centric language bias, and (3) inadequate temporal relevance. To bridge these gaps, we present TrendFact, the first Chinese fact-checking benchmark incorporating structured natural language explanations. TrendFact comprises 7,643 carefully curated samples from trending social media content and professional fact-checking repositories, covering domains such as public health, political discourse, and economic claims. It supports various forms of reasoning, including numerical computation, logical reasoning, and commonsense verification. A rigorous multistage construction process ensures high data quality and makes the benchmark challenging. Furthermore, we propose the ECS to complement existing evaluation metrics. To establish effective baselines for TrendFact, we propose FactISR, a dual-component method integrating evidence triangulation and an iterative self-reflection mechanism. Experimental results demonstrate that current leading reasoning models (e.g., DeepSeek-R1, o1) have significant limitations on TrendFact, underscoring the real-world challenges it presents. FactISR significantly enhances reasoning model performance, offering new insights for explainable and complex fact-checking.
