FEVER: a large-scale dataset for Fact Extraction and VERification

North American Chapter of the Association for Computational Linguistics (NAACL), 2018

14 March 2018

James Thorne

Andreas Vlachos

Christos Christodoulopoulos

Arpit Mittal

HILM

Papers citing "FEVER: a large-scale dataset for Fact Extraction and VERification"

50 / 1,133 papers shown

AlignCheck: a Semantic Open-Domain Metric for Factual Consistency Assessment

Ahmad Aghaebrahimian

HILM

163

03 Dec 2025

Towards Unification of Hallucination Detection and Fact Verification for Large Language Models

119

02 Dec 2025

HealthContradict: Evaluating Biomedical Knowledge Conflicts in Language Models

147

02 Dec 2025

Trification: A Comprehensive Tree-based Strategy Planner and Structural Verification for Fact-Checking

29 Nov 2025

Can LLMs extract human-like fine-grained evidence for evidence-based fact-checking?

Antonín Jarolím

Martin Fajčík

Lucia Makaiová

136

26 Nov 2025

Large Language Models Require Curated Context for Reliable Political Fact-Checking -- Even with Reasoning and Web Search

228

24 Nov 2025

Learning to Compress: Unlocking the Potential of Large Language Models for Text Representation

194

21 Nov 2025

ConInstruct: Evaluating Large Language Models on Conflict Detection and Resolution in Instructions

217

18 Nov 2025

Consistency Is the Key: Detecting Hallucinations in LLM Generated Text By Checking Inconsistencies About Key Facts

142

15 Nov 2025

Llama-Embed-Nemotron-8B: A Universal Text Embedding Model for Multilingual and Cross-Lingual Tasks

Yauhen Babakhin

Radek Osmulski

Ronay Ak

Gabriel de Souza P. Moreira

135

10 Nov 2025

Wikipedia-based Datasets in Russian Information Retrieval Benchmark RusBEIR

Grigory Kovalev

Natalia Loukachevitch

M. Tikhomirov

Olga Babina

Pavel Mamaev

106

07 Nov 2025

Hybrid Fact-Checking that Integrates Knowledge Graphs, Large Language Models, and Search-Based Retrieval Agents Improves Interpretable Claim Verification

137

05 Nov 2025

TSVer: A Benchmark for Fact Verification Against Time-Series Evidence

Marek Strong

Andreas Vlachos

AI4TS

146

02 Nov 2025

RzenEmbed: Towards Comprehensive Multimodal Retrieval

133

31 Oct 2025

CausalGuard: A Smart System for Detecting and Preventing False Information in Large Language Models

Piyushkumar Patel

HILM LRM

100

30 Oct 2025

Layer of Truth: Probing Belief Shifts under Continual Pre-Training Poisoning

365

29 Oct 2025

HACK: Hallucinations Along Certainty and Knowledge Axes

190

28 Oct 2025

MERGE: Minimal Expression-Replacement GEneralization Test for Natural Language Inference

Mădălina Zgreabăn

Tejaswini Deoskar

Lasha Abzianidze

118

28 Oct 2025

ReCAP: Recursive Context-Aware Reasoning and Planning for Large Language Model Agents

27 Oct 2025

Multi-Modal Fact-Verification Framework for Reducing Hallucinations in Large Language Models

Piyushkumar Patel

HILM

176

26 Oct 2025

A Comprehensive Dataset for Human vs. AI Generated Text Detection

...

520

26 Oct 2025

A Benchmark for Open-Domain Numerical Fact-Checking Enhanced by Claim Decomposition

174

24 Oct 2025

Compress to Impress: Efficient LLM Adaptation Using a Single Gradient Step on 100 Samples

118

23 Oct 2025

Rethinking On-policy Optimization for Query Augmentation

183

20 Oct 2025

A Comprehensive Survey on Reinforcement Learning-based Agentic Search: Foundations, Roles, Optimizations, Evaluations, and Applications

564

19 Oct 2025

SAKE: Towards Editing Auditory Attribute Knowledge of Large Audio-Language Models

...

184

19 Oct 2025

Stable but Miscalibrated: A Kantian View on Overconfidence from Filters to Large Language Models

Akira Okutomi

LRM

208

16 Oct 2025

Retrofitting Small Multilingual Models for Retrieval: Matching 7B Performance with 300M Parameters

16 Oct 2025

Putting on the Thinking Hats: A Survey on Chain of Thought Fine-tuning from the Perspective of Human Reasoning Mechanism

226

15 Oct 2025

When Embedding Models Meet: Procrustes Bounds and Applications

Lucas Maystre

Alvaro Ortega Gonzalez

167

15 Oct 2025

The Role of Parametric Injection-A Systematic Study of Parametric Retrieval-Augmented Generation

101

14 Oct 2025

Probing Latent Knowledge Conflict for Faithful Retrieval-Augmented Generation

200

14 Oct 2025

LLM-Specific Utility: A New Perspective for Retrieval-Augmented Generation

145

13 Oct 2025

Attacks by Content: Automated Fact-checking is an AI Security Issue

Michael Schlichtkrull

AAML

116

13 Oct 2025

Discrepancy Detection at the Data Level: Toward Consistent Multilingual Question Answering

Lorena Calvo-Bartolomé

Valérie Aldana

Karla Cantarero

Alonso Madroñal de Mesa

Jerónimo Arenas-García

Jordan L. Boyd-Graber

HILM

190

13 Oct 2025

FactAppeal: Identifying Epistemic Factual Appeals in News Media

133

12 Oct 2025

You're Not Gonna Believe This: A Computational Analysis of Factual Appeals and Sourcing in Partisan News

Guy Mor-Lan

Tamir Sheafer

Shaul R. Shenhav

12 Oct 2025

ADMIT: Few-shot Knowledge Poisoning Attacks on RAG-based Fact Checking

154

11 Oct 2025

Learning on the Job: An Experience-Driven Self-Evolving Agent for Long-Horizon Tasks

...

157

09 Oct 2025

Text2Stories: Evaluating the Alignment Between Stakeholder Interviews and Generated User Stories

Francesco Dente

Fabiano Dalpiaz

Paolo Papotti

08 Oct 2025

GRACE: Generative Representation Learning via Contrastive Policy Optimization

06 Oct 2025

Equipping Retrieval-Augmented Large Language Models with Document Structure Awareness

133

05 Oct 2025

Contrastive Retrieval Heads Improve Attention-Based Re-Ranking

129

02 Oct 2025

F2LLM Technical Report: Matching SOTA Embedding Performance with 6 Million Open-Source Data

144

02 Oct 2025

Veri-R1: Toward Precise and Faithful Claim Verification via Online Reinforcement Learning

189

02 Oct 2025

Milco: Learned Sparse Retrieval Across Languages via a Multilingual Connector

127

01 Oct 2025

MuPlon: Multi-Path Causal Optimization for Claim Verification through Controlling Confounding

147

30 Sep 2025

MemGen: Weaving Generative Latent Memory for Self-Evolving Agents

389

29 Sep 2025

AceSearcher: Bootstrapping Reasoning and Search for LLMs via Reinforced Self-Play

148

29 Sep 2025

Detecting Corpus-Level Knowledge Inconsistencies in Wikipedia with Large Language Models

Sina J. Semnani

Jirayu Burapacheep

Arpandeep Khatua

Thanawan Atchariyachanvanit

Zheng Wang

M. Lam

KELM

129

27 Sep 2025