FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
23 May 2023
Sewon Min, Kalpesh Krishna, Xinxi Lyu, M. Lewis, Anuj Kumar, Pang Wei Koh, Mohit Iyyer, Luke Zettlemoyer, Hannaneh Hajishirzi
HILM, ALM
arXiv:2305.14251 (abs · PDF · HTML) · HuggingFace (2 upvotes)

Papers citing "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"

50 / 615 papers shown
AlignCheck: a Semantic Open-Domain Metric for Factual Consistency Assessment
Ahmad Aghaebrahimian
HILM
03 Dec 2025
Towards Unification of Hallucination Detection and Fact Verification for Large Language Models
Weihang Su, Jianming Long, Changyue Wang, Shiyu Lin, Jingyan Xu, Ziyi Ye, Qingyao Ai, Yiqun Liu
HILM
02 Dec 2025
Detecting AI Hallucinations in Finance: An Information-Theoretic Method Cuts Hallucination Rate by 92%
Mainak Singha
HILM
02 Dec 2025
BHRAM-IL: A Benchmark for Hallucination Recognition and Assessment in Multiple Indian Languages
Hrishikesh Terdalkar, Kirtan Bhojani, Aryan Dongare, Omm Aditya Behera
HILM, VLM
01 Dec 2025
TrackList: Tracing Back Query Linguistic Diversity for Head and Tail Knowledge in Open Large Language Models
Ioana Buhnila, Aman Sinha, Mathieu Constant
26 Nov 2025
MUCH: A Multilingual Claim Hallucination Benchmark
Jérémie Dentan, Alexi Canesse, Davide Buscaldi, A. Shabou, Sonia Vanier
HILM
21 Nov 2025
Beyond Component Strength: Synergistic Integration and Adaptive Calibration in Multi-Agent RAG Systems
Jithin Krishnan
21 Nov 2025
The Oracle and The Prism: A Decoupled and Efficient Framework for Generative Recommendation Explanation
Jiaheng Zhang, Daqiang Zhang
20 Nov 2025
ConInstruct: Evaluating Large Language Models on Conflict Detection and Resolution in Instructions
Xingwei He, Qianru Zhang, Pengfei Chen, Guanhua Chen, Linlin Yu, Yuan Yuan, Siu-Ming Yiu
18 Nov 2025
AA-Omniscience: Evaluating Cross-Domain Knowledge Reliability in Large Language Models
Declan Jackson, William Keating, George Cameron, Micah Hill-Smith
HILM, RALM, ELM
17 Nov 2025
Assessing Automated Fact-Checking for Medical LLM Responses with Knowledge Graphs
Shasha Zhou, Mingyu Huang, Jack Cole, Charles Britton, Ming Yin, Jan Wolber, Ke Li
16 Nov 2025
QA-Noun: Representing Nominal Semantics via Natural Language Question-Answer Pairs
Maria Tseytlin, Paul Roit, Omri Abend, Ido Dagan, Ayal Klein
16 Nov 2025
Consistency Is the Key: Detecting Hallucinations in LLM Generated Text By Checking Inconsistencies About Key Facts
Raavi Gupta, Pranav Hari Panicker, S. Bhatia, Ganesh Ramakrishnan
HILM
15 Nov 2025
Rethinking Retrieval-Augmented Generation for Medicine: A Large-Scale, Systematic Expert Evaluation and Practical Insights
Hyunjae Kim, Jiwoong Sohn, Aidan Gilson, Nicholas Cochran-Caggiano, Serina S Applebaum, ..., James Zou, Andrew Taylor, Arman Cohan, Hua Xu, Qingyu Chen
RALM, LM&MA
10 Nov 2025
Evaluation of retrieval-based QA on QUEST-LOFT
Nathan Scales, Nathanael Scharli, Olivier Bousquet
RALM
08 Nov 2025
TSVer: A Benchmark for Fact Verification Against Time-Series Evidence
Marek Strong, Andreas Vlachos
AI4TS
02 Nov 2025
VISTA Score: Verification In Sequential Turn-based Assessment
A. Lewis, Andrew Perrault, Eric Fosler-Lussier, Michael White
HILM
30 Oct 2025
RCScore: Quantifying Response Consistency in Large Language Models
Dongjun Jang, Youngchae Ahn, Hyopil Shin
30 Oct 2025
CLINB: A Climate Intelligence Benchmark for Foundational Models
Michelle Chen Huebscher, Katharine Mach, Aleksandar Stanić, Markus Leippold, Ben Gaiarin, ..., Massimiliano Ciaramita, Joeri Rogelj, Christian Buck, Lierni Sestorain Saralegui, Reto Knutti
HILM, ELM
29 Oct 2025
Evidence-Bound Autonomous Research (EviBound): A Governance Framework for Eliminating False Claims
Ruiying Chen
28 Oct 2025
MAD-Fact: A Multi-Agent Debate Framework for Long-Form Factuality Evaluation in LLMs
Yucheng Ning, Xixun Lin, Fang Fang, Yanan Cao
HILM
27 Oct 2025
SafetyPairs: Isolating Safety Critical Image Features with Counterfactual Image Generation
Alec Helbling, Shruti Palaskar, Kundan Krishna, Polo Chau, Leon A Gatys, Joseph Y Cheng
EGVM
24 Oct 2025
JointCQ: Improving Factual Hallucination Detection with Joint Claim and Query Generation
F. Xu, Huixuan Zhang, Zhenliang Zhang, Jiahao Wang, Xiaojun Wan
HILM
22 Oct 2025
Fine-Tuned Thoughts: Leveraging Chain-of-Thought Reasoning for Industrial Asset Health Monitoring
Shuxin Lin, Dhaval Patel, Christodoulos Constantinides
LRM
21 Oct 2025
Train for Truth, Keep the Skills: Binary Retrieval-Augmented Reward Mitigates Hallucinations
Tong Chen, Akari Asai, Luke Zettlemoyer, Hannaneh Hajishirzi, Faeze Brahman
OffRL, HILM, LRM
20 Oct 2025
ESI: Epistemic Uncertainty Quantification via Semantic-preserving Intervention for Large Language Models
Mingda Li, Xinyu Li, Weinan Zhang, Longxuan Ma
15 Oct 2025
The Curious Case of Factual (Mis)Alignment between LLMs' Short- and Long-Form Answers
Saad Obaid ul Islam, Anne Lauscher, Goran Glavaš
HILM
13 Oct 2025
FaStfact: Faster, Stronger Long-Form Factuality Evaluations in LLMs
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025
Yingjia Wan, Haochen Tan, Xiao Zhu, Xinyu Zhou, Z. Li, ..., Jiaqi Zeng, Yi Xu, Jianqiao Lu, Yinhong Liu, Zhijiang Guo
HILM, OffRL
13 Oct 2025
Inflated Excellence or True Performance? Rethinking Medical Diagnostic Benchmarks with Dynamic Evaluation
Xiangxu Zhang, Lei Li, Yanyun Zhou, Xiao Zhou, Y. Zhang, Xian Wu
LM&MA, ELM
10 Oct 2025
Large Language Models Do NOT Really Know What They Don't Know
C. Cheang, Hou Pong Chan, Wenxuan Zhang, Yang Deng
HILM
10 Oct 2025
Automated Refinement of Essay Scoring Rubrics for Language Models via Reflect-and-Revise
Keno Harada, Lui Yoshida, Takeshi Kojima, Yusuke Iwasawa, Yutaka Matsuo
10 Oct 2025
Comprehensiveness Metrics for Automatic Evaluation of Factual Recall in Text Generation
Adam Dejl, James Barry, Alessandra Pascale, Javier Carnerero-Cano
HILM, ELM
09 Oct 2025
PrismGS: Physically-Grounded Anti-Aliasing for High-Fidelity Large-Scale 3D Gaussian Splatting
Houqiang Zhong, Zhenglong Wu, Sihua Fu, Zihan Zheng, Xin Jin, X. Zhang, Li Song, Q. Hu
3DGS
09 Oct 2025
LeMAJ (Legal LLM-as-a-Judge): Bridging Legal Reasoning and LLM Evaluation
Joseph Enguehard, Morgane Van Ermengem, Kate Atkinson, Sujeong Cha, Arijit Ghosh Chowdhury, ..., Jeremy Roghair, Hannah R Marlowe, Carina Suzana Negreanu, Kitty Boxall, Diana Mincu
AILaw, ELM
08 Oct 2025
Distributional Semantics Tracing: A Framework for Explaining Hallucinations in Large Language Models
Gagan Bhatia, Somayajulu G Sripada, Kevin Allan, Jacobo Azcona
HILM, LRM
07 Oct 2025
When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection with PsiloQA
Elisei Rykov, Kseniia Petrushina, Maksim Savkin, Valerii Olisov, Artem Vazhentsev, Kseniia Titova, Ilseyar Alimova, Vasily Konovalov, Julia Belikova
HILM
06 Oct 2025
The Geometry of Truth: Layer-wise Semantic Dynamics for Hallucination Detection in Large Language Models
Amir Hameed Mir
HILM
06 Oct 2025
Sample, Align, Synthesize: Graph-Based Response Synthesis with ConGrs
Sayan Ghosh, Shahzaib Saqib Warraich, Dhruv Tarsadiya, Gregory Yauney, Swabha Swayamdipta
03 Oct 2025
Reward Models are Metrics in a Trench Coat
Sebastian Gehrmann
03 Oct 2025
Knowledge-Graph Based RAG System Evaluation Framework
Sicheng Dong, Vahid Zolfaghari, Nenad Petrovic, Alois C. Knoll
02 Oct 2025
TraceDet: Hallucination Detection from the Decoding Trace of Diffusion Large Language Models
Shenxu Chang, Junchi Yu, Weixing Wang, Yongqiang Chen, Jialin Yu, Philip Torr, Jindong Gu
HILM
30 Sep 2025
KnowGuard: Knowledge-Driven Abstention for Multi-Round Clinical Reasoning
Xilin Dang, Kexin Chen, Xiaorui Su, Ayush Noori, Inaki Arango, Lucas Vittor, Xinyi Long, Yuyang Du, Marinka Zitnik, Pheng-Ann Heng
29 Sep 2025
Knowledge-Level Consistency Reinforcement Learning: Dual-Fact Alignment for Long-Form Factuality
Junliang Li, Yucheng Wang, Yan Chen, Yu Ran, Ruiqing Zhang, Jing Liu, H. Wu, Haifeng Wang
OffRL, HILM
28 Sep 2025
EduVidQA: Generating and Evaluating Long-form Answers to Student Questions based on Lecture Videos
Sourjyadip Ray, Shubham Sharma, Somak Aditya, Pawan Goyal
AI4Ed
28 Sep 2025
Detecting Corpus-Level Knowledge Inconsistencies in Wikipedia with Large Language Models
Sina J. Semnani, Jirayu Burapacheep, Arpandeep Khatua, Thanawan Atchariyachanvanit, Zheng Wang, M. Lam
KELM
27 Sep 2025
Fine-Grained Detection of Context-Grounded Hallucinations Using LLMs
Yehonatan Peisakhovsky, Zorik Gekhman, Y. Mass, Liat Ein-Dor, Roi Reichart
HILM
26 Sep 2025
Comparative Personalization for Multi-document Summarization
Haoyuan Li, Snigdha Chaturvedi
25 Sep 2025
Concise and Sufficient Sub-Sentence Citations for Retrieval-Augmented Generation
Guo Chen, Qiuyuan Li, Qiuxian Li, Hongliang Dai, Xiang Chen, Piji Li
3DV, HILM
25 Sep 2025
Memory in Large Language Models: Mechanisms, Evaluation and Evolution
D. Zhang, Wendong Li, Kani Song, Jiaye Lu, Gang Li, Liuchun Yang, Sheng Li
KELM
23 Sep 2025
LLM-based Agents Suffer from Hallucinations: A Survey of Taxonomy, Methods, and Directions
Xixun Lin, Yucheng Ning, Jingwen Zhang, Yan Dong, Y. Liu, ..., Bin Wang, Yanan Cao, Kai-xiang Chen, Songlin Hu, Li Guo
LLMAG, LRM
23 Sep 2025