FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation
arXiv 2305.14251 (v2, latest)
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 · 23 May 2023
Sewon Min, Kalpesh Krishna, Xinxi Lyu, M. Lewis, Anuj Kumar, Pang Wei Koh, Mohit Iyyer, Luke Zettlemoyer, Hannaneh Hajishirzi · HILM, ALM
Links: arXiv (abs) · PDF · HTML · HuggingFace

Papers citing "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"

Showing 50 of 615 citing papers.

AlignCheck: a Semantic Open-Domain Metric for Factual Consistency Assessment
Ahmad Aghaebrahimian · HILM · 03 Dec 2025

Towards Unification of Hallucination Detection and Fact Verification for Large Language Models
Weihang Su, Jianming Long, Changyue Wang, Shiyu Lin, Jingyan Xu, Ziyi Ye, Qingyao Ai, Yiqun Liu · HILM · 02 Dec 2025

Detecting AI Hallucinations in Finance: An Information-Theoretic Method Cuts Hallucination Rate by 92%
Mainak Singha · HILM · 02 Dec 2025

BHRAM-IL: A Benchmark for Hallucination Recognition and Assessment in Multiple Indian Languages
Hrishikesh Terdalkar, Kirtan Bhojani, Aryan Dongare, Omm Aditya Behera · HILM, VLM · 01 Dec 2025

TrackList: Tracing Back Query Linguistic Diversity for Head and Tail Knowledge in Open Large Language Models
Ioana Buhnila, Aman Sinha, Mathieu Constant · 26 Nov 2025

MUCH: A Multilingual Claim Hallucination Benchmark
Jérémie Dentan, Alexi Canesse, Davide Buscaldi, A. Shabou, Sonia Vanier · HILM · 21 Nov 2025

Beyond Component Strength: Synergistic Integration and Adaptive Calibration in Multi-Agent RAG Systems
Jithin Krishnan · 21 Nov 2025

The Oracle and The Prism: A Decoupled and Efficient Framework for Generative Recommendation Explanation
Jiaheng Zhang, Daqiang Zhang · 20 Nov 2025

ConInstruct: Evaluating Large Language Models on Conflict Detection and Resolution in Instructions
Xingwei He, Qianru Zhang, Pengfei Chen, Guanhua Chen, Linlin Yu, Yuan Yuan, Siu-Ming Yiu · 18 Nov 2025

AA-Omniscience: Evaluating Cross-Domain Knowledge Reliability in Large Language Models
Declan Jackson, William Keating, George Cameron, Micah Hill-Smith · HILM, RALM, ELM · 17 Nov 2025

Assessing Automated Fact-Checking for Medical LLM Responses with Knowledge Graphs
Shasha Zhou, Mingyu Huang, Jack Cole, Charles Britton, Ming Yin, Jan Wolber, Ke Li · 16 Nov 2025

QA-Noun: Representing Nominal Semantics via Natural Language Question-Answer Pairs
Maria Tseytlin, Paul Roit, Omri Abend, Ido Dagan, Ayal Klein · 16 Nov 2025

Consistency Is the Key: Detecting Hallucinations in LLM Generated Text By Checking Inconsistencies About Key Facts
Raavi Gupta, Pranav Hari Panicker, S. Bhatia, Ganesh Ramakrishnan · HILM · 15 Nov 2025

Rethinking Retrieval-Augmented Generation for Medicine: A Large-Scale, Systematic Expert Evaluation and Practical Insights
Hyunjae Kim, Jiwoong Sohn, Aidan Gilson, Nicholas Cochran-Caggiano, Serina S Applebaum, ..., James Zou, Andrew Taylor, Arman Cohan, Hua Xu, Qingyu Chen · RALM, LM&MA · 10 Nov 2025

Evaluation of retrieval-based QA on QUEST-LOFT
Nathan Scales, Nathanael Scharli, Olivier Bousquet · RALM · 08 Nov 2025

TSVer: A Benchmark for Fact Verification Against Time-Series Evidence
Marek Strong, Andreas Vlachos · AI4TS · 02 Nov 2025

VISTA Score: Verification In Sequential Turn-based Assessment
A. Lewis, Andrew Perrault, Eric Fosler-Lussier, Michael White · HILM · 30 Oct 2025

RCScore: Quantifying Response Consistency in Large Language Models
Dongjun Jang, Youngchae Ahn, Hyopil Shin · 30 Oct 2025

CLINB: A Climate Intelligence Benchmark for Foundational Models
Michelle Chen Huebscher, Katharine Mach, Aleksandar Stanić, Markus Leippold, Ben Gaiarin, ..., Massimiliano Ciaramita, Joeri Rogelj, Christian Buck, Lierni Sestorain Saralegui, Reto Knutti · HILM, ELM · 29 Oct 2025

Evidence-Bound Autonomous Research (EviBound): A Governance Framework for Eliminating False Claims
Ruiying Chen · 28 Oct 2025

MAD-Fact: A Multi-Agent Debate Framework for Long-Form Factuality Evaluation in LLMs
Yucheng Ning, Xixun Lin, Fang Fang, Yanan Cao · HILM · 27 Oct 2025

SafetyPairs: Isolating Safety Critical Image Features with Counterfactual Image Generation
Alec Helbling, Shruti Palaskar, Kundan Krishna, Polo Chau, Leon A Gatys, Joseph Y Cheng · EGVM · 24 Oct 2025

JointCQ: Improving Factual Hallucination Detection with Joint Claim and Query Generation
F. Xu, Huixuan Zhang, Zhenliang Zhang, Jiahao Wang, Xiaojun Wan · HILM · 22 Oct 2025

Fine-Tuned Thoughts: Leveraging Chain-of-Thought Reasoning for Industrial Asset Health Monitoring
Shuxin Lin, Dhaval Patel, Christodoulos Constantinides · LRM · 21 Oct 2025

Train for Truth, Keep the Skills: Binary Retrieval-Augmented Reward Mitigates Hallucinations
Tong Chen, Akari Asai, Luke Zettlemoyer, Hannaneh Hajishirzi, Faeze Brahman · OffRL, HILM, LRM · 20 Oct 2025

ESI: Epistemic Uncertainty Quantification via Semantic-preserving Intervention for Large Language Models
Mingda Li, Xinyu Li, Weinan Zhang, Longxuan Ma · 15 Oct 2025

The Curious Case of Factual (Mis)Alignment between LLMs' Short- and Long-Form Answers
Saad Obaid ul Islam, Anne Lauscher, Goran Glavaš · HILM · 13 Oct 2025

FaStfact: Faster, Stronger Long-Form Factuality Evaluations in LLMs (EMNLP 2025)
Yingjia Wan, Haochen Tan, Xiao Zhu, Xinyu Zhou, Z. Li, ..., Jiaqi Zeng, Yi Xu, Jianqiao Lu, Yinhong Liu, Zhijiang Guo · HILM, OffRL · 13 Oct 2025

Inflated Excellence or True Performance? Rethinking Medical Diagnostic Benchmarks with Dynamic Evaluation
Xiangxu Zhang, Lei Li, Yanyun Zhou, Xiao Zhou, Y. Zhang, Xian Wu · LM&MA, ELM · 10 Oct 2025

Large Language Models Do NOT Really Know What They Don't Know
C. Cheang, Hou Pong Chan, Wenxuan Zhang, Yang Deng · HILM · 10 Oct 2025

Automated Refinement of Essay Scoring Rubrics for Language Models via Reflect-and-Revise
Keno Harada, Lui Yoshida, Takeshi Kojima, Yusuke Iwasawa, Yutaka Matsuo · 10 Oct 2025

Comprehensiveness Metrics for Automatic Evaluation of Factual Recall in Text Generation
Adam Dejl, James Barry, Alessandra Pascale, Javier Carnerero-Cano · HILM, ELM · 09 Oct 2025

PrismGS: Physically-Grounded Anti-Aliasing for High-Fidelity Large-Scale 3D Gaussian Splatting
Houqiang Zhong, Zhenglong Wu, Sihua Fu, Zihan Zheng, Xin Jin, X. Zhang, Li Song, Q. Hu · 3DGS · 09 Oct 2025

LeMAJ (Legal LLM-as-a-Judge): Bridging Legal Reasoning and LLM Evaluation
Joseph Enguehard, Morgane Van Ermengem, Kate Atkinson, Sujeong Cha, Arijit Ghosh Chowdhury, ..., Jeremy Roghair, Hannah R Marlowe, Carina Suzana Negreanu, Kitty Boxall, Diana Mincu · AILaw, ELM · 08 Oct 2025

Distributional Semantics Tracing: A Framework for Explaining Hallucinations in Large Language Models
Gagan Bhatia, Somayajulu G Sripada, Kevin Allan, Jacobo Azcona · HILM, LRM · 07 Oct 2025

When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection with PsiloQA
Elisei Rykov, Kseniia Petrushina, Maksim Savkin, Valerii Olisov, Artem Vazhentsev, Kseniia Titova, Ilseyar Alimova, Vasily Konovalov, Julia Belikova · HILM · 06 Oct 2025

The Geometry of Truth: Layer-wise Semantic Dynamics for Hallucination Detection in Large Language Models
Amir Hameed Mir · HILM · 06 Oct 2025

Sample, Align, Synthesize: Graph-Based Response Synthesis with ConGrs
Sayan Ghosh, Shahzaib Saqib Warraich, Dhruv Tarsadiya, Gregory Yauney, Swabha Swayamdipta · 03 Oct 2025

Reward Models are Metrics in a Trench Coat
Sebastian Gehrmann · 03 Oct 2025

Knowledge-Graph Based RAG System Evaluation Framework
Sicheng Dong, Vahid Zolfaghari, Nenad Petrovic, Alois C. Knoll · 02 Oct 2025

TraceDet: Hallucination Detection from the Decoding Trace of Diffusion Large Language Models
Shenxu Chang, Junchi Yu, Weixing Wang, Yongqiang Chen, Jialin Yu, Philip Torr, Jindong Gu · HILM · 30 Sep 2025

KnowGuard: Knowledge-Driven Abstention for Multi-Round Clinical Reasoning
Xilin Dang, Kexin Chen, Xiaorui Su, Ayush Noori, Inaki Arango, Lucas Vittor, Xinyi Long, Yuyang Du, Marinka Zitnik, Pheng-Ann Heng · 29 Sep 2025

Knowledge-Level Consistency Reinforcement Learning: Dual-Fact Alignment for Long-Form Factuality
Junliang Li, Yucheng Wang, Yan Chen, Yu Ran, Ruiqing Zhang, Jing Liu, H. Wu, Haifeng Wang · OffRL, HILM · 28 Sep 2025

EduVidQA: Generating and Evaluating Long-form Answers to Student Questions based on Lecture Videos
Sourjyadip Ray, Shubham Sharma, Somak Aditya, Pawan Goyal · AI4Ed · 28 Sep 2025

Detecting Corpus-Level Knowledge Inconsistencies in Wikipedia with Large Language Models
Sina J. Semnani, Jirayu Burapacheep, Arpandeep Khatua, Thanawan Atchariyachanvanit, Zheng Wang, M. Lam · KELM · 27 Sep 2025

Fine-Grained Detection of Context-Grounded Hallucinations Using LLMs
Yehonatan Peisakhovsky, Zorik Gekhman, Y. Mass, Liat Ein-Dor, Roi Reichart · HILM · 26 Sep 2025

Comparative Personalization for Multi-document Summarization
Haoyuan Li, Snigdha Chaturvedi · 25 Sep 2025

Concise and Sufficient Sub-Sentence Citations for Retrieval-Augmented Generation
Guo Chen, Qiuyuan Li, Qiuxian Li, Hongliang Dai, Xiang Chen, Piji Li · 3DV, HILM · 25 Sep 2025

Memory in Large Language Models: Mechanisms, Evaluation and Evolution
D. Zhang, Wendong Li, Kani Song, Jiaye Lu, Gang Li, Liuchun Yang, Sheng Li · KELM · 23 Sep 2025

LLM-based Agents Suffer from Hallucinations: A Survey of Taxonomy, Methods, and Directions
Xixun Lin, Yucheng Ning, Jingwen Zhang, Yan Dong, Y. Liu, ..., Bin Wang, Yanan Cao, Kai-xiang Chen, Songlin Hu, Li Guo · LLMAG, LRM · 23 Sep 2025