FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
23 May 2023
Sewon Min
Kalpesh Krishna
Xinxi Lyu
M. Lewis
Anuj Kumar
Pang Wei Koh
Mohit Iyyer
Luke Zettlemoyer
Hannaneh Hajishirzi
HILM, ALM

Papers citing "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"

50 / 608 papers shown
The Psychology of Falsehood: A Human-Centric Survey of Misinformation Detection
Arghodeep Nandi
Megha Sundriyal
Euna Mehnaz Khan
Jikai Sun
Emily Vraga
Jaideep Srivastava
Tanmoy Chakraborty
OffRL, HILM
112
0
0
19 Sep 2025
Quantifying Self-Awareness of Knowledge in Large Language Models
Yeongbin Seo
Dongha Lee
Jinyoung Yeo
HILM
76
0
0
18 Sep 2025
DSCC-HS: A Dynamic Self-Reinforcing Framework for Hallucination Suppression in Large Language Models
Xiao Zheng
HILM
96
0
0
17 Sep 2025
HalluDetect: Detecting, Mitigating, and Benchmarking Hallucinations in Conversational Systems in the Legal Domain
Spandan Anaokar
Shrey Ganatra
Harshvivek Kashid
Swapnil Bhattacharyya
Shruti Nair
Reshma Sekhar
Siddharth Manohar
Rahul Hemrajani
Pushpak Bhattacharyya
HILM
124
0
0
15 Sep 2025
Small Models, Big Results: Achieving Superior Intent Extraction through Decomposition
Danielle Cohen
Yoni Halpern
Noam Kahlon
Joel Oren
Omri Berkovitch
Sapir Caduri
Ido Dagan
Anatoly Efros
72
0
0
15 Sep 2025
Introducing Spotlight: A Novel Approach for Generating Captivating Key Information from Documents
Ankan Mullick
Sombit Bose
Rounak Saha
Ayan Kumar Bhowmick
Aditya Vempaty
Prasenjit Dey
Ravi Kokku
Pawan Goyal
Niloy Ganguly
148
1
0
13 Sep 2025
Unsupervised Hallucination Detection by Inspecting Reasoning Processes
Ponhvoan Srey
Xiaobao Wu
Anh Tuan Luu
HILM
88
0
0
12 Sep 2025
LLM Ensemble for RAG: Role of Context Length in Zero-Shot Question Answering for BioASQ Challenge
Dima Galat
Diego Mollá Aliod
57
2
0
10 Sep 2025
SimpleQA Verified: A Reliable Factuality Benchmark to Measure Parametric Knowledge
Lukas Haas
Gal Yona
Giovanni D'Antonio
Sasha Goldshtein
Dipanjan Das
HILM, ALM
41
5
0
09 Sep 2025
Instance-level Performance Prediction for Long-form Generation Tasks
Chi-Yang Hsu
Alexander Braylan
Yiheng Su
Omar Alonso
Matthew Lease
88
0
0
09 Sep 2025
ZhiFangDanTai: Fine-tuning Graph-based Retrieval-Augmented Generation Model for Traditional Chinese Medicine Formula
ZiXuan Zhang
Bowen Hao
Yingjie Li
Hongzhi Yin
53
0
0
06 Sep 2025
Chatbot To Help Patients Understand Their Health
Won Seok Jang
Hieu Tran
Manav Mistry
SaiKiran Gandluri
Yifan Zhang
Sharmin Sultana
Sunjae Kwon
Yuan-kang Zhang
Zonghai Yao
Hong-ye Yu
AI4MH, LM&MA
127
0
0
06 Sep 2025
FActBench: A Benchmark for Fine-grained Automatic Evaluation of LLM-Generated Text in the Medical Domain
Anum Afzal
Juraj Vladika
Florian Matthes
HILM, LRM
57
0
0
02 Sep 2025
Can Smaller LLMs do better? Unlocking Cross-Domain Potential through Parameter-Efficient Fine-Tuning for Text Summarization
Anum Afzal
Mehul Kumawat
Florian Matthes
ALM
68
1
0
01 Sep 2025
Enhancing Health Fact-Checking with LLM-Generated Synthetic Data
Jingze Zhang
Jiahe Qian
Yiliang Zhou
Yifan Peng
SyDa, HILM, MedIm
90
0
0
28 Aug 2025
Real-Time Detection of Hallucinated Entities in Long-Form Generation
Oscar Obeso
Andy Arditi
Javier Ferrando
Joshua Freeman
Cameron Holmes
Neel Nanda
HILM
145
5
0
26 Aug 2025
If We May De-Presuppose: Robustly Verifying Claims through Presupposition-Free Question Decomposition
Shubhashis Roy Dipta
Francis Ferraro
AAML
89
0
0
22 Aug 2025
Identifying and Answering Questions with False Assumptions: An Interpretable Approach
Zijie Wang
Eduardo Blanco
HILM
156
0
0
21 Aug 2025
Attribution, Citation, and Quotation: A Survey of Evidence-based Text Generation with Large Language Models
Tobias Schreieder
Tim Schopf
Michael Färber
HILM
96
1
0
21 Aug 2025
SurGE: A Benchmark and Evaluation Framework for Scientific Survey Generation
Weihang Su
Anzhe Xie
Jiaxin Mao
Jianming Long
Jiaxin Mao
Ziyi Ye
Yiqun Liu
118
0
0
21 Aug 2025
Are Checklists Really Useful for Automatic Evaluation of Generative Tasks?
Momoka Furuhashi
Kouta Nakayama
Takashi Kodama
Saku Sugawara
ALM, ELM
124
1
0
21 Aug 2025
LongRecall: A Structured Approach for Robust Recall Evaluation in Long-Form Text
MohammadJavad Ardestani
Ehsan Kamalloo
Davood Rafiei
84
1
0
20 Aug 2025
TracSum: A New Benchmark for Aspect-Based Summarization with Sentence-Level Traceability in Medical Domain
Bohao Chu
Meijie Li
Sameh Frihat
Chengyu Gu
Georg Lodde
Elisabeth Livingstone
Norbert Fuhr
HILM
80
0
0
19 Aug 2025
DiFaR: Enhancing Multimodal Misinformation Detection with Diverse, Factual, and Relevant Rationales
Herun Wan
Jiaying Wu
Minnan Luo
Xiangzheng Kong
Zihan Ma
Zhi Zeng
84
1
0
14 Aug 2025
Hide or Highlight: Understanding the Impact of Factuality Expression on User Trust
Hyo Jin Do
Werner Geyer
HILM
70
0
0
09 Aug 2025
FAITH: A Framework for Assessing Intrinsic Tabular Hallucinations in Finance
Mengao Zhang
Jiayu Fu
Tanya Warrier
Yuwen Wang
Tianhui Tan
Ke-wei Huang
72
1
0
07 Aug 2025
Learning to Reason for Factuality
Xilun Chen
Ilia Kulikov
Vincent-Pierre Berges
Barlas Oğuz
Rulin Shao
Gargi Ghosh
Jason Weston
Anuj Kumar
OffRL, HILM, LRM
125
6
0
07 Aug 2025
The SMeL Test: A simple benchmark for media literacy in language models
Gustaf Ahdritz
Anat Kleiman
189
0
0
04 Aug 2025
LMAR: Language Model Augmented Retriever for Domain-specific Knowledge Indexing
Yao Zhao
Yantian Ding
Zhiyue Zhang
Dapeng Yao
Yanxun Xu
RALM
227
1
0
04 Aug 2025
CUPID: Evaluating Personalized and Contextualized Alignment of LLMs from Interactions
Tae Soo Kim
Yoonjoo Lee
Yoonah Park
Jiho Kim
Young-Ho Kim
Juho Kim
154
1
0
03 Aug 2025
FACTORY: A Challenging Human-Verified Prompt Set for Long-Form Factuality
Mingda Chen
Yang Li
Xilun Chen
Adina Williams
Gargi Ghosh
Scott Yih
HILM, ALM
128
2
0
31 Jul 2025
Investigating Hallucination in Conversations for Low Resource Languages
A. Das
M. Hasan
Souvika Sarkar
Zheng Zhang
Fatemeh Jamshidi
Tathagata Bhattacharya
Nilanjana Raychawdhury
Dongji Feng
Vinija Jain
Vasu Sharma
HILM
251
0
0
30 Jul 2025
CUS-QA: Local-Knowledge-Oriented Open-Ended Question Answering Dataset
Jindřich Libovický
Jindřich Helcl
Andrei-Alexandru Manea
Gianluca Vico
144
1
0
30 Jul 2025
Towards a rigorous evaluation of RAG systems: the challenge of due diligence
Grégoire Martinon
Alexandra Lorenzo de Brionne
Jérôme Bohard
Antoine Lojou
Damien Hervault
Nicolas Brunel
152
1
0
29 Jul 2025
MIRAGE-Bench: LLM Agent is Hallucinating and Where to Find Them
Weichen Zhang
Yiyou Sun
Pohao Huang
Jiayue Pu
Heyue Lin
Dawn Song
LLMAG, HILM
155
0
0
28 Jul 2025
Enhancing Hallucination Detection via Future Context
J. H. Lee
Cheonbok Park
Hwiyeol Jo
Jeonghoon Kim
Joonsuk Park
Kang Min Yoo
HILM
88
0
0
28 Jul 2025
Mitigating Geospatial Knowledge Hallucination in Large Language Models: Benchmarking and Dynamic Factuality Aligning
Shengyuan Wang
J. Feng
Tianhui Liu
Dan Pei
Yong Li
HILM
121
0
0
25 Jul 2025
DxHF: Providing High-Quality Human Feedback for LLM Alignment via Interactive Decomposition
ACM Symposium on User Interface Software and Technology (UIST), 2025
Danqing Shi
Furui Cheng
Tino Weinkauf
Antti Oulasvirta
Mennatallah El-Assady
106
1
0
24 Jul 2025
SynthTextEval: Synthetic Text Data Generation and Evaluation for High-Stakes Domains
Krithika Ramesh
Daniel Smolyak
Zihao Zhao
Nupoor Gandhi
Ritu Agarwal
Margrét V. Bjarnadóttir
Anjalie Field
SyDa, ELM
362
1
0
09 Jul 2025
MedVAL: Toward Expert-Level Medical Text Validation with Language Models
Asad Aali
Vasiliki Bikia
M. Varma
Nicole Chiou
Sophie Ostmeier
...
R. Daneshjou
Jason Hom
Sanmi Koyejo
Emily Alsentzer
Akshay Chaudhari
LM&MA, ELM
185
2
0
03 Jul 2025
KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality
Baochang Ren
Shuofei Qiao
Da Zheng
Huajun Chen
Ningyu Zhang
OffRL, LRM
168
5
0
24 Jun 2025
Chain-of-Thought Prompting Obscures Hallucination Cues in Large Language Models: An Empirical Evaluation
Jiahao Cheng
Tiancheng Su
Jia Yuan
Guoxiu He
Jiawei Liu
Xinqi Tao
Jingwen Xie
Huaxia Li
HILM, LRM
260
8
0
20 Jun 2025
MoR: Better Handling Diverse Queries with a Mixture of Sparse, Dense, and Human Retrievers
Jushaan Singh Kalra
Xinran Zhao
To Eun Kim
Fengyu Cai
Fernando Diaz
Tongshuang Wu
VLM
238
0
0
18 Jun 2025
MinosEval: Distinguishing Factoid and Non-Factoid for Tailored Open-Ended QA Evaluation with LLMs
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Yongqi Fan
Yating Wang
Guandong Wang
Jie Zhai
Jingping Liu
Qi Ye
Tong Ruan
119
0
0
18 Jun 2025
A Vision for Geo-Temporal Deep Research Systems: Towards Comprehensive, Transparent, and Reproducible Geo-Temporal Information Synthesis
Bruno Martins
Piotr Szymański
Piotr Gramacki
139
0
0
17 Jun 2025
GenerationPrograms: Fine-grained Attribution with Executable Programs
David Wan
Eran Hirsch
Elias Stengel-Eskin
Ido Dagan
Mohit Bansal
223
0
0
17 Jun 2025
How Grounded is Wikipedia? A Study on Structured Evidential Support and Retrieval
William Walden
Kathryn Ricci
Miriam Wanner
Zhengping Jiang
Chandler May
Rongkun Zhou
Benjamin Van Durme
HILM
120
0
0
14 Jun 2025
RealFactBench: A Benchmark for Evaluating Large Language Models in Real-World Fact-Checking
Shuo Yang
Yuqin Dai
Guoqing Wang
Xinran Zheng
Jinfeng Xu
Jinze Li
ZhenZhe Ying
Weiqiang Wang
Edith C.-H. Ngai
HILM, LRM
128
6
0
14 Jun 2025
Improving Factuality for Dialogue Response Generation via Graph-Based Knowledge Augmentation
Xiangyan Chen
Yujian Gan
Yimeng Gu
Matthew Purver
HILM
187
1
0
14 Jun 2025
KG-Infused RAG: Augmenting Corpus-Based RAG with External Knowledge Graphs
Dingjun Wu
Y. Yan
Zhenghao Liu
Zhiyuan Liu
Maosong Sun
209
2
0
11 Jun 2025