FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
23 May 2023
Sewon Min
Kalpesh Krishna
Xinxi Lyu
M. Lewis
Anuj Kumar
Pang Wei Koh
Mohit Iyyer
Luke Zettlemoyer
Hannaneh Hajishirzi
HILM, ALM
arXiv: 2305.14251

Papers citing "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"

50 / 608 papers shown
FactPICO: Factuality Evaluation for Plain Language Summarization of Medical Evidence
Sebastian Antony Joseph
Lily Chen
Jan Trienes
Hannah Louisa Göke
Monika Coers
Wei Xu
Byron C. Wallace
Junyi Jessy Li
LM&MA, HILM
153
21
0
18 Feb 2024
Fine-grained and Explainable Factuality Evaluation for Multimodal Summarization
Liqiang Jing
Jingxuan Zuo
Yue Zhang
271
13
0
18 Feb 2024
KnowTuning: Knowledge-aware Fine-tuning for Large Language Models
Yougang Lyu
Lingyong Yan
Shuaiqiang Wang
Haibo Shi
D. Yin
Sudipta Singha Roy
Zhumin Chen
Maarten de Rijke
Zhaochun Ren
175
10
0
17 Feb 2024
GenRES: Rethinking Evaluation for Generative Relation Extraction in the Era of Large Language Models
Pengcheng Jiang
Jiacheng Lin
Zifeng Wang
Jimeng Sun
Jiawei Han
123
9
0
16 Feb 2024
Comparing Hallucination Detection Metrics for Multilingual Generation
Haoqiang Kang
Terra Blevins
Luke Zettlemoyer
HILM
259
24
0
16 Feb 2024
Measuring and Reducing LLM Hallucination without Gold-Standard Answers
Jiaheng Wei
Yuanshun Yao
Jean-François Ton
Hongyi Guo
Andrew Estornell
Yang Liu
HILM
301
36
0
16 Feb 2024
Rowen: Adaptive Retrieval-Augmented Generation for Hallucination Mitigation in LLMs
Hanxing Ding
Liang Pang
Zihao Wei
Huawei Shen
Xueqi Cheng
HILM, RALM
374
26
0
16 Feb 2024
Language Models with Conformal Factuality Guarantees
Christopher Mohri
Tatsunori Hashimoto
HILM
402
75
0
15 Feb 2024
Do LLMs Know about Hallucination? An Empirical Investigation of LLM's Hidden States
Hanyu Duan
Yi Yang
Kar Yan Tam
HILM
154
47
0
15 Feb 2024
Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation
Xiaoying Zhang
Baolin Peng
Ye Tian
Jingyan Zhou
Lifeng Jin
Linfeng Song
Haitao Mi
Chao Yang
HILM
205
90
0
14 Feb 2024
Into the Unknown: Self-Learning Large Language Models
Teddy Ferdinan
Jan Kocoń
P. Kazienko
266
5
0
14 Feb 2024
InstructGraph: Boosting Large Language Models via Graph-centric Instruction Tuning and Preference Alignment
Jianing Wang
Junda Wu
Yupeng Hou
Yao Liu
Ming Gao
Julian McAuley
206
52
0
13 Feb 2024
Towards Faithful and Robust LLM Specialists for Evidence-Based Question-Answering
Tobias Schimanski
Jingwei Ni
Mathias Kraus
Elliott Ash
Markus Leippold
187
11
0
13 Feb 2024
Is it safe to cross? Interpretable Risk Assessment with GPT-4V for Safety-Aware Street Crossing
Hochul Hwang
Sunjae Kwon
Yekyung Kim
Donghyun Kim
126
16
0
09 Feb 2024
Calibrating Long-form Generations from Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Yukun Huang
Yixin Liu
Raghuveer Thirukovalluru
Arman Cohan
Bhuwan Dhingra
173
22
0
09 Feb 2024
Large Language Models: A Survey
Shervin Minaee
Tomas Mikolov
Narjes Nikzad
M. Asgari-Chenaghlu
R. Socher
Xavier Amatriain
Jianfeng Gao
ALM, LM&MA, ELM
703
732
0
09 Feb 2024
Merging Facts, Crafting Fallacies: Evaluating the Contradictory Nature of Aggregated Factual Claims in Long-Form Generations
Cheng-Han Chiang
Hung-yi Lee
HILM
283
13
0
08 Feb 2024
Training Language Models to Generate Text with Citations via Fine-grained Rewards
Chengyu Huang
Zeqiu Wu
Yushi Hu
Wenya Wang
HILM, LRM
212
41
0
06 Feb 2024
Factuality of Large Language Models in the Year 2024
Yuxia Wang
Minghan Wang
Muhammad Arslan Manzoor
Fei Liu
Georgi Georgiev
Rocktim Jyoti Das
Preslav Nakov
LRM, HILM
194
7
0
04 Feb 2024
How well do LLMs cite relevant medical references? An evaluation framework and analyses
Kevin Wu
Eric Wu
Ally Cassasola
Angela Zhang
Kevin Wei
Teresa Nguyen
Sith Riantawan
Patricia Shi Riantawan
Daniel E. Ho
James Zou
LM&MA, ELM, AI4MH
225
41
0
03 Feb 2024
Rethinking the Role of Proxy Rewards in Language Model Alignment
Sungdong Kim
Minjoon Seo
SyDa, ALM
216
5
0
02 Feb 2024
A Survey on Hallucination in Large Vision-Language Models
Hanchao Liu
Wenyuan Xue
Yifei Chen
Dapeng Chen
Xiutian Zhao
Ke Wang
Liping Hou
Rong-Zhi Li
Wei Peng
LRM, MLLM
257
233
0
01 Feb 2024
Corrective Retrieval Augmented Generation
Shi-Qi Yan
Jia-Chen Gu
Yun Zhu
Zhen-Hua Ling
RALM
404
133
0
29 Jan 2024
PROXYQA: An Alternative Framework for Evaluating Long-Form Text Generation with Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Haochen Tan
Zhijiang Guo
Zhan Shi
Lu Xu
Zhili Liu
...
Xiaoguang Li
Yasheng Wang
Lifeng Shang
Qun Liu
Linqi Song
195
20
0
26 Jan 2024
K-QA: A Real-World Medical Q&A Benchmark
Workshop on Biomedical Natural Language Processing (BioNLP), 2024
Itay Manes
Naama Ronn
David Cohen
Ran Ilan Ber
Zehavi Horowitz-Kugler
Gabriel Stanovsky
LM&MA, HILM, AI4MH
179
21
0
25 Jan 2024
Code Prompting Elicits Conditional Reasoning Abilities in Text+Code LLMs
Haritz Puerto
Martin Tutek
Somak Aditya
Xiaodan Zhu
Iryna Gurevych
ReCod, ReLM, LRM
275
16
0
18 Jan 2024
RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture
M. A. D. L. Balaguer
Vinamra Benara
Renato Luiz de Freitas Cunha
Roberto de M. Estevao Filho
Todd Hendry
...
Morris Sharp
B. Silva
Swati Sharma
Vijay Aski
Ranveer Chandra
FaML
353
136
0
16 Jan 2024
Leveraging Large Language Models for NLG Evaluation: Advances and Challenges
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Zhen Li
Xiaohan Xu
Tao Shen
Can Xu
Jia-Chen Gu
Yuxuan Lai
Chongyang Tao
Shuai Ma
LM&MA, ELM
301
32
0
13 Jan 2024
Fine-grained Hallucination Detection and Editing for Language Models
Abhika Mishra
Akari Asai
Vidhisha Balachandran
Yizhong Wang
Graham Neubig
Yulia Tsvetkov
Hannaneh Hajishirzi
HILM
283
124
0
12 Jan 2024
EASYTOOL: Enhancing LLM-based Agents with Concise Tool Instruction
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Siyu Yuan
Kaitao Song
Jiangjie Chen
Xu Tan
Yongliang Shen
Ren Kan
Dongsheng Li
Deqing Yang
LLMAG
198
105
0
11 Jan 2024
Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems
Tianyu Cui
Yanling Wang
Chuanpu Fu
Yong Xiao
Sijia Li
...
Junwu Xiong
Xinyu Kong
ZuJie Wen
Ke Xu
Qi Li
260
87
0
11 Jan 2024
LightHouse: A Survey of AGI Hallucination
Feng Wang
LRM, HILM, VLM
256
4
0
08 Jan 2024
Large Language Models for Social Networks: Applications, Challenges, and Solutions
Jingying Zeng
Richard Huang
Waleed Malik
Langxuan Yin
Bojan Babic
Danny Shacham
Xiao Yan
Jaewon Yang
Qi He
158
11
0
04 Jan 2024
Large Legal Fictions: Profiling Legal Hallucinations in Large Language Models
Journal of Legal Analysis (JLA), 2024
Matthew Dahl
Varun Magesh
Mirac Suzgun
Daniel E. Ho
HILM, AILaw
345
143
0
02 Jan 2024
Do Androids Know They're Only Dreaming of Electric Sheep?
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Sky CH-Wang
Benjamin Van Durme
Jason Eisner
Chris Kedzie
HILM
230
52
0
28 Dec 2023
Alleviating Hallucinations of Large Language Models through Induced Hallucinations
Yue Zhang
Leyang Cui
Wei Bi
Shuming Shi
HILM
251
73
0
25 Dec 2023
DSPy Assertions: Computational Constraints for Self-Refining Language Model Pipelines
Arnav Singhvi
Manish Shetty
Shangyin Tan
Christopher Potts
Koushik Sen
Matei A. Zaharia
Omar Khattab
234
30
0
20 Dec 2023
Towards Verifiable Text Generation with Evolving Memory and Self-Reflection
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Hao Sun
Hengyi Cai
Bo Wang
Yingyan Hou
Xiaochi Wei
Shuaiqiang Wang
Yan Zhang
D. Yin
311
16
0
14 Dec 2023
Evaluating Large Language Models for Health-related Queries with Presuppositions
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Navreet Kaur
Monojit Choudhury
Danish Pruthi
HILM, ELM
190
11
0
14 Dec 2023
Alignment for Honesty
Neural Information Processing Systems (NeurIPS), 2023
Yuqing Yang
Ethan Chern
Xipeng Qiu
Graham Neubig
Pengfei Liu
205
57
0
12 Dec 2023
Dense X Retrieval: What Retrieval Granularity Should We Use?
Tong Chen
Hongwei Wang
Sihao Chen
Wenhao Yu
Kaixin Ma
Xinran Zhao
Hongming Zhang
Dong Yu
211
68
0
11 Dec 2023
User Modeling in the Era of Large Language Models: Current Research and Future Directions
Zhaoxuan Tan
Meng Jiang
287
19
0
11 Dec 2023
ChatGPT's One-year Anniversary: Are Open-Source Large Language Models Catching up?
Hailin Chen
Fangkai Jiao
Xingxuan Li
Chengwei Qin
Mathieu Ravaut
Ruochen Zhao
Caiming Xiong
Shafiq Joty
ELM, CLL, AI4MH, LRM, ALM
252
31
0
28 Nov 2023
RELIC: Investigating Large Language Model Responses using Self-Consistency
International Conference on Human Factors in Computing Systems (CHI), 2023
Furui Cheng
Vilém Zouhar
Simran Arora
Mrinmaya Sachan
Hendrik Strobelt
Mennatallah El-Assady
HILM
248
40
0
28 Nov 2023
A Survey of the Evolution of Language Model-Based Dialogue Systems: Data, Task and Models
Hongru Wang
Lingzhi Wang
Yiming Du
Liang Chen
Jing Zhou
Yufei Wang
Kam-Fai Wong
LRM
378
29
0
28 Nov 2023
Deficiency of Large Language Models in Finance: An Empirical Examination of Hallucination
Haoqiang Kang
Xiao-Yang Liu
RALM
171
50
0
27 Nov 2023
UHGEval: Benchmarking the Hallucination of Chinese Large Language Models via Unconstrained Generation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Xun Liang
Shichao Song
Pengnian Qi
Zhiyu Li
Feiyu Xiong
...
Zhaohui Wy
Dawei He
Peng Cheng
Zhonghao Wang
Haiying Deng
HILM
228
33
0
26 Nov 2023
Enhancing Uncertainty-Based Hallucination Detection with Stronger Focus
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Tianhang Zhang
Lin Qiu
Qipeng Guo
Cheng Deng
Yue Zhang
Zheng Zhang
Cheng Zhou
Xinbing Wang
Luoyi Fu
HILM
242
86
0
22 Nov 2023
Deceptive Semantic Shortcuts on Reasoning Chains: How Far Can Models Go without Hallucination?
Bangzheng Li
Ben Zhou
Fei Wang
Xingyu Fu
Dan Roth
Muhao Chen
HILMLRM
226
28
0
16 Nov 2023
DocLens: Multi-aspect Fine-grained Evaluation for Medical Text Generation
Yiqing Xie
Sheng Zhang
Hao Cheng
Pengfei Liu
Zelalem Gero
Cliff Wong
Tristan Naumann
Hoifung Poon
Carolyn Rose
MedIm
215
13
0
16 Nov 2023