2305.14251
FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
23 May 2023
Sewon Min
Kalpesh Krishna
Xinxi Lyu
M. Lewis
Anuj Kumar
Pang Wei Koh
Mohit Iyyer
Luke Zettlemoyer
Hannaneh Hajishirzi
HILM
ALM
Papers citing
"FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"
50 / 615 papers shown
HalluDial: A Large-Scale Benchmark for Automatic Dialogue-Level Hallucination Evaluation
Wen Luo
Tianshu Shen
Wei Li
Guangyue Peng
Richeng Xuan
Houfeng Wang
Xi Yang
HILM
298
25
0
11 Jun 2024
Post-Hoc Answer Attribution for Grounded and Trustworthy Long Document Comprehension: Task, Insights, and Challenges
Abhilasha Sancheti
Koustava Goswami
Balaji Vasan Srinivasan
RALM
266
4
0
11 Jun 2024
A Probabilistic Framework for LLM Hallucination Detection via Belief Tree Propagation
Bairu Hou
Yang Zhang
Jacob Andreas
Shiyu Chang
299
18
0
11 Jun 2024
Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning
Joongwon Kim
Bhargavi Paranjape
Tushar Khot
Hannaneh Hajishirzi
LM&Ro
ELM
LLMAG
LRM
253
11
0
10 Jun 2024
Verifiable Generation with Subsentence-Level Fine-Grained Citations
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Shuyang Cao
Lu Wang
300
11
0
10 Jun 2024
Investigating and Addressing Hallucinations of LLMs in Tasks Involving Negation
Neeraj Varshney
Satyam Raj
Venkatesh Mishra
Agneet Chatterjee
Ritika Sarkar
Amir Saeidi
Chitta Baral
LRM
265
22
0
08 Jun 2024
WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild
International Conference on Learning Representations (ICLR), 2025
Bill Yuchen Lin
Yuntian Deng
Khyathi Chandu
Faeze Brahman
Abhilasha Ravichander
Valentina Pyatkin
Nouha Dziri
Ronan Le Bras
Yejin Choi
268
139
0
07 Jun 2024
MAIRA-2: Grounded Radiology Report Generation
Shruthi Bannur
Kenza Bouzid
Daniel Coelho De Castro
Anton Schwaighofer
Sam Bond-Taylor
...
Anja Thieme
M. Lungren
Maria T. A. Wetscherek
Javier Alvarez-Valle
Stephanie L. Hyland
220
102
0
06 Jun 2024
PaCE: Parsimonious Concept Engineering for Large Language Models
Neural Information Processing Systems (NeurIPS), 2024
Jinqi Luo
Tianjiao Ding
Kwan Ho Ryan Chan
D. Thaker
Aditya Chattopadhyay
Chris Callison-Burch
René Vidal
CVBM
260
16
0
06 Jun 2024
AI Agents Under Threat: A Survey of Key Security Challenges and Future Pathways
Zehang Deng
Yongjian Guo
Changzhou Han
Wanlun Ma
Junwu Xiong
Sheng Wen
Yang Xiang
392
125
0
04 Jun 2024
Safeguarding Large Language Models: A Survey
Yi Dong
Ronghui Mu
Yanghao Zhang
Siqi Sun
Tianle Zhang
...
Yi Qi
Jinwei Hu
Jie Meng
Saddek Bensalem
Xiaowei Huang
OffRL
KELM
AILaw
254
68
0
03 Jun 2024
When Can LLMs Actually Correct Their Own Mistakes? A Critical Survey of Self-Correction of LLMs
Ryo Kamoi
Yusen Zhang
Nan Zhang
Jiawei Han
Rui Zhang
LRM
381
150
0
03 Jun 2024
CtrlA: Adaptive Retrieval-Augmented Generation via Probe-Guided Control
Huanshuo Liu
Hao Zhang
Zhijiang Guo
Kuicai Dong
Xiangyang Li
Yi Quan Lee
Cong Zhang
Yong Liu
3DV
268
6
0
29 May 2024
Nearest Neighbor Speculative Decoding for LLM Generation and Attribution
Minghan Li
Xilun Chen
Ari Holtzman
Beidi Chen
Jimmy Lin
Anuj Kumar
Xi Lin
RALM
BDL
707
21
0
29 May 2024
TimeChara: Evaluating Point-in-Time Character Hallucination of Role-Playing Large Language Models
Jaewoo Ahn
Taehyun Lee
Junyoung Lim
Jin-Hwa Kim
Sangdoo Yun
Hwaran Lee
Gunhee Kim
LLMAG
HILM
246
19
0
28 May 2024
GRAG: Graph Retrieval-Augmented Generation
Yuntong Hu
Zhihan Lei
Zhengwu Zhang
Bo Pan
Chen Ling
Liang Zhao
510
77
0
26 May 2024
Accelerating Inference of Retrieval-Augmented Generation via Sparse Context Selection
Yun Zhu
Jia-Chen Gu
Caitlin Sikora
Ho Ko
Yinxiao Liu
...
Lei Shu
Liangchen Luo
Lei Meng
Bang Liu
Jindong Chen
RALM
247
24
0
25 May 2024
Certifiably Robust RAG against Retrieval Corruption
Chong Xiang
Tong Wu
Zexuan Zhong
David Wagner
Danqi Chen
Prateek Mittal
SILM
305
88
0
24 May 2024
AGRaME: Any-Granularity Ranking with Multi-Vector Embeddings
R. Reddy
Omar Attia
Yunyao Li
Heng Ji
Saloni Potdar
146
1
0
23 May 2024
RefChecker: Reference-based Fine-grained Hallucination Checker and Benchmark for Large Language Models
Xiangkun Hu
Dongyu Ru
Lin Qiu
Qipeng Guo
Tianhang Zhang
Yang Xu
Yun Luo
Pengfei Liu
Yue Zhang
Zheng Zhang
HILM
LRM
267
18
0
23 May 2024
Can LLMs Solve longer Math Word Problems Better?
International Conference on Learning Representations (ICLR), 2025
Xin Xu
Tong Xiao
Zitong Chao
Zhenya Huang
Can Yang
Yang Wang
540
24
0
23 May 2024
CrossCheckGPT: Universal Hallucination Ranking for Multimodal Foundation Models
Guangzhi Sun
Potsawee Manakul
Adian Liusie
Kunat Pipatanakul
Chao Zhang
P. Woodland
Mark Gales
HILM
MLLM
226
12
0
22 May 2024
Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation
Gauthier Guinet
Behrooz Omidvar-Tehrani
Hao Ding
Laurent Callot
RALM
277
33
0
22 May 2024
Atomic Self-Consistency for Better Long Form Generations
Raghuveer Thirukovalluru
Yukun Huang
Bhuwan Dhingra
217
12
0
21 May 2024
OLAPH: Improving Factuality in Biomedical Long-form Question Answering
Minbyul Jeong
Hyeon Hwang
Chanwoong Yoon
Taewhoo Lee
Jaewoo Kang
MedIm
HILM
LM&MA
435
19
0
21 May 2024
Large Language Models Meet NLP: A Survey
Libo Qin
Qiguang Chen
Xiachong Feng
Yang Wu
Yongheng Zhang
Hai-Tao Zheng
Min Li
Wanxiang Che
Philip S. Yu
LRM
ALM
LM&MA
ELM
455
119
0
21 May 2024
Question-Based Retrieval using Atomic Units for Enterprise RAG
Vatsal Raina
Mark Gales
132
19
0
20 May 2024
SciQAG: A Framework for Auto-Generated Science Question Answering Dataset with Fine-grained Evaluation
Yuwei Wan
Yixuan Liu
Aswathy Ajith
Clara Grazian
B. Hoex
Wenjie Zhang
Chunyu Kit
Tong Xie
Ian Foster
224
19
0
16 May 2024
LLMs can learn self-restraint through iterative self-reflection
Alexandre Piché
Aristides Milios
Dzmitry Bahdanau
Chris Pal
347
6
0
15 May 2024
Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Zorik Gekhman
G. Yona
Roee Aharoni
Matan Eyal
Amir Feder
Roi Reichart
Jonathan Herzig
405
225
0
09 May 2024
One vs. Many: Comprehending Accurate Information from Multiple Erroneous and Inconsistent AI Generations
Conference on Fairness, Accountability and Transparency (FAccT), 2024
Yoonjoo Lee
Kihoon Son
Tae Soo Kim
Jisu Kim
John Joon Young Chung
Eytan Adar
Juho Kim
230
28
0
09 May 2024
OpenFactCheck: Building, Benchmarking Customized Fact-Checking Systems and Evaluating the Factuality of Claims and LLMs
Yuxia Wang
Minghan Wang
Hasan Iqbal
Georgi Georgiev
Fauzan Farooqui
Preslav Nakov
HILM
415
36
0
09 May 2024
Lory: Fully Differentiable Mixture-of-Experts for Autoregressive Language Model Pre-training
Zexuan Zhong
Mengzhou Xia
Danqi Chen
Mike Lewis
MoE
212
27
0
06 May 2024
Recall Them All: Retrieval-Augmented Language Models for Long Object List Extraction from Long Documents
Sneha Singhania
Simon Razniewski
Gerhard Weikum
RALM
418
2
0
04 May 2024
FLAME: Factuality-Aware Alignment for Large Language Models
Neural Information Processing Systems (NeurIPS), 2024
Sheng-Chieh Lin
Luyu Gao
Barlas Oğuz
Wenhan Xiong
Jimmy Lin
Anuj Kumar
Xilun Chen
HILM
191
41
0
02 May 2024
On the Evaluation of Machine-Generated Reports
James Mayfield
Eugene Yang
Dawn J Lawrie
Sean MacAvaney
Paul McNamee
...
Orion Weller
Efsun Kayi
Kate Sanders
Marc Mason
Noah Hibbler
ALM
332
27
0
02 May 2024
GRAMMAR: Grounded and Modular Methodology for Assessment of Closed-Domain Retrieval-Augmented Language Model
Xinzhe Li
Ming Liu
Shang Gao
RALM
399
0
0
30 Apr 2024
From Matching to Generation: A Survey on Generative Information Retrieval
Xiaoxi Li
Jiajie Jin
Yujia Zhou
Yuyao Zhang
Peitian Zhang
Yutao Zhu
Zhicheng Dou
3DV
548
132
0
23 Apr 2024
ISQA: Informative Factuality Feedback for Scientific Summarization
Zekai Li
Yanxia Qin
Qian Liu
Min-Yen Kan
HILM
240
2
0
20 Apr 2024
AmbigDocs: Reasoning across Documents on Different Entities under the Same Name
Yoonsang Lee
Xi Ye
Eunsol Choi
377
16
0
18 Apr 2024
Unifying Bias and Unfairness in Information Retrieval: A Survey of Challenges and Opportunities with Large Language Models
Sunhao Dai
Chen Xu
Shicheng Xu
Liang Pang
Zhenhua Dong
Jun Xu
303
3
0
17 Apr 2024
FIZZ: Factual Inconsistency Detection by Zoom-in Summary and Zoom-out Document
Joonho Yang
Seunghyun Yoon
Byeongjeong Kim
Hwanhee Lee
HILM
317
13
0
17 Apr 2024
MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents
Liyan Tang
Philippe Laban
Greg Durrett
HILM
SyDa
327
171
0
16 Apr 2024
NoticIA: A Clickbait Article Summarization Dataset in Spanish
Iker García-Ferrero
Begoña Altuna
324
5
0
11 Apr 2024
Best Practices and Lessons Learned on Synthetic Data for Language Models
Ruibo Liu
Jerry W. Wei
Fangyu Liu
Chenglei Si
Yanzhe Zhang
...
Steven Zheng
Daiyi Peng
Diyi Yang
Denny Zhou
Andrew M. Dai
SyDa
EgoV
303
112
0
11 Apr 2024
Pitfalls of Conversational LLMs on News Debiasing
Ipek Baris Schlicht
Defne Altiok
Maryanne Taouk
Lucie Flek
237
4
0
09 Apr 2024
Characterizing Multimodal Long-form Summarization: A Case Study on Financial Reports
Tianyu Cao
Natraj Raman
Danial Dervovic
Chenhao Tan
147
7
0
09 Apr 2024
Know When To Stop: A Study of Semantic Drift in Text Generation
Ava Spataru
Eric Hambro
Elena Voita
Nicola Cancedda
236
9
0
08 Apr 2024
FGAIF: Aligning Large Vision-Language Models with Fine-grained AI Feedback
Liqiang Jing
Xinya Du
382
29
0
07 Apr 2024
PoLLMgraph: Unraveling Hallucinations in Large Language Models via State Transition Dynamics
Derui Zhu
Dingfan Chen
Qing Li
Zongxiong Chen
Lei Ma
Jens Grossklags
Mario Fritz
HILM
203
19
0
06 Apr 2024