2305.14251
FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation
23 May 2023
Sewon Min
Kalpesh Krishna
Xinxi Lyu
M. Lewis
Wen-tau Yih
Pang Wei Koh
Mohit Iyyer
Luke Zettlemoyer
Hannaneh Hajishirzi
HILM
ALM
Papers citing
"FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"
50 / 455 papers shown
When Can LLMs Actually Correct Their Own Mistakes? A Critical Survey of Self-Correction of LLMs
Ryo Kamoi
Yusen Zhang
Nan Zhang
Jiawei Han
Rui Zhang
LRM
40
57
0
03 Jun 2024
CtrlA: Adaptive Retrieval-Augmented Generation via Probe-Guided Control
Huanshuo Liu
Hao Zhang
Zhijiang Guo
Kuicai Dong
Xiangyang Li
Yi Quan Lee
Cong Zhang
Yong-jin Liu
3DV
31
6
0
29 May 2024
Nearest Neighbor Speculative Decoding for LLM Generation and Attribution
Minghan Li
Xilun Chen
Ari Holtzman
Beidi Chen
Jimmy Lin
Wen-tau Yih
Xi Victoria Lin
RALM
BDL
108
10
0
29 May 2024
TimeChara: Evaluating Point-in-Time Character Hallucination of Role-Playing Large Language Models
Jaewoo Ahn
Taehyun Lee
Junyoung Lim
Jin-Hwa Kim
Sangdoo Yun
Hwaran Lee
Gunhee Kim
LLMAG
HILM
35
12
0
28 May 2024
GRAG: Graph Retrieval-Augmented Generation
Yuntong Hu
Zhihan Lei
Zhengwu Zhang
Bo Pan
Chen Ling
Liang Zhao
35
19
0
26 May 2024
Accelerating Inference of Retrieval-Augmented Generation via Sparse Context Selection
Yun Zhu
Jia-Chen Gu
Caitlin Sikora
Ho Ko
Yinxiao Liu
...
Lei Shu
Liangchen Luo
Lei Meng
Bang Liu
Jindong Chen
RALM
22
14
0
25 May 2024
Certifiably Robust RAG against Retrieval Corruption
Chong Xiang
Tong Wu
Zexuan Zhong
David Wagner
Danqi Chen
Prateek Mittal
SILM
25
41
0
24 May 2024
AGRaME: Any-Granularity Ranking with Multi-Vector Embeddings
R. Reddy
Omar Attia
Yunyao Li
Heng Ji
Saloni Potdar
32
1
0
23 May 2024
RefChecker: Reference-based Fine-grained Hallucination Checker and Benchmark for Large Language Models
Xiangkun Hu
Dongyu Ru
Lin Qiu
Qipeng Guo
Tianhang Zhang
Yang Xu
Yun Luo
Pengfei Liu
Yue Zhang
Zheng-Wei Zhang
HILM
LRM
59
8
0
23 May 2024
Can LLMs Solve Longer Math Word Problems Better?
Xin Xu
Tong Xiao
Zitong Chao
Zhenya Huang
Can Yang
Yang Wang
70
10
0
23 May 2024
CrossCheckGPT: Universal Hallucination Ranking for Multimodal Foundation Models
Guangzhi Sun
Potsawee Manakul
Adian Liusie
Kunat Pipatanakul
Chao Zhang
P. Woodland
Mark J. F. Gales
HILM
MLLM
16
7
0
22 May 2024
Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation
Gauthier Guinet
Behrooz Omidvar-Tehrani
Anoop Deoras
Laurent Callot
RALM
62
16
0
22 May 2024
Atomic Self-Consistency for Better Long Form Generations
Raghuveer Thirukovalluru
Yukun Huang
Bhuwan Dhingra
30
5
0
21 May 2024
Large Language Models Meet NLP: A Survey
Libo Qin
Qiguang Chen
Xiachong Feng
Yang Wu
Yongheng Zhang
Yinghui Li
Min Li
Wanxiang Che
Philip S. Yu
ALM
LM&MA
ELM
LRM
38
46
0
21 May 2024
OLAPH: Improving Factuality in Biomedical Long-form Question Answering
Minbyul Jeong
Hyeon Hwang
Chanwoong Yoon
Taewhoo Lee
Jaewoo Kang
MedIm
HILM
LM&MA
38
12
0
21 May 2024
Question-Based Retrieval using Atomic Units for Enterprise RAG
Vatsal Raina
Mark J. F. Gales
27
7
0
20 May 2024
SciQAG: A Framework for Auto-Generated Science Question Answering Dataset with Fine-grained Evaluation
Yuwei Wan
Yixuan Liu
Aswathy Ajith
Clara Grazian
B. Hoex
Wenjie Zhang
Chunyu Kit
Tong Xie
Ian Foster
21
7
0
16 May 2024
LLMs can learn self-restraint through iterative self-reflection
Alexandre Piché
Aristides Milios
Dzmitry Bahdanau
Chris Pal
38
5
0
15 May 2024
Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?
Zorik Gekhman
G. Yona
Roee Aharoni
Matan Eyal
Amir Feder
Roi Reichart
Jonathan Herzig
48
102
0
09 May 2024
OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs
Yuxia Wang
Minghan Wang
Hasan Iqbal
Georgi Georgiev
Jiahui Geng
Preslav Nakov
HILM
36
13
0
09 May 2024
One vs. Many: Comprehending Accurate Information from Multiple Erroneous and Inconsistent AI Generations
Yoonjoo Lee
Kihoon Son
Tae Soo Kim
Jisu Kim
John Joon Young Chung
Eytan Adar
Juho Kim
39
11
0
09 May 2024
Lory: Fully Differentiable Mixture-of-Experts for Autoregressive Language Model Pre-training
Zexuan Zhong
Mengzhou Xia
Danqi Chen
Mike Lewis
MoE
49
15
0
06 May 2024
Recall Them All: Retrieval-Augmented Language Models for Long Object List Extraction from Long Documents
Sneha Singhania
Simon Razniewski
G. Weikum
RALM
34
1
0
04 May 2024
FLAME: Factuality-Aware Alignment for Large Language Models
Sheng-Chieh Lin
Luyu Gao
Barlas Oğuz
Wenhan Xiong
Jimmy Lin
Wen-tau Yih
Xilun Chen
HILM
34
14
0
02 May 2024
On the Evaluation of Machine-Generated Reports
James Mayfield
Eugene Yang
Dawn J Lawrie
Sean MacAvaney
Paul McNamee
...
Orion Weller
Efsun Kayi
Kate Sanders
Marc Mason
Noah Hibbler
ALM
77
12
0
02 May 2024
GRAMMAR: Grounded and Modular Methodology for Assessment of Closed-Domain Retrieval-Augmented Language Model
Xinzhe Li
Ming Liu
Shang Gao
RALM
40
0
0
30 Apr 2024
From Matching to Generation: A Survey on Generative Information Retrieval
Xiaoxi Li
Jiajie Jin
Yujia Zhou
Yuyao Zhang
Peitian Zhang
Yutao Zhu
Zhicheng Dou
3DV
67
45
0
23 Apr 2024
ISQA: Informative Factuality Feedback for Scientific Summarization
Zekai Li
Yanxia Qin
Qian Liu
Min-Yen Kan
HILM
32
1
0
20 Apr 2024
AmbigDocs: Reasoning across Documents on Different Entities under the Same Name
Yoonsang Lee
Xi Ye
Eunsol Choi
38
5
0
18 Apr 2024
Unifying Bias and Unfairness in Information Retrieval: A Survey of Challenges and Opportunities with Large Language Models
Sunhao Dai
Chen Xu
Shicheng Xu
Liang Pang
Zhenhua Dong
Jun Xu
42
59
0
17 Apr 2024
FIZZ: Factual Inconsistency Detection by Zoom-in Summary and Zoom-out Document
Joonho Yang
Seunghyun Yoon
Byeongjeong Kim
Hwanhee Lee
HILM
26
3
0
17 Apr 2024
MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents
Liyan Tang
Philippe Laban
Greg Durrett
HILM
SyDa
37
74
0
16 Apr 2024
NoticIA: A Clickbait Article Summarization Dataset in Spanish
Iker García-Ferrero
Begoña Altuna
37
2
0
11 Apr 2024
Best Practices and Lessons Learned on Synthetic Data for Language Models
Ruibo Liu
Jerry W. Wei
Fangyu Liu
Chenglei Si
Yanzhe Zhang
...
Steven Zheng
Daiyi Peng
Diyi Yang
Denny Zhou
Andrew M. Dai
SyDa
EgoV
41
85
0
11 Apr 2024
Pitfalls of Conversational LLMs on News Debiasing
Ipek Baris Schlicht
Defne Altiok
Maryanne Taouk
Lucie Flek
24
3
0
09 Apr 2024
Characterizing Multimodal Long-form Summarization: A Case Study on Financial Reports
Tianyu Cao
Natraj Raman
Danial Dervovic
Chenhao Tan
35
4
0
09 Apr 2024
Know When To Stop: A Study of Semantic Drift in Text Generation
Ava Spataru
Eric Hambro
Elena Voita
Nicola Cancedda
26
3
0
08 Apr 2024
FGAIF: Aligning Large Vision-Language Models with Fine-grained AI Feedback
Liqiang Jing
Xinya Du
71
17
0
07 Apr 2024
PoLLMgraph: Unraveling Hallucinations in Large Language Models via State Transition Dynamics
Derui Zhu
Dingfan Chen
Qing Li
Zongxiong Chen
Lei Ma
Jens Grossklags
Mario Fritz
HILM
35
8
0
06 Apr 2024
Verifiable by Design: Aligning Language Models to Quote from Pre-Training Data
Jingyu Zhang
Marc Marone
Tianjian Li
Benjamin Van Durme
Daniel Khashabi
85
9
0
05 Apr 2024
Evaluating LLMs at Detecting Errors in LLM Responses
Ryo Kamoi
Sarkar Snigdha Sarathi Das
Renze Lou
Jihyun Janice Ahn
Yilun Zhao
...
Salika Dave
Shaobo Qin
Arman Cohan
Wenpeng Yin
Rui Zhang
42
19
0
04 Apr 2024
PRobELM: Plausibility Ranking Evaluation for Language Models
Moy Yuan
Chenxi Whitehouse
Eric Chamoun
Rami Aly
Andreas Vlachos
81
4
0
04 Apr 2024
HyperCLOVA X Technical Report
Kang Min Yoo
Jaegeun Han
Sookyo In
Heewon Jeon
Jisu Jeong
...
Hyunkyung Noh
Se-Eun Choi
Sang-Woo Lee
Jung Hwa Lim
Nako Sung
VLM
27
8
0
02 Apr 2024
On the Role of Summary Content Units in Text Summarization Evaluation
Marcel Nawrath
Agnieszka Nowak
Tristan Ratz
Danilo C. Walenta
Juri Opitz
...
Sebastian Gehrmann
Saad Mahamood
Miruna Clinciu
Khyathi Raghavi Chandu
Yufang Hou
ELM
21
5
0
02 Apr 2024
AILS-NTUA at SemEval-2024 Task 6: Efficient model tuning for hallucination detection and analysis
Natalia Grigoriadou
Maria Lymperaiou
Giorgos Filandrianos
Giorgos Stamou
VLM
22
0
0
01 Apr 2024
Small Language Models Learn Enhanced Reasoning Skills from Medical Textbooks
Hyunjae Kim
Hyeon Hwang
Jiwoo Lee
Sihyeon Park
Dain Kim
Taewhoo Lee
Chanwoong Yoon
Jiwoong Sohn
Donghee Choi
Jaewoo Kang
ELM
AI4MH
LRM
48
16
0
30 Mar 2024
Is Factuality Decoding a Free Lunch for LLMs? Evaluation on Knowledge Editing Benchmark
Baolong Bi
Shenghua Liu
Yiwei Wang
Lingrui Mei
Xueqi Cheng
KELM
41
9
0
30 Mar 2024
LUQ: Long-text Uncertainty Quantification for LLMs
Caiqi Zhang
Fangyu Liu
Marco Basaldella
Nigel Collier
HILM
50
24
0
29 Mar 2024
FACTOID: FACtual enTailment fOr hallucInation Detection
Vipula Rawte
S. M. Towhidul
Krishnav Rajbangshi
Shravani Nag
Aman Chadha
Amit P. Sheth
Amitava Das
HILM
28
3
0
28 Mar 2024
Rejection Improves Reliability: Training LLMs to Refuse Unknown Questions Using RL from Knowledge Feedback
Hongshen Xu
Zichen Zhu
Situo Zhang
Da Ma
Shuai Fan
Lu Chen
Kai Yu
HILM
29
32
0
27 Mar 2024