2305.14251
FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
23 May 2023
Sewon Min
Kalpesh Krishna
Xinxi Lyu
M. Lewis
Anuj Kumar
Pang Wei Koh
Mohit Iyyer
Luke Zettlemoyer
Hannaneh Hajishirzi
HILM
ALM
Papers citing
"FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"
50 / 615 papers shown
LLM-Rubric: A Multidimensional, Calibrated Approach to Automated Evaluation of Natural Language Texts
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Helia Hashemi
J. Eisner
Corby Rosset
Benjamin Van Durme
Chris Kedzie
426
35
0
03 Jan 2025
Evaluate Summarization in Fine-Granularity: Auto Evaluation with LLM
Dong Yuan
Eti Rastogi
Fen Zhao
Sagar Goyal
Gautam Naik
Sree Prasanna Rajagopal
194
2
0
31 Dec 2024
ComparisonQA: Evaluating Factuality Robustness of LLMs Through Knowledge Frequency Control and Uncertainty
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Qing Zong
Zhaoxiang Wang
Tianshi Zheng
Xiyu Ren
Yangqiu Song
377
11
0
28 Dec 2024
A Survey of Calibration Process for Black-Box LLMs
Liangru Xie
Hui Liu
Jingying Zeng
Xianfeng Tang
Yan Han
Chen Luo
Jing Huang
Zhen Li
Suhang Wang
Qi He
346
8
0
17 Dec 2024
Attention with Dependency Parsing Augmentation for Fine-Grained Attribution
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Qiang Ding
Lvzhou Luo
Yixuan Cao
Ping Luo
284
4
0
16 Dec 2024
Coverage-based Fairness in Multi-document Summarization
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Haoyuan Li
Yusen Zhang
Rui Zhang
Snigdha Chaturvedi
397
2
0
11 Dec 2024
HalluCana: Fixing LLM Hallucination with A Canary Lookahead
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Tianyi Li
Erenay Dayanik
Shubhi Tyagi
Andrea Pierleoni
HILM
285
1
0
10 Dec 2024
LLM-as-an-Interviewer: Beyond Static Testing Through Dynamic LLM Evaluation
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Eunsu Kim
Juyoung Suk
Seungone Kim
Niklas Muennighoff
Dongkwan Kim
Alice Oh
ELM
502
9
0
10 Dec 2024
QAPyramid: Fine-grained Evaluation of Content Selection for Text Summarization
Shiyue Zhang
David Wan
Arie Cattan
Ayal Klein
Ido Dagan
Joey Tianyi Zhou
354
4
0
10 Dec 2024
Enhancing Trust in Large Language Models with Uncertainty-Aware Fine-Tuning
R. Krishnan
Piyush Khanna
Omesh Tickoo
HILM
288
6
0
03 Dec 2024
A 2-step Framework for Automated Literary Translation Evaluation: Its Promises and Pitfalls
Sheikh Shafayat
Dongkeun Yoon
Woori Jang
Jiwoo Choi
Alice Oh
Seohyon Jung
573
1
0
02 Dec 2024
FactCheXcker: Mitigating Measurement Hallucinations in Chest X-ray Report Generation Models
Computer Vision and Pattern Recognition (CVPR), 2024
Alice Heiman
Xiaoman Zhang
E. Chen
Sung Eun Kim
Pranav Rajpurkar
HILM
MedIm
642
5
0
27 Nov 2024
From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge
Dawei Li
Bohan Jiang
Liangjie Huang
Alimohammad Beigi
Chengshuai Zhao
...
Canyu Chen
Tianhao Wu
Kai Shu
Lu Cheng
Huan Liu
ELM
AILaw
1.1K
277
0
25 Nov 2024
Investigating Factuality in Long-Form Text Generation: The Roles of Self-Known and Self-Unknown
Lifu Tu
Rui Meng
Shafiq Joty
Yingbo Zhou
Semih Yavuz
HILM
300
2
0
24 Nov 2024
Probing LLM Hallucination from Within: Perturbation-Driven Approach via Internal Knowledge
Seongmin Lee
Hsiang Hsu
Chun-Fu Chen
Duen Horng
LRM
430
2
0
14 Nov 2024
Beyond the Safety Bundle: Auditing the Helpful and Harmless Dataset
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Khaoula Chehbouni
Jonathan Colaço-Carr
Yash More
Jackie CK Cheung
G. Farnadi
548
7
0
12 Nov 2024
FactLens: Benchmarking Fine-Grained Fact Verification
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Kushan Mitra
Dan Zhang
Sajjadur Rahman
Estevam R. Hruschka
HILM
569
5
0
08 Nov 2024
Measuring short-form factuality in large language models
Jason W. Wei
Karina Nguyen
Hyung Won Chung
Yunxin Joy Jiao
Spencer Papay
Amelia Glaese
John Schulman
W. Fedus
ELM
KELM
HILM
256
210
0
07 Nov 2024
Culinary Class Wars: Evaluating LLMs using ASH in Cuisine Transfer Task
Hoonick Lee
Mogan Gim
Donghyeon Park
Donghee Choi
Jaewoo Kang
152
0
0
04 Nov 2024
Rate, Explain and Cite (REC): Enhanced Explanation and Attribution in Automatic Evaluation by Large Language Models
Aliyah R. Hsu
James Zhu
Zhichao Wang
Bin Bi
Shubham Mehrotra
...
Sougata Chaudhuri
Regunathan Radhakrishnan
S. Asur
Claire Na Cheng
Bin Yu
ALM
LRM
679
1
0
03 Nov 2024
Human-inspired Perspectives: A Survey on AI Long-term Memory
Zihong He
Fan Zhang
Hao Zheng
Fan Zhang
Matt Jones
Laurence Aitchison
X. Xu
Xinyi Zheng
Miao Liu
Junxiao Shen
561
7
0
01 Nov 2024
The Automated Verification of Textual Claims (AVeriTeC) Shared Task
Michael Schlichtkrull
Yulong Chen
Chenxi Whitehouse
Zhenyun Deng
Mubashara Akhtar
...
Christos Christodoulopoulos
O. Cocarascu
Arpit Mittal
James Thorne
Andreas Vlachos
236
22
0
31 Oct 2024
Improving Uncertainty Quantification in Large Language Models via Semantic Embeddings
Yashvir S. Grewal
Edwin V. Bonilla
Thang D. Bui
UQCV
237
18
0
30 Oct 2024
Unified Triplet-Level Hallucination Evaluation for Large Vision-Language Models
J. Wu
Tsz Ting Chung
Kai Chen
Dit-Yan Yeung
LRM
VLM
699
6
0
30 Oct 2024
FactBench: A Dynamic Benchmark for In-the-Wild Language Model Factuality Evaluation
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Farima Fatahi Bayat
Lechen Zhang
Sheza Munir
Lu Wang
HILM
256
17
0
29 Oct 2024
LongReward: Improving Long-context Large Language Models with AI Feedback
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Jing Zhang
Zhongni Hou
Xin Lv
S. Cao
Zhenyu Hou
Yilin Niu
Lei Hou
Yuxiao Dong
Ling Feng
Juanzi Li
OffRL
LRM
202
20
0
28 Oct 2024
Graph-based Uncertainty Metrics for Long-form Language Model Outputs
Mingjian Jiang
Yangjun Ruan
Prasanna Sattigeri
Salim Roukos
Tatsunori Hashimoto
207
11
0
28 Oct 2024
Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning
International Conference on Learning Representations (ICLR), 2024
Yujian Liu
Shiyu Chang
Tommi Jaakkola
Yang Zhang
255
4
0
25 Oct 2024
ChunkRAG: Novel LLM-Chunk Filtering Method for RAG Systems
Ishneet Sukhvinder Singh
Ritvik Aggarwal
Ibrahim Allahverdiyev
Muhammad Taha
Aslihan Akalin
Kevin Zhu
Sean O'Brien
914
22
0
25 Oct 2024
Improving Model Factuality with Fine-grained Critique-based Evaluator
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Yiqing Xie
Wenxuan Zhou
Pradyot Prakash
Di Jin
Yuning Mao
...
Sinong Wang
Han Fang
Carolyn Rose
Daniel Fried
Hejia Zhang
HILM
491
12
0
24 Oct 2024
Multilingual Hallucination Gaps in Large Language Models
Cléa Chataigner
Afaf Taik
G. Farnadi
HILM
LRM
130
6
0
23 Oct 2024
Leveraging the Domain Adaptation of Retrieval Augmented Generation Models for Question Answering and Reducing Hallucination
Salman Rakin
Md. A. R. Shibly
Zahin M. Hossain
Zeeshan Khan
Md. Mostofa Akbar
198
5
0
23 Oct 2024
Enhancing Answer Attribution for Faithful Text Generation with Large Language Models
International Conference on Knowledge Discovery and Information Retrieval (KDIR), 2024
Juraj Vladika
Luca Mülln
Florian Matthes
215
0
0
22 Oct 2024
Trustworthy Alignment of Retrieval-Augmented Large Language Models via Reinforcement Learning
International Conference on Machine Learning (ICML), 2024
Zongmeng Zhang
Yufeng Shi
Jinhua Zhu
Wengang Zhou
Xiang Qi
Peng Zhang
Haoyang Li
RALM
HILM
176
2
0
22 Oct 2024
Self-Explained Keywords Empower Large Language Models for Code Generation
Lishui Fan
Mouxiang Chen
Zhongxin Liu
286
2
0
21 Oct 2024
RAC: Efficient LLM Factuality Correction with Retrieval Augmentation
Changmao Li
Jeffrey Flanigan
KELM
LRM
240
5
0
21 Oct 2024
Hallucination Detox: Sensitivity Dropout (SenD) for Large Language Model Training
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Shahrad Mohammadzadeh
Juan D. Guerra
Marco Bonizzato
Reihaneh Rabbany
Golnoosh Farnadi
HILM
382
2
0
20 Oct 2024
BRIEF: Bridging Retrieval and Inference for Multi-hop Reasoning via Compression
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Yuankai Li
Jia-Chen Gu
Di Wu
Kai-Wei Chang
Nanyun Peng
RALM
MQ
308
1
0
20 Oct 2024
Cross-Document Event-Keyed Summarization
William Walden
Pavlo Kuchmiichuk
Alexander Martin
Chihsheng Jin
Angela Cao
Claire Sun
Curisia Allen
Aaron Steven White
RALM
164
0
0
18 Oct 2024
Tell me what I need to know: Exploring LLM-based (Personalized) Abstractive Multi-Source Meeting Summarization
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Frederic Kirstein
Terry Ruas
Robert Kratel
Bela Gipp
124
5
0
18 Oct 2024
LoGU: Long-form Generation with Uncertainty Expressions
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Ruihan Yang
Caiqi Zhang
Zhisong Zhang
Xinting Huang
Sen Yang
Nigel Collier
Dong Yu
Deqing Yang
HILM
583
17
0
18 Oct 2024
FIRE: Fact-checking with Iterative Retrieval and Verification
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Zhuohan Xie
Daniil Vasilev
Yuxia Wang
Fauzan Farooqui
Hasan Iqbal
Dhruv Sahnan
Iryna Gurevych
Preslav Nakov
HILM
452
21
0
17 Oct 2024
Cross-Lingual Auto Evaluation for Assessing Multilingual LLMs
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Sumanth Doddapaneni
Mohammed Safi Ur Rahman Khan
Dilip Venkatesh
Mary Dabre
Anoop Kunchukuttan
Mitesh Khapra
ELM
370
7
0
17 Oct 2024
From Single to Multi: How LLMs Hallucinate in Multi-Document Summarization
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Catarina G. Belem
Pouya Pezeshkpour
Hayate Iso
Seiji Maekawa
Nikita Bhutani
Estevam R. Hruschka
HILM
333
11
0
17 Oct 2024
Probing-RAG: Self-Probing to Guide Language Models in Selective Document Retrieval
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Ingeol Baek
Hwan Chang
Byeongjeong Kim
Jimin Lee
Hwanhee Lee
RALM
395
14
0
17 Oct 2024
Decomposition Dilemmas: Does Claim Decomposition Boost or Burden Fact-Checking Performance?
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Qisheng Hu
Quanyu Long
Wenya Wang
916
21
0
17 Oct 2024
A Claim Decomposition Benchmark for Long-form Answer Verification
China Conference on Information Retrieval (CIR), 2024
Zhihao Zhang
Yixing Fan
Ruqing Zhang
Jiafeng Guo
HILM
188
0
0
16 Oct 2024
Auto-PRE: An Automatic and Cost-Efficient Peer-Review Framework for Language Generation Evaluation
Junjie Chen
Weihang Su
Zhumin Chu
Haitao Li
Qinyao Ai
...
Jun Zhou
Y. Liu
Min Zhang
Shaoping Ma
Qingyao Ai
169
5
0
16 Oct 2024
Varying Shades of Wrong: Aligning LLMs with Wrong Answers Only
International Conference on Learning Representations (ICLR), 2024
Jihan Yao
Wenxuan Ding
Shangbin Feng
Lucy Lu Wang
Yulia Tsvetkov
229
4
0
14 Oct 2024
Medico: Towards Hallucination Detection and Correction with Multi-source Evidence Fusion
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Xinping Zhao
Jindi Yu
Zhenyu Liu
Jifang Wang
Dongfang Li
Yibin Chen
Baotian Hu
Min Zhang
HILM
149
3
0
14 Oct 2024