FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

23 May 2023

Pang Wei Koh

Luke Zettlemoyer

Papers citing "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"

50 / 615 papers shown

BookWorm: A Dataset for Character Description and Analysis

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Argyrios Papoudakis

Mirella Lapata

Frank Keller

195

14 Oct 2024

Can Knowledge Graphs Make Large Language Models More Trustworthy? An Empirical Study Over Open-ended Question Answering

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

497

10 Oct 2024

ReIFE: Re-evaluating Instruction-Following Evaluation

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

Yixin Liu

Chien-Sheng Wu

Shafiq Joty

Arman Cohan

215

09 Oct 2024

LLM Self-Correction with DeCRIM: Decompose, Critique, and Refine for Enhanced Following of Instructions with Multiple Constraints

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Thomas Palmeira Ferraz

Nanyun Peng

295

09 Oct 2024

Uncovering Factor Level Preferences to Improve Human-Model Alignment

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

374

09 Oct 2024

ReFIR: Grounding Large Restoration Models with Retrieval Augmentation

Neural Information Processing Systems (NeurIPS), 2024

Taolin Zhang

Bin Chen

208

08 Oct 2024

Why am I seeing this: Democratizing End User Auditing for Online Content Recommendations

ACM Symposium on User Interface Software and Technology (UIST), 2024

Luke Cao

Toby Jia-jun Li

249

07 Oct 2024

Realizing Video Summarization from the Path of Language-based Semantic Understanding

Kuan-Chen Mu

Zhi-Yi Chin

Wei-Chen Chiu

169

06 Oct 2024

Locating Information Gaps and Narrative Inconsistencies Across Languages: A Case Study of LGBT People Portrayals on Wikipedia

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Farhan Samir

Chan Young Park

Anjalie Field

Vered Shwartz

Yulia Tsvetkov

172

05 Oct 2024

CS4: Measuring the Creativity of Large Language Models Automatically by Controlling the Number of Story-Writing Constraints

Anirudh Atmakuru

Jatin Nainani

Rohith Siddhartha Reddy Bheemreddy

367

05 Oct 2024

ECon: On the Detection and Resolution of Evidence Conflicts

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Tengxiao Liu

Yangqiu Song

Yue Zhang

Pengfei Liu

Zheng Zhang

260

05 Oct 2024

FactCheckmate: Preemptively Detecting and Mitigating Hallucinations in LMs

367

03 Oct 2024

Loki: An Open-Source Tool for Fact Verification

International Conference on Computational Linguistics (COLING), 2024

Haonan Li

Yuxia Wang

Preslav Nakov

619

02 Oct 2024

Open-RAG: Enhanced Retrieval-Augmented Reasoning with Open-Source Large Language Models

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Shayekh Bin Islam

Md Asib Rahman

K S M Tozammel Hossain

Enamul Hoque

Shafiq Joty

Md. Rizwan Parvez

RALM AIFin LRM VLM

207

02 Oct 2024

FactAlign: Long-form Factuality Alignment of Large Language Models

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Chao-Wei Huang

Yun-Nung Chen

HILM

142

02 Oct 2024

FaithEval: Can Your Language Model Stay Faithful to Context, Even If "The Moon is Made of Marshmallows"

International Conference on Learning Representations (ICLR), 2024

619

30 Sep 2024

CoTKR: Chain-of-Thought Enhanced Knowledge Rewriting for Complex Knowledge Graph Question Answering

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

388

29 Sep 2024

Model-based Preference Optimization in Abstractive Summarization without Human Feedback

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

426

27 Sep 2024

HaloScope: Harnessing Unlabeled LLM Generations for Hallucination Detection

Neural Information Processing Systems (NeurIPS), 2024

Xuefeng Du

Chaowei Xiao

Yixuan Li

HILM

254

26 Sep 2024

Enhancing Post-Hoc Attributions in Long Document Comprehension via Coarse Grained Answer Decomposition

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Pritika Ramu

Koustava Goswami

Apoorv Saxena

Balaji Vasan Srinivasan

289

25 Sep 2024

LINKAGE: Listwise Ranking among Varied-Quality References for Non-Factoid QA Evaluation via LLMs

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Jiafeng Guo

261

23 Sep 2024

The Ability of Large Language Models to Evaluate Constraint-satisfaction in Agent Responses to Open-ended Requests

157

22 Sep 2024

The Factuality of Large Language Models in the Legal Domain

International Conference on Information and Knowledge Management (CIKM), 2024

Rajaa El Hamdani

Thomas Bonald

Fragkiskos D. Malliaros

Nils Holzenberger

Fabian M. Suchanek

AILaw HILM

260

18 Sep 2024

LLM-as-a-Judge & Reward Model: What They Can and Cannot Do

336

17 Sep 2024

HALO: Hallucination Analysis and Learning Optimization to Empower LLMs with Retrieval-Augmented Context for Guided Clinical Decision Making

IEEE/ACM International Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE), 2024

451

16 Sep 2024

Gaps or Hallucinations? Gazing into Machine-Generated Legal Analysis for Fine-grained Text Evaluations

244

16 Sep 2024

NovAScore: A New Automated Metric for Evaluating Document Level Novelty

International Conference on Computational Linguistics (COLING), 2024

Lin Ai

Ziwei Gong

Harshsaiprasad Deshpande

Alexander Johnson

Emmy Phung

Ahmad Emami

Julia Hirschberg

154

14 Sep 2024

When Context Leads but Parametric Memory Follows in Large Language Models

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

355

13 Sep 2024

AI-LieDar: Examine the Trade-off Between Utility and Truthfulness in LLM Agents

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

393

13 Sep 2024

Synthetic continued pretraining

International Conference on Learning Representations (ICLR), 2024

348

11 Sep 2024

GroUSE: A Benchmark to Evaluate Evaluators in Grounded Question Answering

International Conference on Computational Linguistics (COLING), 2024

423

10 Sep 2024

Enhancing Temporal Understanding in Audio Question Answering for Large Audio Language Models

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

300

10 Sep 2024

What is the Role of Small Models in the LLM Era: A Survey

Lihu Chen

Gaël Varoquaux

ALM

777

10 Sep 2024

Hallucination Detection in LLMs: Fast and Memory-Efficient Finetuned Models

Gabriel Y. Arteaga

Thomas B. Schön

Nicolas Pielawski

331

04 Sep 2024

Generating Media Background Checks for Automated Source Critical Reasoning

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Michael Schlichtkrull

240

01 Sep 2024

ContextCite: Attributing Model Generation to Context

Neural Information Processing Systems (NeurIPS), 2024

Aleksander Madry

356

01 Sep 2024

LoraMap: Harnessing the Power of LoRA Connections

204

29 Aug 2024

Measuring text summarization factuality using atomic facts entailment metrics in the context of retrieval augmented generation

N. E. Kriman

HILM

202

27 Aug 2024

What Makes a Good Story and How Can We Measure It? A Comprehensive Survey of Story Evaluation

Dingyi Yang

Qin Jin

407

26 Aug 2024

Claim Verification in the Age of Large Language Models: A Survey

593

26 Aug 2024

SLM Meets LLM: Balancing Latency, Interpretability and Consistency in Hallucination Detection

170

22 Aug 2024

RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Xuanwang Zhang

Yunze Song

Yidong Wang

...

Yue Zhang

Shikun Zhang

Qingsong Wen

325

21 Aug 2024

Analysis of Plan-based Retrieval for Grounded Text Generation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

264

20 Aug 2024

Unconditional Truthfulness: Learning Unconditional Uncertainty of Large Language Models

245

20 Aug 2024

Web Retrieval Agents for Evidence-Based Misinformation Detection

Hao Yu

Zachary Yang

Jean-Francois Godbout

Reihaneh Rabbany

Kellin Pelrine

LLMAG OffRL

262

15 Aug 2024

Zero-shot Factual Consistency Evaluation Across Domains

Raunak Agarwal

HILM

322

07 Aug 2024

DebateQA: Evaluating Question Answering on Debatable Knowledge

223

02 Aug 2024

Misinforming LLMs: vulnerabilities, challenges and opportunities

Jaroslaw Kornowicz

Daniel Geissler

Kirsten Thommes

138

02 Aug 2024

A Course Shared Task on Evaluating LLM Output for Clinical Questions

Doan Nam Long Vu

164

31 Jul 2024

CLR-Fact: Evaluating the Complex Logical Reasoning Capability of Large Language Models over Factual Knowledge

263

30 Jul 2024