FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

23 May 2023

Pang Wei Koh

Luke Zettlemoyer

Papers citing "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"

50 / 615 papers shown

LLM-Rubric: A Multidimensional, Calibrated Approach to Automated Evaluation of Natural Language Texts

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

469

03 Jan 2025

Evaluate Summarization in Fine-Granularity: Auto Evaluation with LLM

Sree Prasanna Rajagopal

224

31 Dec 2024

ComparisonQA: Evaluating Factuality Robustness of LLMs Through Knowledge Frequency Control and Uncertainty

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

388

28 Dec 2024

A Survey of Calibration Process for Black-Box LLMs

346

17 Dec 2024

Attention with Dependency Parsing Augmentation for Fine-Grained Attribution

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

289

16 Dec 2024

Coverage-based Fairness in Multi-document Summarization

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

410

11 Dec 2024

HalluCana: Fixing LLM Hallucination with A Canary Lookahead

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

311

10 Dec 2024

LLM-as-an-Interviewer: Beyond Static Testing Through Dynamic LLM Evaluation

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

518

10 Dec 2024

QAPyramid: Fine-grained Evaluation of Content Selection for Text Summarization

354

10 Dec 2024

Enhancing Trust in Large Language Models with Uncertainty-Aware Fine-Tuning

292

03 Dec 2024

A 2-step Framework for Automated Literary Translation Evaluation: Its Promises and Pitfalls

601

02 Dec 2024

FactCheXcker: Mitigating Measurement Hallucinations in Chest X-ray Report Generation Models

Computer Vision and Pattern Recognition (CVPR), 2024

657

27 Nov 2024

From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge

...

1.1K

287

25 Nov 2024

Investigating Factuality in Long-Form Text Generation: The Roles of Self-Known and Self-Unknown

313

24 Nov 2024

Probing LLM Hallucination from Within: Perturbation-Driven Approach via Internal Knowledge

462

14 Nov 2024

Beyond the Safety Bundle: Auditing the Helpful and Harmless Dataset

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

573

12 Nov 2024

FactLens: Benchmarking Fine-Grained Fact Verification

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

576

08 Nov 2024

Measuring short-form factuality in large language models

263

214

07 Nov 2024

Culinary Class Wars: Evaluating LLMs using ASH in Cuisine Transfer Task

160

04 Nov 2024

Rate, Explain and Cite (REC): Enhanced Explanation and Attribution in Automatic Evaluation by Large Language Models

...

Regunathan Radhakrishnan

692

03 Nov 2024

Human-inspired Perspectives: A Survey on AI Long-term Memory

585

01 Nov 2024

The Automated Verification of Textual Claims (AVeriTeC) Shared Task

Michael Schlichtkrull

...

Christos Christodoulopoulos

240

31 Oct 2024

Improving Uncertainty Quantification in Large Language Models via Semantic Embeddings

237

30 Oct 2024

Unified Triplet-Level Hallucination Evaluation for Large Vision-Language Models

716

30 Oct 2024

FactBench: A Dynamic Benchmark for In-the-Wild Language Model Factuality Evaluation

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

265

29 Oct 2024

LongReward: Improving Long-context Large Language Models with AI Feedback

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

Juanzi Li

208

28 Oct 2024

Graph-based Uncertainty Metrics for Long-form Language Model Outputs

207

28 Oct 2024

Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning

International Conference on Learning Representations (ICLR), 2024

263

25 Oct 2024

ChunkRAG: Novel LLM-Chunk Filtering Method for RAG Systems

Ishneet Sukhvinder Singh

Ritvik Aggarwal

Ibrahim Allahverdiyev

938

25 Oct 2024

Improving Model Factuality with Fine-grained Critique-based Evaluator

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

...

515

24 Oct 2024

Multilingual Hallucination Gaps in Large Language Models

146

23 Oct 2024

Leveraging the Domain Adaptation of Retrieval Augmented Generation Models for Question Answering and Reducing Hallucination

211

23 Oct 2024

Enhancing Answer Attribution for Faithful Text Generation with Large Language Models

International Conference on Knowledge Discovery and Information Retrieval (KDIR), 2024

Juraj Vladika

Luca Mülln

Florian Matthes

219

22 Oct 2024

Trustworthy Alignment of Retrieval-Augmented Large Language Models via Reinforcement Learning

International Conference on Machine Learning (ICML), 2024

176

22 Oct 2024

Self-Explained Keywords Empower Large Language Models for Code Generation

Lishui Fan

Mouxiang Chen

Zhongxin Liu

291

21 Oct 2024

RAC: Efficient LLM Factuality Correction with Retrieval Augmentation

Changmao Li

Jeffrey Flanigan

KELM LRM

256

21 Oct 2024

Hallucination Detox: Sensitivity Dropout (SenD) for Large Language Model Training

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

Shahrad Mohammadzadeh

398

20 Oct 2024

BRIEF: Bridging Retrieval and Inference for Multi-hop Reasoning via Compression

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

316

20 Oct 2024

Cross-Document Event-Keyed Summarization

177

18 Oct 2024

Tell me what I need to know: Exploring LLM-based (Personalized) Abstractive Multi-Source Meeting Summarization

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

125

18 Oct 2024

LoGU: Long-form Generation with Uncertainty Expressions

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

588

18 Oct 2024

FIRE: Fact-checking with Iterative Retrieval and Verification

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

454

17 Oct 2024

Cross-Lingual Auto Evaluation for Assessing Multilingual LLMs

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

Sumanth Doddapaneni

Mohammed Safi Ur Rahman Khan

375

17 Oct 2024

From Single to Multi: How LLMs Hallucinate in Multi-Document Summarization

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

349

17 Oct 2024

Probing-RAG: Self-Probing to Guide Language Models in Selective Document Retrieval

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

401

17 Oct 2024

Decomposition Dilemmas: Does Claim Decomposition Boost or Burden Fact-Checking Performance?

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

Qisheng Hu

Quanyu Long

Wenya Wang

933

17 Oct 2024

A Claim Decomposition Benchmark for Long-form Answer Verification

China Conference on Information Retrieval (CIR), 2024

205

16 Oct 2024

Auto-PRE: An Automatic and Cost-Efficient Peer-Review Framework for Language Generation Evaluation

...

191

16 Oct 2024

Varying Shades of Wrong: Aligning LLMs with Wrong Answers Only

International Conference on Learning Representations (ICLR), 2024

236

14 Oct 2024

Medico: Towards Hallucination Detection and Correction with Multi-source Evidence Fusion

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Xinping Zhao

Jifang Wang

Min Zhang

161

14 Oct 2024