AlignScore: Evaluating Factual Consistency with a Unified Alignment Function

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

26 May 2023

Papers citing "AlignScore: Evaluating Factual Consistency with a Unified Alignment Function"

Benchmarking LLM Faithfulness in RAG with Evolving Leaderboards

392

07 May 2025

Towards Long Context Hallucination Detection

North American Chapter of the Association for Computational Linguistics (NAACL), 2025

288

28 Apr 2025

Evaluating Evaluation Metrics -- The Mirage of Hallucination Detection

Atharva Kulkarni

Yuan-kang Zhang

Joel Ruben Antony Moniz

379

25 Apr 2025

RefVNLI: Towards Scalable Evaluation of Subject-driven Text-to-image Generation

397

24 Apr 2025

Revisiting Uncertainty Quantification Evaluation in Language Models: Spurious Interactions with Response Length Bias Results

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

279

18 Apr 2025

Exploration of Plan-Guided Summarization for Narrative Texts: the Case of Small Language Models

297

12 Apr 2025

YaleNLP @ PerAnsSumm 2025: Multi-Perspective Integration via Mixture-of-Agents for Enhanced Healthcare QA Summarization

Dongsuk Jang

Alan Li

Arman Cohan

268

04 Apr 2025

WikiVideo: Article Generation from Multiple Videos

423

01 Apr 2025

TN-Eval: Rubric and Evaluation Protocols for Measuring the Quality of Behavioral Therapy Notes

312

26 Mar 2025

MAMM-Refine: A Recipe for Improving Faithfulness in Generation with Multi-Agent Collaboration

North American Chapter of the Association for Computational Linguistics (NAACL), 2025

275

19 Mar 2025

Unequal Opportunities: Examining the Bias in Geographical Recommendations by Large Language Models

International Conference on Intelligent User Interfaces (IUI), 2025

Shiran Dudy

Thulasi Tholeti

R. Ramachandranpillai

Muhammad Ali

Toby Jia-Jun Li

Ricardo Baeza-Yates

308

16 Mar 2025

Leveraging Retrieval Augmented Generative LLMs For Automated Metadata Description Generation to Enhance Data Catalogs

189

12 Mar 2025

Fine-Tuning LLMs for Report Summarization: Analysis on Supervised and Unsupervised Data

175

10 Mar 2025

Generate, Discriminate, Evolve: Enhancing Context Faithfulness via Fine-Grained Sentence-Level Self-Evolution

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

289

03 Mar 2025

Parameter-free Video Segmentation for Vision and Language Understanding

Louis Mahon

Mirella Lapata

VLM

273

03 Mar 2025

Evaluating LLMs' Assessment of Mixed-Context Hallucination Through the Lens of Summarization

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

314

03 Mar 2025

Towards Conditioning Clinical Text Generation for User Control

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

189

24 Feb 2025

Beyond Translation: LLM-Based Data Generation for Multilingual Fact-Checking

229

24 Feb 2025

GraphCheck: Breaking Long-Term Text Barriers with Extracted Knowledge Graph-Powered Fact-Checking

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

...

569

23 Feb 2025

Position: Beyond Assistance - Reimagining LLMs as Ethical and Adaptive Co-Creators in Mental Health Care

Abeer Badawi

Md Tahmid Rahman Laskar

242

21 Feb 2025

PeerQA: A Scientific Question Answering Dataset from Peer Reviews

North American Chapter of the Association for Computational Linguistics (NAACL), 2025

Tim Baumgärtner

Ted Briscoe

Iryna Gurevych

216

20 Feb 2025

SCOPE: A Self-supervised Framework for Improving Faithfulness in Conditional Text Generation

International Conference on Learning Representations (ICLR), 2025

265

20 Feb 2025

Investigating the Impact of Quantization Methods on the Safety and Reliability of Large Language Models

343

18 Feb 2025

Evaluating Step-by-step Reasoning Traces: A Survey

Jinu Lee

Anjali Narayan-Chen

LRM ELM

524

17 Feb 2025

Factual Inconsistency in Data-to-Text Generation Scales Exponentially with LLM Size: A Statistical Validation

303

17 Feb 2025

MIH-TCCT: Mitigating Inconsistent Hallucinations in LLMs via Event-Driven Text-Code Cyclic Training

408

13 Feb 2025

Context-Aware Hierarchical Merging for Long Document Summarization

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

Litu Ou

Mirella Lapata

MoMe

1.1K

03 Feb 2025

FactCG: Enhancing Fact Checkers with Graph-Based Multi-Hop Data

North American Chapter of the Association for Computational Linguistics (NAACL), 2025

263

28 Jan 2025

Beyond correlation: The Impact of Human Uncertainty in Measuring the Effectiveness of Automatic Evaluation and LLM-as-a-Judge

International Conference on Learning Representations (ICLR), 2024

262

28 Jan 2025

RELexED: Retrieval-Enhanced Legal Summarization with Exemplar Diversity

North American Chapter of the Association for Computational Linguistics (NAACL), 2025

236

23 Jan 2025

CoPERLex: Content Planning with Event-based Representations for Legal Case Summarization

North American Chapter of the Association for Computational Linguistics (NAACL), 2025

244

23 Jan 2025

Finer: Investigating and Enhancing Fine-Grained Visual Concept Recognition in Large Vision Language Models

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Jeonghwan Kim

Heng Ji

MLLM

278

08 Jan 2025

Lived Experience Not Found: LLMs Struggle to Align with Experts on Addressing Adverse Drug Reactions from Psychiatric Medication Use

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

Mohit Chandra

Siddharth Sriraman

Gaurav Verma

Harneet Singh Khanuja

338

08 Jan 2025

SummExecEdit: A Factual Consistency Benchmark in Summarization with Executable Edits

370

17 Dec 2024

QAPyramid: Fine-grained Evaluation of Content Selection for Text Summarization

354

10 Dec 2024

An Extensive Evaluation of Factual Consistency in Large Language Models for Data-to-Text Generation

Joy Mahapatra

Utpal Garain

HILM ALM

390

28 Nov 2024

Do Automatic Factuality Metrics Measure Factuality? A Critical Evaluation

S. Ramprasad

Byron C. Wallace

LLMAG HILM

628

25 Nov 2024

Bayesian Calibration of Win Rate Estimation with LLM Evaluators

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

245

07 Nov 2024

RAGulator: Lightweight Out-of-Context Detectors for Grounded Text Generation

269

06 Nov 2024

Summarization of Opinionated Political Documents with Varied Perspectives

International Conference on Computational Linguistics (COLING), 2024

Nicholas Deas

Kathleen McKeown

282

06 Nov 2024

OCEAN: Offline Chain-of-thought Evaluation and Alignment in Large Language Models

Junda Wu

...

Xiang Chen

229

31 Oct 2024

On Positional Bias of Faithfulness for Long-form Summarization

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

256

31 Oct 2024

Retrieval-Augmented Generation with Estimation of Source Reliability

462

30 Oct 2024

Are LLMs Better than Reported? Detecting Label Errors and Mitigating Their Effect on Model Performance

264

24 Oct 2024

Cross-Document Event-Keyed Summarization

177

18 Oct 2024

ScreenWriter: Automatic Screenplay Generation and Movie Summarisation

Louis Mahon

Mirella Lapata

215

17 Oct 2024

FaithBench: A Diverse Hallucination Benchmark for Summarization by Modern LLMs

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

Renyi Qu

...

243

17 Oct 2024

Decomposition Dilemmas: Does Claim Decomposition Boost or Burden Fact-Checking Performance?

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

Qisheng Hu

Quanyu Long

Wenya Wang

933

17 Oct 2024

A Little Human Data Goes A Long Way

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

Dhananjay Ashok

Jonathan May

SyDa

528

17 Oct 2024

Unlocking Legal Knowledge: A Multilingual Dataset for Judicial Summarization in Switzerland

Luca Rolshoven

Vishvaksenan Rasiah

Srinanda Brügger Bose

273

17 Oct 2024