arXiv: 2305.16739
AlignScore: Evaluating Factual Consistency with a Unified Alignment Function
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
26 May 2023
Yuheng Zha
Yichi Yang
Ruichen Li
Zhiting Hu
HILM
Papers citing
"AlignScore: Evaluating Factual Consistency with a Unified Alignment Function"
50 / 182 papers shown
Benchmarking LLM Faithfulness in RAG with Evolving Leaderboards
Manveer Singh Tamber
F. S. Bao
Chenyu Xu
Ge Luo
Suleman Kazi
Minseok Bae
Miaoran Li
Ofer Mendelevitch
Renyi Qu
Jimmy J. Lin
VLM
392
8
0
07 May 2025
Towards Long Context Hallucination Detection
North American Chapter of the Association for Computational Linguistics (NAACL), 2025
Siyi Liu
Kishaloy Halder
Zheng Qi
Wei Xiao
Nikolaos Pappas
Phu Mon Htut
Neha Anna John
Yassine Benajiba
Dan Roth
HILM
288
12
0
28 Apr 2025
Evaluating Evaluation Metrics -- The Mirage of Hallucination Detection
Atharva Kulkarni
Yuan-kang Zhang
Joel Ruben Antony Moniz
Xiou Ge
Bo-Hsiang Tseng
Dhivya Piraviperumal
Siyang Song
Hong-ye Yu
HILM
379
5
0
25 Apr 2025
RefVNLI: Towards Scalable Evaluation of Subject-driven Text-to-image Generation
Aviv Slobodkin
Hagai Taitelbaum
Yonatan Bitton
Brian Gordon
Michal Sokolik
Nitzan Bitton-Guetta
Almog Gueta
Royi Rassin
Itay Laish
Dani Lischinski
EGVM
VGen
397
1
0
24 Apr 2025
Revisiting Uncertainty Quantification Evaluation in Language Models: Spurious Interactions with Response Length Bias Results
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Andrea Santilli
Adam Goliński
Michael Kirchhof
Federico Danieli
Arno Blaas
Miao Xiong
Luca Zappella
Sinead Williamson
279
9
0
18 Apr 2025
Exploration of Plan-Guided Summarization for Narrative Texts: the Case of Small Language Models
Matt Grenander
Siddharth Varia
Paula Czarnowska
Yogarshi Vyas
Kishaloy Halder
Bonan Min
HILM
297
1
0
12 Apr 2025
YaleNLP @ PerAnsSumm 2025: Multi-Perspective Integration via Mixture-of-Agents for Enhanced Healthcare QA Summarization
Dongsuk Jang
Alan Li
Arman Cohan
268
1
0
04 Apr 2025
WikiVideo: Article Generation from Multiple Videos
Alexander Martin
Reno Kriz
William Walden
Kate Sanders
Hannah Recknor
Eugene Yang
Francis Ferraro
Benjamin Van Durme
DiffM
VGen
423
3
0
01 Apr 2025
TN-Eval: Rubric and Evaluation Protocols for Measuring the Quality of Behavioral Therapy Notes
Raj Sanjay Shah
Lei Xu
Qianchu Liu
Jon Burnsky
Drew Bertagnolli
Chaitanya P. Shivade
LM&MA
312
1
0
26 Mar 2025
MAMM-Refine: A Recipe for Improving Faithfulness in Generation with Multi-Agent Collaboration
North American Chapter of the Association for Computational Linguistics (NAACL), 2025
David Wan
Justin Chih-Yao Chen
Elias Stengel-Eskin
Joey Tianyi Zhou
LLMAG
LRM
275
5
0
19 Mar 2025
Unequal Opportunities: Examining the Bias in Geographical Recommendations by Large Language Models
International Conference on Intelligent User Interfaces (IUI), 2025
Shiran Dudy
Thulasi Tholeti
R. Ramachandranpillai
Muhammad Ali
Toby Jia-Jun Li
Ricardo Baeza-Yates
308
8
0
16 Mar 2025
Leveraging Retrieval Augmented Generative LLMs For Automated Metadata Description Generation to Enhance Data Catalogs
Mayank Singh
Abhijeet Kumar
Sasidhar Donaparthi
Gayatri Karambelkar
189
4
0
12 Mar 2025
Fine-Tuning LLMs for Report Summarization: Analysis on Supervised and Unsupervised Data
Swati Rallapalli
Shannon Gallagher
Andrew O. Mellinger
Jasmine Ratchford
Anusha Sinha
Tyler Brooks
William R. Nichols
Nick Winski
Bryan Brown
175
1
0
10 Mar 2025
Generate, Discriminate, Evolve: Enhancing Context Faithfulness via Fine-Grained Sentence-Level Self-Evolution
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Keliang Li
Tianhua Zhang
Yunxiang Li
Hongyin Luo
Abdalla Moustafa
Xixin Wu
James Glass
Helen Meng
289
6
0
03 Mar 2025
Parameter-free Video Segmentation for Vision and Language Understanding
Louis Mahon
Mirella Lapata
VLM
273
4
0
03 Mar 2025
Evaluating LLMs' Assessment of Mixed-Context Hallucination Through the Lens of Summarization
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Siya Qi
Rui Cao
Petr Slovak
Zheng Yuan
HILM
314
2
0
03 Mar 2025
Towards Conditioning Clinical Text Generation for User Control
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Osman Alperen Koras
Rabi Bahnan
Jens Kleesiek
Amin Dada
189
1
0
24 Feb 2025
Beyond Translation: LLM-Based Data Generation for Multilingual Fact-Checking
Yi-Ling Chung
Aurora Cobo
Pablo Serna
SyDa
HILM
229
6
0
24 Feb 2025
GraphCheck: Breaking Long-Term Text Barriers with Extracted Knowledge Graph-Powered Fact-Checking
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Yingjian Chen
Haoran Liu
Yinhong Liu
Rui Yang
Han Yuan
...
Pengyuan Zhou
Qingyu Chen
James Caverlee
Irene Li
HILM
569
7
0
23 Feb 2025
Position: Beyond Assistance - Reimagining LLMs as Ethical and Adaptive Co-Creators in Mental Health Care
Abeer Badawi
Md Tahmid Rahman Laskar
J. Huang
Shaina Raza
Elham Dolatabadi
AI4MH
242
0
0
21 Feb 2025
PeerQA: A Scientific Question Answering Dataset from Peer Reviews
North American Chapter of the Association for Computational Linguistics (NAACL), 2025
Tim Baumgärtner
Ted Briscoe
Iryna Gurevych
216
7
0
20 Feb 2025
SCOPE: A Self-supervised Framework for Improving Faithfulness in Conditional Text Generation
International Conference on Learning Representations (ICLR), 2025
Song Duong
Florian Le Bronnec
Alexandre Allauzen
Vincent Guigue
Alberto Lumbreras
Laure Soulier
Patrick Gallinari
HILM
265
3
0
20 Feb 2025
Investigating the Impact of Quantization Methods on the Safety and Reliability of Large Language Models
Artyom Kharinaev
Viktor Moskvoretskii
Egor Shvetsov
Kseniia Studenikina
Bykov Mikhail
Evgeny Burnaev
MQ
343
5
0
18 Feb 2025
Evaluating Step-by-step Reasoning Traces: A Survey
Jinu Lee
Anjali Narayan-Chen
LRM
ELM
524
19
0
17 Feb 2025
Factual Inconsistency in Data-to-Text Generation Scales Exponentially with LLM Size: A Statistical Validation
Joy Mahapatra
Soumyajit Roy
Utpal Garain
HILM
ALM
303
0
0
17 Feb 2025
MIH-TCCT: Mitigating Inconsistent Hallucinations in LLMs via Event-Driven Text-Code Cyclic Training
Xinxin You
Xien Liu
Qixin Sun
Huan Zhang
Kaiyin Zhou
Shaohui Liu
Guoping Hu
Shijin Wang
Si Liu
Ji Wu
408
0
0
13 Feb 2025
Context-Aware Hierarchical Merging for Long Document Summarization
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Litu Ou
Mirella Lapata
MoMe
1.1K
3
0
03 Feb 2025
FactCG: Enhancing Fact Checkers with Graph-Based Multi-Hop Data
North American Chapter of the Association for Computational Linguistics (NAACL), 2025
Deren Lei
Yaxi Li
Siyao Li
Mengya Hu
Rui Xu
Ken Archer
Mingyu Wang
Emily Ching
Alex Deng
SyDa
HILM
LRM
263
6
0
28 Jan 2025
Beyond correlation: The Impact of Human Uncertainty in Measuring the Effectiveness of Automatic Evaluation and LLM-as-a-Judge
International Conference on Learning Representations (ICLR), 2024
Aparna Elangovan
Jongwoo Ko
Lei Xu
Mahsa Elyasi
Ling Liu
S. Bodapati
Dan Roth
262
21
0
28 Jan 2025
RELexED: Retrieval-Enhanced Legal Summarization with Exemplar Diversity
North American Chapter of the Association for Computational Linguistics (NAACL), 2025
T. Y. S. S. Santosh
Chen Jia
Patrick Goroncy
Matthias Grabmair
AILaw
236
3
0
23 Jan 2025
CoPERLex: Content Planning with Event-based Representations for Legal Case Summarization
North American Chapter of the Association for Computational Linguistics (NAACL), 2025
T. Y. S. S. Santosh
Youssef Farag
Matthias Grabmair
AILaw
ELM
244
2
0
23 Jan 2025
Finer: Investigating and Enhancing Fine-Grained Visual Concept Recognition in Large Vision Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Jeonghwan Kim
Heng Ji
MLLM
278
4
0
08 Jan 2025
Lived Experience Not Found: LLMs Struggle to Align with Experts on Addressing Adverse Drug Reactions from Psychiatric Medication Use
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Mohit Chandra
Siddharth Sriraman
Gaurav Verma
Harneet Singh Khanuja
Jose Suarez Campayo
Zihang Li
Michael L. Birnbaum
M. D. Choudhury
AI4MH
338
13
0
08 Jan 2025
SummExecEdit: A Factual Consistency Benchmark in Summarization with Executable Edits
Onkar Thorat
Philippe Laban
Chien-Sheng Wu
HILM
370
1
0
17 Dec 2024
QAPyramid: Fine-grained Evaluation of Content Selection for Text Summarization
Shiyue Zhang
David Wan
Arie Cattan
Ayal Klein
Ido Dagan
Joey Tianyi Zhou
354
4
0
10 Dec 2024
An Extensive Evaluation of Factual Consistency in Large Language Models for Data-to-Text Generation
Joy Mahapatra
Utpal Garain
HILM
ALM
390
2
0
28 Nov 2024
Do Automatic Factuality Metrics Measure Factuality? A Critical Evaluation
S. Ramprasad
Byron C. Wallace
LLMAG
HILM
628
8
0
25 Nov 2024
Bayesian Calibration of Win Rate Estimation with LLM Evaluators
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Yicheng Gao
G. Xu
Zhe Wang
Arman Cohan
245
8
0
07 Nov 2024
RAGulator: Lightweight Out-of-Context Detectors for Grounded Text Generation
Ian Poey
Jiajun Liu
Qishuai Zhong
Adrien Chenailler
269
0
0
06 Nov 2024
Summarization of Opinionated Political Documents with Varied Perspectives
International Conference on Computational Linguistics (COLING), 2024
Nicholas Deas
Kathleen McKeown
282
1
0
06 Nov 2024
OCEAN: Offline Chain-of-thought Evaluation and Alignment in Large Language Models
Junda Wu
Xintong Li
Ruoyu Wang
Yu Xia
Yuxin Xiong
...
Xiang Chen
Branislav Kveton
Lina Yao
Jingbo Shang
Julian McAuley
OffRL
LRM
229
4
0
31 Oct 2024
On Positional Bias of Faithfulness for Long-form Summarization
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
David Wan
Jesse Vig
Joey Tianyi Zhou
Shafiq Joty
HILM
256
16
0
31 Oct 2024
Retrieval-Augmented Generation with Estimation of Source Reliability
Jeongyeon Hwang
Junyoung Park
Hyejin Park
Dongwoo Kim
Sangdon Park
Jungseul Ok
RALM
462
4
0
30 Oct 2024
Are LLMs Better than Reported? Detecting Label Errors and Mitigating Their Effect on Model Performance
Omer Nahum
Nitay Calderon
Orgad Keller
Idan Szpektor
Roi Reichart
264
11
0
24 Oct 2024
Cross-Document Event-Keyed Summarization
William Walden
Pavlo Kuchmiichuk
Alexander Martin
Chihsheng Jin
Angela Cao
Claire Sun
Curisia Allen
Aaron Steven White
RALM
177
0
0
18 Oct 2024
ScreenWriter: Automatic Screenplay Generation and Movie Summarisation
Louis Mahon
Mirella Lapata
215
5
0
17 Oct 2024
FaithBench: A Diverse Hallucination Benchmark for Summarization by Modern LLMs
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
F. S. Bao
Miaoran Li
Renyi Qu
Ge Luo
Erana Wan
...
Ruixuan Tu
Chenyu Xu
Matthew Gonzales
Ofer Mendelevitch
Amin Ahmad
VLM
HILM
243
15
0
17 Oct 2024
Decomposition Dilemmas: Does Claim Decomposition Boost or Burden Fact-Checking Performance?
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Qisheng Hu
Quanyu Long
Wenya Wang
933
21
0
17 Oct 2024
A Little Human Data Goes A Long Way
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Dhananjay Ashok
Jonathan May
SyDa
528
6
0
17 Oct 2024
Unlocking Legal Knowledge: A Multilingual Dataset for Judicial Summarization in Switzerland
Luca Rolshoven
Vishvaksenan Rasiah
Srinanda Brügger Bose
Sarah Hostettler
Lara Burkhalter
Matthias Stürmer
Joel Niklaus
ELM
AILaw
273
4
0
17 Oct 2024