AlignScore: Evaluating Factual Consistency with a Unified Alignment Function

Annual Meeting of the Association for Computational Linguistics (ACL), 2023
26 May 2023
Yuheng Zha
Yichi Yang
Ruichen Li
Zhiting Hu
    HILM
ArXiv (abs) | PDF | HTML | GitHub (1885★)

Papers citing "AlignScore: Evaluating Factual Consistency with a Unified Alignment Function"

50 / 184 papers shown
Music Recommendation with Large Language Models: Challenges, Opportunities, and Evaluation
Elena V. Epure
Yashar Deldjoo
Bruno Sguerra
Markus Schedl
Manuel Moussallam
218
0
0
20 Nov 2025
HEDGE: Hallucination Estimation via Dense Geometric Entropy for VQA with Vision-Language Models
Sushant Gautam
Michael A. Riegler
Pål Halvorsen
VLM
254
3
0
16 Nov 2025
SynClaimEval: A Framework for Evaluating the Utility of Synthetic Data in Long-Context Claim Verification
Mohamed Elaraby
Jyoti Prakash Maheswari
SyDa
144
0
0
12 Nov 2025
Stress Testing Factual Consistency Metrics for Long-Document Summarization
Zain Muhammad Mujahid
Dustin Wright
Isabelle Augenstein
HILM
260
0
0
10 Nov 2025
VISTA: Verification In Sequential Turn-based Assessment
A. Lewis
Andrew Perrault
Eric Fosler-Lussier
Michael White
HILM
346
0
0
30 Oct 2025
Seeing Through the MiRAGE: Evaluating Multimodal Retrieval Augmented Generation
Alexander Martin
William Walden
Reno Kriz
Dengjia Zhang
Kate Sanders
Eugene Yang
Chihsheng Jin
Benjamin Van Durme
178
0
0
28 Oct 2025
Designing and Evaluating Chain-of-Hints for Scientific Question Answering
Anubhav Jangra
Smaranda Muresan
AI4Ed ELM
378
0
0
24 Oct 2025
ECG-LLM-- training and evaluation of domain-specific large language models for electrocardiography
Lara Ahrens
Wilhelm Haverkamp
Nils Strodthoff
198
0
0
21 Oct 2025
Disparities in Multilingual LLM-Based Healthcare Q&A
Ipek Baris Schlicht
Burcu Sayin
Zhixue Zhao
Frederik M. Labonté
Cesare Barbera
Marco Viviani
Paolo Rosso
Lucie Flek
167
2
0
20 Oct 2025
A Multilingual, Large-Scale Study of the Interplay between LLM Safeguards, Personalisation, and Disinformation
João A. Leite
Arnav Arora
Silvia Gargova
João Luz
Gustavo Sampaio
Ian Roberts
Carolina Scarton
Kalina Bontcheva
260
1
0
14 Oct 2025
Enhancing Faithfulness in Abstractive Summarization via Span-Level Fine-Tuning
Sicong Huang
Qianqi Yan
Shengze Wang
Ian Lane
HILM
205
0
0
10 Oct 2025
LeMAJ (Legal LLM-as-a-Judge): Bridging Legal Reasoning and LLM Evaluation
Joseph Enguehard
Morgane Van Ermengem
Kate Atkinson
Sujeong Cha
Arijit Ghosh Chowdhury
...
Jeremy Roghair
Hannah R Marlowe
Carina Suzana Negreanu
Kitty Boxall
Diana Mincu
AILaw ELM
214
3
0
08 Oct 2025
Exposing Citation Vulnerabilities in Generative Engines
Riku Mochizuki
Shusuke Komatsu
Souta Noguchi
Kazuto Ataka
ELM
220
0
0
08 Oct 2025
Text2Stories: Evaluating the Alignment Between Stakeholder Interviews and Generated User Stories
Francesco Dente
Fabiano Dalpiaz
Paolo Papotti
94
0
0
08 Oct 2025
Reward Model Perspectives: Whose Opinions Do Reward Models Reward?
Elle
ALM
210
2
0
07 Oct 2025
Addressing Pitfalls in the Evaluation of Uncertainty Estimation Methods for Natural Language Generation
Mykyta Ielanskyi
Kajetan Schweighofer
L. Aichberger
Sepp Hochreiter
HILM
266
2
0
02 Oct 2025
Automated Evaluation can Distinguish the Good and Bad AI Responses to Patient Questions about Hospitalization
Sarvesh Soni
Dina Demner-Fushman
AI4MH
251
0
0
01 Oct 2025
Copy-Paste to Mitigate Large Language Model Hallucinations
Yongchao Long
Xian Wu
Yingying Zhang
Xianbin Wen
Yuxi Zhou
Zhiqin Jiang
199
0
0
01 Oct 2025
ReEvalMed: Rethinking Medical Report Evaluation by Aligning Metrics with Real-World Clinical Judgment
Ruochen Li
Jun Li
Bailiang Jian
Kun Yuan
Youxiang Zhu
198
3
0
30 Sep 2025
Multidimensional Uncertainty Quantification via Optimal Transport
Nikita Kotelevskii
Maiya Goloburda
Vladimir Kondratyev
Alexander Fishkov
Mohsen Guizani
Eric Moulines
Maxim Panov
233
1
0
26 Sep 2025
Decoding Uncertainty: The Impact of Decoding Strategies for Uncertainty Estimation in Large Language Models
Wataru Hashimoto
Hidetaka Kamigaito
Taro Watanabe
213
3
0
20 Sep 2025
Pluralistic Off-policy Evaluation and Alignment
Chengkai Huang
Junda Wu
Zhouhang Xie
Yu Xia
Rui Wang
Tong Yu
Subrata Mitra
Julian McAuley
L. Yao
OffRL
221
4
0
15 Sep 2025
Automated Evidence Extraction and Scoring for Corporate Climate Policy Engagement: A Multilingual RAG Approach
Imene Kolli
Ario Saeid Vaghefi
Chiara Colesanti-Senni
Shantam Raj
Markus Leippold
85
0
0
10 Sep 2025
CoCoA: Confidence and Context-Aware Adaptive Decoding for Resolving Knowledge Conflicts in Large Language Models
Anant Khandelwal
Manish Gupta
Puneet Agrawal
268
3
0
25 Aug 2025
MMCIG: Multimodal Cover Image Generation for Text-only Documents and Its Dataset Construction via Pseudo-labeling
Hyeyeon Kim
Sungwoo Han
Jingun Kwon
Hidetaka Kamigaito
Manabu Okumura
137
0
0
24 Aug 2025
If We May De-Presuppose: Robustly Verifying Claims through Presupposition-Free Question Decomposition
Shubhashis Roy Dipta
Francis Ferraro
AAML
206
0
0
22 Aug 2025
Expert Preference-based Evaluation of Automated Related Work Generation
Furkan Şahinuç
Subhabrata Dutta
Iryna Gurevych
159
4
0
11 Aug 2025
CoCoLex: Confidence-guided Copy-based Decoding for Grounded Legal Text Generation
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Santosh T.Y.S.S
Youssef Tarek Elkhayat
Oana Ichim
Pranav Shetty
Dongsheng Wang
Zhiqiang Ma
Armineh Nourbakhsh
Xiaomo Liu
169
4
0
07 Aug 2025
The Illusion of Progress: Re-evaluating Hallucination Detection in LLMs
Denis Janiak
Jakub Binkowski
Albert Sawczyn
Bogdan Gabrys
Ravid Schwartz-Ziv
Tomasz Kajdanowicz
HILM
325
11
0
01 Aug 2025
Hallucination Detection and Mitigation with Diffusion in Multi-Variate Time-Series Foundation Models
Vijja Wichitwechkarn
Charles Fox
Ruchi Choudhary
AI4TS
243
0
0
23 Jul 2025
ICR Probe: Tracking Hidden State Dynamics for Reliable Hallucination Detection in LLMs
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Zhenliang Zhang
Xinyu Hu
Huixuan Zhang
Junzhe Zhang
Xiaojun Wan
HILM
405
12
0
22 Jul 2025
KEA Explain: Explanations of Hallucinations using Graph Kernel Analysis
Reilly Haskins
Benjamin Adams
208
2
0
05 Jul 2025
MedVAL: Toward Expert-Level Medical Text Validation with Language Models
Asad Aali
Vasiliki Bikia
M. Varma
Nicole Chiou
Sophie Ostmeier
...
R. Daneshjou
Jason Hom
Sanmi Koyejo
Emily Alsentzer
Akshay Chaudhari
LM&MA ELM
411
3
0
03 Jul 2025
Reranking-based Generation for Unbiased Perspective Summarization
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Narutatsu Ri
Nicholas Deas
Kathleen McKeown
OffRL
204
0
0
19 Jun 2025
Re-Initialization Token Learning for Tool-Augmented Large Language Models
Chenghao Li
Liu Liu
B. Yu
Jiayan Qiu
Yibing Zhan
LLMAG CLL KELM
188
0
0
17 Jun 2025
Neural at ArchEHR-QA 2025: Agentic Prompt Optimization for Evidence-Grounded Clinical Question Answering
Sai Prasanna Teja Reddy Bogireddy
Abrar Majeedi
Viswanatha Reddy Gajjala
Zhuoyan Xu
Siddhant Rai
Vaishnav Potlapalli
355
1
0
12 Jun 2025
CLATTER: Comprehensive Entailment Reasoning for Hallucination Detection
Ron Eliav
Arie Cattan
Eran Hirsch
Shahaf Bassan
Elias Stengel-Eskin
Mohit Bansal
Ido Dagan
LRM
391
4
0
05 Jun 2025
A Dataset for Addressing Patient's Information Needs related to Clinical Course of Hospitalization
Sarvesh Soni
Dina Demner-Fushman
326
14
0
04 Jun 2025
QQSUM: A Novel Task and Model of Quantitative Query-Focused Summarization for Review-based Product Question Answering
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
A. Tang
Xiuzhen Zhang
M. Dinh
Zhuang Li
RALM
313
0
0
04 Jun 2025
Shaking to Reveal: Perturbation-Based Detection of LLM Hallucinations
Jinyuan Luo
Zhen Fang
Shouqing Yang
Seongheon Park
Ling Chen
AAML HILM
299
1
0
03 Jun 2025
Towards Multi-dimensional Evaluation of LLM Summarization across Domains and Languages
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Hyangsuk Min
Yuho Lee
Minjeong Ban
Jiaqi Deng
Nicole Hee-Yeon Kim
Taewon Yun
Hang Su
Jason (Jinglun) Cai
Hwanjun Song
ELM
305
8
0
31 May 2025
Faithfulness-Aware Uncertainty Quantification for Fact-Checking the Output of Retrieval Augmented Generation
Ekaterina Fadeeva
Aleksandr Rubashevskii
Roman Vashurin
Shehzaad Dhuliawala
Artem Shelmanov
Timothy Baldwin
Preslav Nakov
Mrinmaya Sachan
Maxim Panov
HILM RALM
390
7
0
27 May 2025
VeriTrail: Closed-Domain Hallucination Detection with Traceability
Dasha Metropolitansky
Jonathan Larson
HILM
346
2
0
27 May 2025
Uncertainty-Aware Attention Heads: Efficient Unsupervised Uncertainty Quantification for LLMs
Artem Vazhentsev
Abdelrahman Boda Sadallah
Gleb Kuzmin
Ekaterina Fadeeva
Ivan Lazichny
...
Maxim Panov
Timothy Baldwin
Mrinmaya Sachan
Preslav Nakov
Artem Shelmanov
EDL HILM
561
9
0
26 May 2025
Benchmarking Large Multimodal Models for Ophthalmic Visual Question Answering with OphthalWeChat
Pusheng Xu
Xia Gong
Xiaolan Chen
Weiyi Zhang
Jiancheng Yang
Bingjie Yan
Meng Yuan
Yalin Zheng
M. He
Danli Shi
310
2
0
26 May 2025
UNCERTAINTY-LINE: Length-Invariant Estimation of Uncertainty for Large Language Models
Roman Vashurin
Maiya Goloburda
Preslav Nakov
Maxim Panov
282
1
0
25 May 2025
Retrieval Augmented Generation-based Large Language Models for Bridging Transportation Cybersecurity Legal Knowledge Gaps
Khandakar Ashrafi Akbar
Md Nahiyan Uddin
Latifur Khan
Trayce Hockstad
Mizanur Rahman
M. Chowdhury
B. Thuraisingham
AILaw RALM
461
1
0
23 May 2025
Long-Form Information Alignment Evaluation Beyond Atomic Facts
Danna Zheng
Mirella Lapata
Jeff Z. Pan
HILM
310
2
0
21 May 2025
LEXam: Benchmarking Legal Reasoning on 340 Law Exams
Yu Fan
Jingwei Ni
Jakob Merane
Etienne Salimbeni
Yoan Hermstrüwer
...
Mrinmaya Sachan
Alexander Stremitzer
Christoph Engel
Elliott Ash
Joel Niklaus
AILaw ELM
664
21
0
19 May 2025
What Are They Talking About? A Benchmark of Knowledge-Grounded Discussion Summarization
Weixiao Zhou
Junnan Zhu
Gengyao Li
Xianfu Cheng
Xinnian Liang
Feifei Zhai
Zhiyu Li
ALM
447
0
0
18 May 2025