2305.16739
AlignScore: Evaluating Factual Consistency with a Unified Alignment Function
26 May 2023
Yuheng Zha
Yichi Yang
Ruichen Li
Zhiting Hu
HILM
Papers citing
"AlignScore: Evaluating Factual Consistency with a Unified Alignment Function"
30 / 30 papers shown
Title
Towards Long Context Hallucination Detection
Siyi Liu
Kishaloy Halder
Zheng Qi
Wei Xiao
Nikolaos Pappas
Phu Mon Htut
Neha Anna John
Yassine Benajiba
Dan Roth
HILM
73
0
0
28 Apr 2025
Unequal Opportunities: Examining the Bias in Geographical Recommendations by Large Language Models
Shiran Dudy
Thulasi Tholeti
R. Ramachandranpillai
Muhammad Ali
Toby Jia-Jun Li
Ricardo Baeza-Yates
27
0
0
16 Mar 2025
Leveraging Retrieval Augmented Generative LLMs For Automated Metadata Description Generation to Enhance Data Catalogs
Mayank Singh
Abhijeet Kumar
Sasidhar Donaparthi
Gayatri Karambelkar
43
0
0
12 Mar 2025
Beyond Translation: LLM-Based Data Generation for Multilingual Fact-Checking
Yi-Ling Chung
Aurora Cobo
Pablo Serna
SyDa
HILM
58
0
0
24 Feb 2025
SCOPE: A Self-supervised Framework for Improving Faithfulness in Conditional Text Generation
Song Duong
Florian Le Bronnec
Alexandre Allauzen
Vincent Guigue
Alberto Lumbreras
Laure Soulier
Patrick Gallinari
HILM
43
0
0
20 Feb 2025
MIH-TCCT: Mitigating Inconsistent Hallucinations in LLMs via Event-Driven Text-Code Cyclic Training
Xinxin You
Xien Liu
Qixin Sun
Huan Zhang
Kaiyin Zhou
Shaohui Liu
Guoping Hu
Shijin Wang
Si Liu
Ji Wu
83
0
0
13 Feb 2025
Context-Aware Hierarchical Merging for Long Document Summarization
Litu Ou
Mirella Lapata
MoMe
146
1
0
03 Feb 2025
Beyond correlation: The Impact of Human Uncertainty in Measuring the Effectiveness of Automatic Evaluation and LLM-as-a-Judge
Aparna Elangovan
Jongwoo Ko
Lei Xu
Mahsa Elyasi
Ling Liu
S. Bodapati
Dan Roth
41
5
0
28 Jan 2025
RELexED: Retrieval-Enhanced Legal Summarization with Exemplar Diversity
T. Y. S. S. Santosh
Chen Jia
Patrick Goroncy
Matthias Grabmair
AILaw
44
1
0
23 Jan 2025
Decomposition Dilemmas: Does Claim Decomposition Boost or Burden Fact-Checking Performance?
Qisheng Hu
Quanyu Long
Wenya Wang
87
5
0
17 Oct 2024
Analysing Zero-Shot Readability-Controlled Sentence Simplification
Abdullah Barayan
Jose Camacho-Collados
Fernando Alva-Manchego
29
1
0
30 Sep 2024
AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge
Han Wang
Archiki Prasad
Elias Stengel-Eskin
Mohit Bansal
75
5
0
11 Sep 2024
UF-HOBI at "Discharge Me!": A Hybrid Solution for Discharge Summary Generation Through Prompt-based Tuning of GatorTronGPT Models
Mengxian Lyu
Cheng Peng
Daniel Paredes
Ziyi Chen
Aokun Chen
Jiang Bian
Yonghui Wu
19
2
0
22 Jul 2024
STORYSUMM: Evaluating Faithfulness in Story Summarization
Melanie Subbiah
Faisal Ladhak
Akankshya Mishra
Griffin Adams
Lydia B. Chilton
Kathleen McKeown
34
4
0
09 Jul 2024
Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems
Philippe Laban
Alexander R. Fabbri
Caiming Xiong
Chien-Sheng Wu
RALM
38
41
0
01 Jul 2024
PlagBench: Exploring the Duality of Large Language Models in Plagiarism Generation and Detection
Jooyoung Lee
Toshini Agrawal
Adaku Uchendu
Thai V. Le
Jinghui Chen
Dongwon Lee
31
1
0
24 Jun 2024
Benchmarking Uncertainty Quantification Methods for Large Language Models with LM-Polygraph
Roman Vashurin
Ekaterina Fadeeva
Artem Vazhentsev
Akim Tsvigun
Daniil Vasilev
...
Timothy Baldwin
Maxim Panov
Artem Shelmanov
HILM
64
8
0
21 Jun 2024
Factual Dialogue Summarization via Learning from Large Language Models
Rongxin Zhu
Jey Han Lau
Jianzhong Qi
HILM
46
1
0
20 Jun 2024
Unlearning Climate Misinformation in Large Language Models
Michael Fore
Simranjit Singh
Chaehong Lee
Amritanshu Pandey
Antonios Anastasopoulos
Dimitrios Stamoulis
MU
52
1
0
29 May 2024
TimeChara: Evaluating Point-in-Time Character Hallucination of Role-Playing Large Language Models
Jaewoo Ahn
Taehyun Lee
Junyoung Lim
Jin-Hwa Kim
Sangdoo Yun
Hwaran Lee
Gunhee Kim
LLMAG
HILM
35
12
0
28 May 2024
WisPerMed at "Discharge Me!": Advancing Text Generation in Healthcare with Large Language Models, Dynamic Expert Selection, and Priming Techniques on MIMIC-IV
Hendrik Damm
T. M. G. Pakull
Bahadir Eryilmaz
Helmut Becker
Ahmad Idrissi-Yaghir
Henning Schäfer
Sergej Schultenkämper
Christoph M. Friedrich
26
3
0
18 May 2024
Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art
Neeloy Chakraborty
Melkior Ornik
Katherine Driggs-Campbell
LRM
57
9
0
25 Mar 2024
On the Essence and Prospect: An Investigation of Alignment Approaches for Big Models
Xinpeng Wang
Shitong Duan
Xiaoyuan Yi
Jing Yao
Shanlin Zhou
Zhihua Wei
Peng Zhang
Dongkuan Xu
Maosong Sun
Xing Xie
OffRL
33
16
0
07 Mar 2024
Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models
Yue Zhang
Yafu Li
Leyang Cui
Deng Cai
Lemao Liu
...
Longyue Wang
A. Luu
Wei Bi
Freda Shi
Shuming Shi
RALM
LRM
HILM
41
519
0
03 Sep 2023
ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings
Shibo Hao
Tianyang Liu
Zhen Wang
Zhiting Hu
RALM
LLMAG
35
172
0
19 May 2023
MaskEval: Weighted MLM-Based Evaluation for Text Summarization and Simplification
Yu Lu Liu
Rachel Bawden
Thomas Scialom
Benoît Sagot
Jackie C.K. Cheung
28
4
0
24 May 2022
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
211
1,656
0
15 Oct 2021
Evaluating Attribution in Dialogue Systems: The BEGIN Benchmark
Nouha Dziri
Hannah Rashkin
Tal Linzen
David Reitter
ALM
185
79
0
30 Apr 2021
Understanding Factuality in Abstractive Summarization with FRANK: A Benchmark for Factuality Metrics
Artidoro Pagnoni
Vidhisha Balachandran
Yulia Tsvetkov
HILM
215
305
0
27 Apr 2021
GO FIGURE: A Meta Evaluation of Factuality in Summarization
Saadia Gabriel
Asli Celikyilmaz
Rahul Jha
Yejin Choi
Jianfeng Gao
HILM
233
96
0
24 Oct 2020