2403.00896
Cited By
DiaHalu: A Dialogue-level Hallucination Evaluation Benchmark for Large Language Models
1 March 2024
Kedi Chen
Qin Chen
Jie Zhou
Yishen He
Liang He
HILM
Papers citing
"DiaHalu: A Dialogue-level Hallucination Evaluation Benchmark for Large Language Models"
12 / 12 papers shown
Title
Mitigating Dialogue Hallucination for Large Vision Language Models via Adversarial Instruction Tuning
Dongmin Park
Zhaofang Qian
Guangxing Han
Ser-Nam Lim
MLLM
28
0
0
15 Mar 2024
Navigating Hallucinations for Reasoning of Unintentional Activities
Shresth Grover
Vibhav Vineet
Y. S. Rawat
LRM
36
1
0
29 Feb 2024
Hallucination Detection and Hallucination Mitigation: An Investigation
Junliang Luo
Tianyu Li
Di Wu
Michael R. M. Jenkin
Steve Liu
Gregory Dudek
HILM
LLMAG
36
20
0
16 Jan 2024
Red Teaming for Large Language Models At Scale: Tackling Hallucinations on Mathematics Tasks
Aleksander Buszydlik
Karol Dobiczek
Michał Teodor Okoń
Konrad Skublicki
Philip Lippmann
Jie-jin Yang
LRM
ReLM
16
3
0
30 Dec 2023
Enhancing Uncertainty-Based Hallucination Detection with Stronger Focus
Tianhang Zhang
Lin Qiu
Qipeng Guo
Cheng Deng
Yue Zhang
Zheng-Wei Zhang
Cheng Zhou
Xinbing Wang
Luoyi Fu
HILM
54
46
0
22 Nov 2023
Are Large Language Models Reliable Judges? A Study on the Factuality Evaluation Capabilities of LLMs
Xue-Yong Fu
Md Tahmid Rahman Laskar
Cheng-Hsiung Chen
TN ShashiBhushan
HILM
ELM
59
17
0
01 Nov 2023
Topic Shift Detection in Chinese Dialogues: Corpus and Benchmark
Jian-Dong Lin
Yaxin Fan
Feng Jiang
Xiaomin Chu
Peifeng Li
32
4
0
02 May 2023
The Internal State of an LLM Knows When It's Lying
A. Azaria
Tom Michael Mitchell
HILM
210
297
0
26 Apr 2023
Evaluating Attribution in Dialogue Systems: The BEGIN Benchmark
Nouha Dziri
Hannah Rashkin
Tal Linzen
David Reitter
ALM
185
79
0
30 Apr 2021
A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation
Tianyu Liu
Yizhe Zhang
Chris Brockett
Yi Mao
Zhifang Sui
Weizhu Chen
W. Dolan
HILM
212
140
0
18 Apr 2021
Adding Chit-Chat to Enhance Task-Oriented Dialogues
Kai Sun
Seungwhan Moon
Paul A. Crook
Stephen Roller
Becka Silvert
Bing-Quan Liu
Zhiguang Wang
Honglei Liu
Eunjoon Cho
Claire Cardie
59
66
0
24 Oct 2020
Language Models as Knowledge Bases?
Fabio Petroni
Tim Rocktäschel
Patrick Lewis
A. Bakhtin
Yuxiang Wu
Alexander H. Miller
Sebastian Riedel
KELM
AI4MH
393
2,216
0
03 Sep 2019