arXiv:2410.02707 · Cited By (v4, latest)
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations
International Conference on Learning Representations (ICLR), 2025
3 October 2024
Hadas Orgad
Michael Toker
Zorik Gekhman
Roi Reichart
Idan Szpektor
Hadas Kotek
Yonatan Belinkov
HILM
AIFin
ArXiv (abs) · PDF · HTML · HuggingFace (49 upvotes)
Papers citing "LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations"
50 of 131 citing papers shown
Robust Hallucination Detection in LLMs via Adaptive Token Selection
Mengjia Niu
Hamed Haddadi
Guansong Pang
HILM
10 Apr 2025
Epistemic Alignment: A Mediating Framework for User-LLM Knowledge Delivery
Nicholas Clark
Hua Shen
Bill Howe
Tanushree Mitra
01 Apr 2025
Learning to Instruct for Visual Instruction Tuning
Zhihan Zhou
Feng Hong
Jiaan Luo
Jiangchao Yao
Dongsheng Li
Bo Han
Yujiao Shi
Yanfeng Wang
VLM
28 Mar 2025
A Survey of Large Language Model Agents for Question Answering
Murong Yue
LLMAG
LM&MA
ELM
24 Mar 2025
Calibrating Verbal Uncertainty as a Linear Feature to Reduce Hallucinations
Ziwei Ji
L. Yu
Yeskendir Koishekenov
Yejin Bang
Anthony Hartshorn
Alan Schelten
Cheng Zhang
Pascale Fung
Nicola Cancedda
18 Mar 2025
Beyond Next Token Probabilities: Learnable, Fast Detection of Hallucinations and Data Contamination on LLM Output Distributions
Guy Bar-Shalom
Fabrizio Frasca
Derek Lim
Yoav Gelberg
Yftah Ziser
Ran El-Yaniv
Gal Chechik
Haggai Maron
18 Mar 2025
AIstorian lets AI be a historian: A KG-powered multi-agent system for accurate biography generation
Fengyu Li
Yilin Li
Junhao Zhu
Lu Chen
Yanfei Zhang
Jia Zhou
Hui Zu
Jingwen Zhao
Yunjun Gao
LLMAG
14 Mar 2025
Attention Hijackers: Detect and Disentangle Attention Hijacking in LVLMs for Hallucination Mitigation
Beitao Chen
Xinyu Lyu
Lianli Gao
Jingkuan Song
Mengqi Li
11 Mar 2025
Shakespearean Sparks: The Dance of Hallucination and Creativity in LLMs' Decoding Layers
Zicong He
Boxuan Zhang
Lu Cheng
04 Mar 2025
CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation
Zhenyi Shen
Hanqi Yan
Linhai Zhang
Zhanghao Hu
Yali Du
Yulan He
LRM
28 Feb 2025
Representation Engineering for Large-Language Models: Survey and Research Challenges
Lukasz Bartoszcze
Sarthak Munshi
Bryan Sukidi
Jennifer Yen
Zejia Yang
David Williams-King
Linh Le
Kosi Asuzu
Carsten Maple
24 Feb 2025
CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Boxuan Zhang
Ruqi Zhang
LRM
24 Feb 2025
Confidence Improves Self-Consistency in LLMs
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Amir Taubenfeld
Tom Sheffer
Eran Ofek
Amir Feder
Ariel Goldstein
Zorik Gekhman
G. Yona
LRM
10 Feb 2025
Think More, Hallucinate Less: Mitigating Hallucinations via Dual Process of Fast and Slow Thinking
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Xiaoxue Cheng
Junyi Li
Wayne Xin Zhao
Ji-Rong Wen
HILM
LRM
02 Jan 2025
HalluCana: Fixing LLM Hallucination with A Canary Lookahead
North American Chapter of the Association for Computational Linguistics (NAACL), 2025
Tianyi Li
Erenay Dayanik
Shubhi Tyagi
Andrea Pierleoni
HILM
10 Dec 2024
VideoICL: Confidence-based Iterative In-context Learning for Out-of-Distribution Video Understanding
Computer Vision and Pattern Recognition (CVPR), 2025
Kangsan Kim
G. Park
Youngwan Lee
Woongyeong Yeo
Sung Ju Hwang
03 Dec 2024
Toward Automated Validation of Language Model Synthesized Test Cases using Semantic Entropy
Hamed Taherkhani
Jiho Shin
Muhammad Ammar Tahir
Md Rakib Hossain Misu
Vineet Sunil Gattani
Hadi Hemmati
13 Nov 2024
MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation
International Conference on Learning Representations (ICLR), 2025
Chenxi Wang
Xiang Chen
Ningyu Zhang
Bozhong Tian
Haoming Xu
Shumin Deng
MLLM
LRM
15 Oct 2024
FactCheckmate: Preemptively Detecting and Mitigating Hallucinations in LMs
Deema Alnuhait
Neeraja Kirtane
Muhammad Khalifa
Hao Peng
LRM
HILM
03 Oct 2024
Internal Consistency and Self-Feedback in Large Language Models: A Survey
Xun Liang
Chenyang Xi
Zifan Zheng
Ding Chen
Qingchen Yu
...
Rong-Hua Li
Peng Cheng
Zhonghao Wang
Feiyu Xiong
Zhiyu Li
HILM
LRM
19 Jul 2024
Truth is Universal: Robust Detection of Lies in LLMs
Lennart Bürger
Fred Hamprecht
B. Nadler
HILM
03 Jul 2024
Estimating Knowledge in Large Language Models Without Generating a Single Token
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Daniela Gottesman
Mor Geva
18 Jun 2024
Detection-Correction Structure via General Language Model for Grammatical Error Correction
Wei Li
Houfeng Wang
28 May 2024
Can Large Language Models Faithfully Express Their Intrinsic Uncertainty in Words?
G. Yona
Roee Aharoni
Mor Geva
HILM
27 May 2024
Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Zorik Gekhman
G. Yona
Roee Aharoni
Matan Eyal
Amir Feder
Roi Reichart
Jonathan Herzig
09 May 2024
Constructing Benchmarks and Interventions for Combating Hallucinations in LLMs
Adi Simhi
Jonathan Herzig
Idan Szpektor
Yonatan Belinkov
HILM
15 Apr 2024
Large Language Models are Contrastive Reasoners
Liang Yao
ReLM
ELM
LRM
13 Mar 2024
Benchmarking Hallucination in Large Language Models based on Unanswerable Math Word Problem
Yuhong Sun
Zhangyue Yin
Qipeng Guo
Jiawen Wu
Xipeng Qiu
Hui Zhao
06 Mar 2024
Characterizing Truthfulness in Large Language Model Generations with Local Intrinsic Dimension
Fan Yin
Jayanth Srinivasa
Kai-Wei Chang
HILM
28 Feb 2024
INSIDE: LLMs' Internal States Retain the Power of Hallucination Detection
International Conference on Learning Representations (ICLR), 2024
Chao Chen
Kai Liu
Ze Chen
Yi Gu
Yue Wu
Mingyuan Tao
Zhihang Fu
Jieping Ye
HILM
06 Feb 2024
Language Writ Large: LLMs, ChatGPT, Grounding, Meaning and Understanding
S. Harnad
03 Feb 2024
On Early Detection of Hallucinations in Factual Question Answering
Ben Snyder
Marius Moisescu
Muhammad Bilal Zafar
HILM
19 Dec 2023
Weakly Supervised Detection of Hallucinations in LLM Activations
Miriam Rateike
C. Cintas
John Wamburu
Tanya Akumu
Skyler Speakman
05 Dec 2023
Cognitive Dissonance: Why Do Language Model Outputs Disagree with Internal Representations of Truthfulness?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Kevin Liu
Stephen Casper
Dylan Hadfield-Menell
Jacob Andreas
HILM
27 Nov 2023
Fine-tuning Language Models for Factuality
International Conference on Learning Representations (ICLR), 2024
Katherine Tian
Eric Mitchell
Huaxiu Yao
Christopher D. Manning
Chelsea Finn
KELM
HILM
SyDa
14 Nov 2023
A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions
Lei Huang
Weijiang Yu
Weitao Ma
Weihong Zhong
Zhangyin Feng
...
Qianglong Chen
Weihua Peng
Xiaocheng Feng
Bing Qin
Ting Liu
LRM
HILM
09 Nov 2023
The Curious Case of Hallucinatory (Un)answerability: Finding Truths in the Hidden States of Over-Confident Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Aviv Slobodkin
Omer Goldman
Avi Caciularu
Ido Dagan
Shauli Ravfogel
HILM
LRM
18 Oct 2023
The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets
Samuel Marks
Max Tegmark
HILM
10 Oct 2023
The Troubling Emergence of Hallucination in Large Language Models -- An Extensive Definition, Quantification, and Prescriptive Remediations
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Vipula Rawte
Swagata Chakraborty
Agnibh Pathak
Anubhav Sarkar
S.M. Towhidul Islam Tonmoy
Vasu Sharma
Mikel Artetxe
Punit Daniel Simig
HILM
08 Oct 2023
Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models
International Conference on Learning Representations (ICLR), 2024
Mert Yuksekgonul
Varun Chandrasekaran
Erik Jones
Suriya Gunasekar
Ranjita Naik
Hamid Palangi
Ece Kamar
Besmira Nushi
HILM
26 Sep 2023
DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models
International Conference on Learning Representations (ICLR), 2024
Yung-Sung Chuang
Yujia Xie
Hongyin Luo
Yoon Kim
James R. Glass
Pengcheng He
HILM
07 Sep 2023
Gender bias and stereotypes in Large Language Models
ACM Collective Intelligence Conference (CI), 2023
Hadas Kotek
Rikker Dockum
David Q. Sun
28 Aug 2023
Look Before You Leap: An Exploratory Study of Uncertainty Measurement for Large Language Models
IEEE Transactions on Software Engineering (TSE), 2023
Yuheng Huang
Zhijie Wang
Shengming Zhao
Huaming Chen
Felix Juefei-Xu
Lei Ma
16 Jul 2023
Personality Traits in Large Language Models
Gregory Serapio-García
Mustafa Safdari
Clément Crepy
Luning Sun
Stephen Fitz
P. Romero
Marwa Abdulhai
Aleksandra Faust
Maja J. Matarić
LM&MA
LLMAG
01 Jul 2023
Still No Lie Detector for Language Models: Probing Empirical and Conceptual Roadblocks
Philosophical Studies (Philos. Stud.), 2023
B. Levinstein
Daniel A. Herrmann
30 Jun 2023
Inference-Time Intervention: Eliciting Truthful Answers from a Language Model
Neural Information Processing Systems (NeurIPS), 2023
Kenneth Li
Oam Patel
Fernanda Viégas
Hanspeter Pfister
Martin Wattenberg
KELM
HILM
06 Jun 2023
Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Katherine Tian
E. Mitchell
Allan Zhou
Archit Sharma
Rafael Rafailov
Huaxiu Yao
Chelsea Finn
Christopher D. Manning
24 May 2023
TrueTeacher: Learning Factual Consistency Evaluation with Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Zorik Gekhman
Jonathan Herzig
Roee Aharoni
Chen Elkind
Idan Szpektor
HILM
ELM
18 May 2023
Dissecting Recall of Factual Associations in Auto-Regressive Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Mor Geva
Jasmijn Bastings
Katja Filippova
Amir Globerson
KELM
28 Apr 2023
The Internal State of an LLM Knows When It's Lying
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
A. Azaria
Tom Michael Mitchell
HILM
26 Apr 2023