arXiv:2410.02707 · Cited By (v4, latest)
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations
International Conference on Learning Representations (ICLR), 2025
3 October 2024
Hadas Orgad
Michael Toker
Zorik Gekhman
Roi Reichart
Idan Szpektor
Hadas Kotek
Yonatan Belinkov
HILM
AIFin
ArXiv (abs) · PDF · HTML · HuggingFace (49 upvotes)
Papers citing "LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations"
50 of 131 citing papers shown
Robust Hallucination Detection in LLMs via Adaptive Token Selection
Mengjia Niu
Hamed Haddadi
Guansong Pang
HILM
10 Apr 2025
Epistemic Alignment: A Mediating Framework for User-LLM Knowledge Delivery
Nicholas Clark
Hua Shen
Bill Howe
Tanushree Mitra
01 Apr 2025
Learning to Instruct for Visual Instruction Tuning
Zhihan Zhou
Feng Hong
Jiaan Luo
Jiangchao Yao
Dongsheng Li
Bo Han
Yujiao Shi
Yanfeng Wang
VLM
28 Mar 2025
A Survey of Large Language Model Agents for Question Answering
Murong Yue
LLMAG
LM&MA
ELM
24 Mar 2025
Calibrating Verbal Uncertainty as a Linear Feature to Reduce Hallucinations
Ziwei Ji
L. Yu
Yeskendir Koishekenov
Yejin Bang
Anthony Hartshorn
Alan Schelten
Cheng Zhang
Pascale Fung
Nicola Cancedda
18 Mar 2025
Beyond Next Token Probabilities: Learnable, Fast Detection of Hallucinations and Data Contamination on LLM Output Distributions
Guy Bar-Shalom
Fabrizio Frasca
Derek Lim
Yoav Gelberg
Yftah Ziser
Ran El-Yaniv
Gal Chechik
Haggai Maron
18 Mar 2025
AIstorian lets AI be a historian: A KG-powered multi-agent system for accurate biography generation
Fengyu Li
Yilin Li
Junhao Zhu
Lu Chen
Yanfei Zhang
Jia Zhou
Hui Zu
Jingwen Zhao
Yunjun Gao
LLMAG
14 Mar 2025
Attention Hijackers: Detect and Disentangle Attention Hijacking in LVLMs for Hallucination Mitigation
Beitao Chen
Xinyu Lyu
Lianli Gao
Jingkuan Song
Mengqi Li
11 Mar 2025
Shakespearean Sparks: The Dance of Hallucination and Creativity in LLMs' Decoding Layers
Zicong He
Boxuan Zhang
Lu Cheng
04 Mar 2025
CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation
Zhenyi Shen
Hanqi Yan
Linhai Zhang
Zhanghao Hu
Yali Du
Yulan He
LRM
28 Feb 2025
Representation Engineering for Large-Language Models: Survey and Research Challenges
Lukasz Bartoszcze
Sarthak Munshi
Bryan Sukidi
Jennifer Yen
Zejia Yang
David Williams-King
Linh Le
Kosi Asuzu
Carsten Maple
24 Feb 2025
CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Boxuan Zhang
Ruqi Zhang
LRM
24 Feb 2025
Confidence Improves Self-Consistency in LLMs
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Amir Taubenfeld
Tom Sheffer
Eran Ofek
Amir Feder
Ariel Goldstein
Zorik Gekhman
G. Yona
LRM
10 Feb 2025
Think More, Hallucinate Less: Mitigating Hallucinations via Dual Process of Fast and Slow Thinking
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Xiaoxue Cheng
Junyi Li
Wayne Xin Zhao
Ji-Rong Wen
HILM
LRM
02 Jan 2025
HalluCana: Fixing LLM Hallucination with A Canary Lookahead
North American Chapter of the Association for Computational Linguistics (NAACL), 2025
Tianyi Li
Erenay Dayanik
Shubhi Tyagi
Andrea Pierleoni
HILM
10 Dec 2024
VideoICL: Confidence-based Iterative In-context Learning for Out-of-Distribution Video Understanding
Computer Vision and Pattern Recognition (CVPR), 2025
Kangsan Kim
G. Park
Youngwan Lee
Woongyeong Yeo
Sung Ju Hwang
03 Dec 2024
Toward Automated Validation of Language Model Synthesized Test Cases using Semantic Entropy
Hamed Taherkhani
Jiho Shin
Muhammad Ammar Tahir
Md Rakib Hossain Misu
Vineet Sunil Gattani
Hadi Hemmati
13 Nov 2024
MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation
International Conference on Learning Representations (ICLR), 2025
Chenxi Wang
Xiang Chen
Ningyu Zhang
Bozhong Tian
Haoming Xu
Shumin Deng
MLLM
LRM
15 Oct 2024
FactCheckmate: Preemptively Detecting and Mitigating Hallucinations in LMs
Deema Alnuhait
Neeraja Kirtane
Muhammad Khalifa
Hao Peng
LRM
HILM
03 Oct 2024
Internal Consistency and Self-Feedback in Large Language Models: A Survey
Xun Liang
Chenyang Xi
Zifan Zheng
Ding Chen
Qingchen Yu
...
Rong-Hua Li
Peng Cheng
Zhonghao Wang
Feiyu Xiong
Zhiyu Li
HILM
LRM
19 Jul 2024
Truth is Universal: Robust Detection of Lies in LLMs
Lennart Bürger
Fred Hamprecht
B. Nadler
HILM
03 Jul 2024
Estimating Knowledge in Large Language Models Without Generating a Single Token
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Daniela Gottesman
Mor Geva
18 Jun 2024
Detection-Correction Structure via General Language Model for Grammatical Error Correction
Wei Li
Houfeng Wang
28 May 2024
Can Large Language Models Faithfully Express Their Intrinsic Uncertainty in Words?
G. Yona
Roee Aharoni
Mor Geva
HILM
27 May 2024
Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Zorik Gekhman
G. Yona
Roee Aharoni
Matan Eyal
Amir Feder
Roi Reichart
Jonathan Herzig
09 May 2024
Constructing Benchmarks and Interventions for Combating Hallucinations in LLMs
Adi Simhi
Jonathan Herzig
Idan Szpektor
Yonatan Belinkov
HILM
15 Apr 2024
Large Language Models are Contrastive Reasoners
Liang Yao
ReLM
ELM
LRM
13 Mar 2024
Benchmarking Hallucination in Large Language Models based on Unanswerable Math Word Problem
Yuhong Sun
Zhangyue Yin
Qipeng Guo
Jiawen Wu
Xipeng Qiu
Hui Zhao
06 Mar 2024
Characterizing Truthfulness in Large Language Model Generations with Local Intrinsic Dimension
Fan Yin
Jayanth Srinivasa
Kai-Wei Chang
HILM
28 Feb 2024
INSIDE: LLMs' Internal States Retain the Power of Hallucination Detection
International Conference on Learning Representations (ICLR), 2024
Chao Chen
Kai Liu
Ze Chen
Yi Gu
Yue Wu
Mingyuan Tao
Zhihang Fu
Jieping Ye
HILM
06 Feb 2024
Language Writ Large: LLMs, ChatGPT, Grounding, Meaning and Understanding
S. Harnad
03 Feb 2024
On Early Detection of Hallucinations in Factual Question Answering
Ben Snyder
Marius Moisescu
Muhammad Bilal Zafar
HILM
19 Dec 2023
Weakly Supervised Detection of Hallucinations in LLM Activations
Miriam Rateike
C. Cintas
John Wamburu
Tanya Akumu
Skyler Speakman
05 Dec 2023
Cognitive Dissonance: Why Do Language Model Outputs Disagree with Internal Representations of Truthfulness?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Kevin Liu
Stephen Casper
Dylan Hadfield-Menell
Jacob Andreas
HILM
27 Nov 2023
Fine-tuning Language Models for Factuality
International Conference on Learning Representations (ICLR), 2024
Katherine Tian
Eric Mitchell
Huaxiu Yao
Christopher D. Manning
Chelsea Finn
KELM
HILM
SyDa
14 Nov 2023
A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions
Lei Huang
Weijiang Yu
Weitao Ma
Weihong Zhong
Zhangyin Feng
...
Qianglong Chen
Weihua Peng
Xiaocheng Feng
Bing Qin
Ting Liu
LRM
HILM
09 Nov 2023
The Curious Case of Hallucinatory (Un)answerability: Finding Truths in the Hidden States of Over-Confident Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Aviv Slobodkin
Omer Goldman
Avi Caciularu
Ido Dagan
Shauli Ravfogel
HILM
LRM
18 Oct 2023
The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets
Samuel Marks
Max Tegmark
HILM
10 Oct 2023
The Troubling Emergence of Hallucination in Large Language Models -- An Extensive Definition, Quantification, and Prescriptive Remediations
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Vipula Rawte
Swagata Chakraborty
Agnibh Pathak
Anubhav Sarkar
S.M. Towhidul Islam Tonmoy
Vasu Sharma
Mikel Artetxe
Punit Daniel Simig
HILM
08 Oct 2023
Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models
International Conference on Learning Representations (ICLR), 2024
Mert Yuksekgonul
Varun Chandrasekaran
Erik Jones
Suriya Gunasekar
Ranjita Naik
Hamid Palangi
Ece Kamar
Besmira Nushi
HILM
26 Sep 2023
DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models
International Conference on Learning Representations (ICLR), 2024
Yung-Sung Chuang
Yujia Xie
Hongyin Luo
Yoon Kim
James R. Glass
Pengcheng He
HILM
07 Sep 2023
Gender bias and stereotypes in Large Language Models
ACM Collective Intelligence Conference (CI), 2023
Hadas Kotek
Rikker Dockum
David Q. Sun
28 Aug 2023
Look Before You Leap: An Exploratory Study of Uncertainty Measurement for Large Language Models
IEEE Transactions on Software Engineering (TSE), 2023
Yuheng Huang
Zhijie Wang
Shengming Zhao
Huaming Chen
Felix Juefei-Xu
Lei Ma
16 Jul 2023
Personality Traits in Large Language Models
Gregory Serapio-García
Mustafa Safdari
Clément Crepy
Luning Sun
Stephen Fitz
P. Romero
Marwa Abdulhai
Aleksandra Faust
Maja J. Matarić
LM&MA
LLMAG
01 Jul 2023
Still No Lie Detector for Language Models: Probing Empirical and Conceptual Roadblocks
Philosophical Studies (Philos. Stud.), 2023
B. Levinstein
Daniel A. Herrmann
30 Jun 2023
Inference-Time Intervention: Eliciting Truthful Answers from a Language Model
Neural Information Processing Systems (NeurIPS), 2023
Kenneth Li
Oam Patel
Fernanda Viégas
Hanspeter Pfister
Martin Wattenberg
KELM
HILM
06 Jun 2023
Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Katherine Tian
E. Mitchell
Allan Zhou
Archit Sharma
Rafael Rafailov
Huaxiu Yao
Chelsea Finn
Christopher D. Manning
24 May 2023
TrueTeacher: Learning Factual Consistency Evaluation with Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Zorik Gekhman
Jonathan Herzig
Roee Aharoni
Chen Elkind
Idan Szpektor
HILM
ELM
18 May 2023
Dissecting Recall of Factual Associations in Auto-Regressive Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Mor Geva
Jasmijn Bastings
Katja Filippova
Amir Globerson
KELM
28 Apr 2023
The Internal State of an LLM Knows When It's Lying
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
A. Azaria
Tom Michael Mitchell
HILM
26 Apr 2023