
Why Does ChatGPT Fall Short in Providing Truthful Answers?

20 April 2023

Shen Zheng

Jie Huang

Kevin Chen-Chuan Chang

HILM

AI4MH


Papers citing "Why Does ChatGPT Fall Short in Providing Truthful Answers?"

43 / 43 papers shown

Large Language Models Hallucination: A Comprehensive Survey

Aisha Alansari

Hamzah Luqman

HILM LRM

601

05 Oct 2025

LLM-based Agents Suffer from Hallucinations: A Survey of Taxonomy, Methods, and Directions

...

415

23 Sep 2025

Exploring and Mitigating Fawning Hallucinations in Large Language Models

128

31 Aug 2025

How Does Response Length Affect Long-Form Factuality

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

279

29 May 2025

GraphEval: A Lightweight Graph-Based LLM Framework for Idea Evaluation

International Conference on Learning Representations (ICLR), 2025

Tao Feng

Yihang Sun

Jiaxuan You

540

16 Mar 2025

Improving Factuality in Large Language Models via Decoding-Time Hallucinatory and Truthful Comparators

AAAI Conference on Artificial Intelligence (AAAI), 2024

652

28 Jan 2025

AI Assistants for Spaceflight Procedures: Combining Generative Pre-Trained Transformer and Retrieval-Augmented Generation on Knowledge Graphs With Augmented Reality Cues

Oliver Bensch

158

21 Sep 2024

See What LLMs Cannot Answer: A Self-Challenge Framework for Uncovering LLM Weaknesses

Yulong Chen

Yang Liu

Jianhao Yan

X. Bai

Ming Zhong

Yinghao Yang

Ziyi Yang

Chenguang Zhu

Yue Zhang

ALM ELM

238

16 Aug 2024

Order Matters in Hallucination: Reasoning Order as Benchmark and Reflexive Prompting for Large-Language-Models

Zikai Xie

HILM LRM

620

09 Aug 2024

Improving Faithfulness of Large Language Models in Summarization via Sliding Generation and Self-Consistency

378

31 Jul 2024

How do you know that? Teaching Generative Language Models to Reference Answers to Biomedical Questions

353

06 Jul 2024

REAL Sampling: Boosting Factuality and Diversity of Open-Ended Generation via Asymptotic Entropy

Nanyun Peng

285

11 Jun 2024

HalluDial: A Large-Scale Benchmark for Automatic Dialogue-Level Hallucination Evaluation

Wei Li

353

11 Jun 2024

Towards Detecting LLMs Hallucination via Markov Chain-based Multi-agent Debate Framework

307

05 Jun 2024

Luna: An Evaluation Foundation Model to Catch Language Model Hallucinations with High Accuracy and Low Cost

489

03 Jun 2024

A Survey of Useful LLM Evaluation

Yen-Ting Lin

324

03 Jun 2024

Towards Rationality in Language and Multimodal Agents: A Survey

Yuan Yuan

Weijie J. Su

Camillo J. Taylor

Tanwi Mallick

LLMAG

449

01 Jun 2024

Mitigating Hallucinations in Large Language Models via Self-Refinement-Enhanced Knowledge Retrieval

Hamed Haddadi

235

10 May 2024

Optimizing Language Augmentation for Multilingual Large Language Models: A Case Study on Korean

...

340

16 Mar 2024

Researchy Questions: A Dataset of Multi-Perspective, Decompositional Questions for LLM Web Agents

Ahmed Hassan Awadallah

Jennifer Neville

Nikhil Rao

330

27 Feb 2024

Leak, Cheat, Repeat: Data Contamination and Evaluation Malpractices in Closed-Source LLMs

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2024

565

300

06 Feb 2024

Alignment for Honesty

Neural Information Processing Systems (NeurIPS), 2023

Yuqing Yang

Ethan Chern

Xipeng Qiu

Graham Neubig

Pengfei Liu

318

12 Dec 2023

Axiomatic Preference Modeling for Longform Question Answering

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Corby Rosset

Guoqing Zheng

Victor C. Dibia

Ahmed Hassan Awadallah

Paul Bennett

SyDa

229

02 Dec 2023

On the Calibration of Large Language Models and Alignment

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Benfeng Xu

366

22 Nov 2023

SAC3: Reliable Hallucination Detection in Black-Box Language Models via Semantic-aware Cross-check Consistency

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

422

103

03 Nov 2023

Critical Role of Artificially Intelligent Conversational Chatbot

144

31 Oct 2023

Examining the Potential and Pitfalls of ChatGPT in Science and Engineering Problem-Solving

Frontiers in Education (FIE), 2023

248

12 Oct 2023

Large Language Models can Learn Rules

Jian Tang

369

10 Oct 2023

Chain of Natural Language Inference for Reducing Large Language Model Ungrounded Hallucinations

364

06 Oct 2023

Evaluating Hallucinations in Chinese Large Language Models

Qinyuan Cheng

Tianxiang Sun

Wenwei Zhang

Siyin Wang

Xiangyang Liu

...

Xipeng Qiu

317

05 Oct 2023

Dodo: Dynamic Contextual Compression for Decoder-only LMs

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

264

03 Oct 2023

Large Language Models Cannot Self-Correct Reasoning Yet

International Conference on Learning Representations (ICLR), 2023

742

819

03 Oct 2023

AutoHall: Automated Factuality Hallucination Dataset Generation for Large Language Models

IEEE Transactions on Audio, Speech, and Language Processing (IEEE TASLP), 2023

676

30 Sep 2023

Quantifying and Attributing the Hallucination of Large Language Models via Association Analysis

Xiang Li

Xin Jiang

Xuezhi Fang

HILM

251

11 Sep 2023

Are Emergent Abilities in Large Language Models just In-Context Learning?

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Sheng Lu

Irina Bigoulaeva

Rachneet Sachdeva

Harish Tayyar Madabushi

Iryna Gurevych

LRM ELM ReLM

484

151

04 Sep 2023

Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models

Computational Linguistics (CL), 2023

...

851

953

03 Sep 2023

Leveraging Explainable AI to Analyze Researchers' Aspect-Based Sentiment about ChatGPT

International Conference on Intelligent Human Computer Interaction (IHCI), 2023

S. Lakhanpal

Ajay Gupta

R. Agrawal

264

16 Aug 2023

RAVEN: In-Context Learning with Retrieval-Augmented Encoder-Decoder Language Models

Kevin Chen-Chuan Chang

Bryan Catanzaro

RALM

281

15 Aug 2023

Through the Lens of Core Competency: Survey on Evaluation of Large Language Models

China National Conference on Chinese Computational Linguistics (CNCCL), 2023

221

15 Aug 2023

The Hitchhiker's Guide to Program Analysis: A Journey with Large Language Models

273

01 Aug 2023

Citation: A Key to Building Responsible and Accountable Large Language Models

Jie Huang

Kevin Chen-Chuan Chang

HILM

402

05 Jul 2023

Towards Reasoning in Large Language Models: A Survey

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Jie Huang

Kevin Chen-Chuan Chang

LM&MA ELM LRM

1.3K

872

20 Dec 2022

Can Language Models Be Specific? How?

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Jie Huang

Kevin Chen-Chuan Chang

Jinjun Xiong

Wen-mei W. Hwu

241

11 Oct 2022