ResearchTrend.AI
Why Does ChatGPT Fall Short in Providing Truthful Answers?

20 April 2023
Shen Zheng
Jie Huang
Kevin Chen-Chuan Chang
    HILM
    AI4MH
arXiv:2304.10513

Papers citing "Why Does ChatGPT Fall Short in Providing Truthful Answers?"

41 / 41 papers shown
GraphEval: A Lightweight Graph-Based LLM Framework for Idea Evaluation
Tao Feng
Yihang Sun
Jiaxuan You
48
0
0
16 Mar 2025
Improving Factuality in Large Language Models via Decoding-Time Hallucinatory and Truthful Comparators
Dingkang Yang
Dongling Xiao
Jinjie Wei
Mingcheng Li
Zhaoyu Chen
Ke Li
L. Zhang
HILM
90
3
0
28 Jan 2025
AI Assistants for Spaceflight Procedures: Combining Generative Pre-Trained Transformer and Retrieval-Augmented Generation on Knowledge Graphs With Augmented Reality Cues
Oliver Bensch
Leonie Bensch
Tommy Nilsson
Florian Saling
Bernd Bewer
Sophie Jentzsch
Tobias Hecking
J. Nathan Kutz
14
1
0
21 Sep 2024
See What LLMs Cannot Answer: A Self-Challenge Framework for Uncovering LLM Weaknesses
Yulong Chen
Yang Liu
Jianhao Yan
X. Bai
Ming Zhong
Yinghao Yang
Ziyi Yang
Chenguang Zhu
Yue Zhang
ALM
ELM
35
5
0
16 Aug 2024
Order Matters in Hallucination: Reasoning Order as Benchmark and Reflexive Prompting for Large-Language-Models
Zikai Xie
HILM
LRM
51
5
0
09 Aug 2024
Improving Faithfulness of Large Language Models in Summarization via Sliding Generation and Self-Consistency
Taiji Li
Zhi Li
Yin Zhang
HILM
17
5
0
31 Jul 2024
How do you know that? Teaching Generative Language Models to Reference Answers to Biomedical Questions
Bojana Bašaragin
Adela Ljajić
Darija Medvecki
Lorenzo Cassano
Milos Kosprdic
Nikola Milosevic
LM&MA
24
2
0
06 Jul 2024
REAL Sampling: Boosting Factuality and Diversity of Open-Ended Generation via Asymptotic Entropy
Haw-Shiuan Chang
Nanyun Peng
Mohit Bansal
Anil Ramakrishna
Tagyoung Chung
HILM
33
2
0
11 Jun 2024
HalluDial: A Large-Scale Benchmark for Automatic Dialogue-Level Hallucination Evaluation
Wen Luo
Tianshu Shen
Wei Li
Guangyue Peng
Richeng Xuan
Houfeng Wang
Xi Yang
HILM
26
10
0
11 Jun 2024
Luna: An Evaluation Foundation Model to Catch Language Model Hallucinations with High Accuracy and Low Cost
Masha Belyi
Robert Friel
Shuai Shao
Atindriyo Sanyal
HILM
RALM
59
5
0
03 Jun 2024
A Survey of Useful LLM Evaluation
Ji-Lun Peng
Sijia Cheng
Egil Diau
Yung-Yu Shih
Po-Heng Chen
Yen-Ting Lin
Yun-Nung Chen
LLMAG
ELM
24
12
0
03 Jun 2024
Mitigating Hallucinations in Large Language Models via Self-Refinement-Enhanced Knowledge Retrieval
Mengjia Niu
Hao Li
Jie Shi
Hamed Haddadi
Fan Mo
HILM
32
10
0
10 May 2024
Optimizing Language Augmentation for Multilingual Large Language Models: A Case Study on Korean
Changsu Choi
Yongbin Jeong
Seoyoon Park
Inho Won
HyeonSeok Lim
...
Yiseul Lee
HyeJin Lee
Younggyun Hahm
Hansaem Kim
Kyungtae Lim
29
10
0
16 Mar 2024
Researchy Questions: A Dataset of Multi-Perspective, Decompositional Questions for LLM Web Agents
Corby Rosset
Ho-Lam Chung
Guanghui Qin
Ethan C. Chau
Zhuo Feng
Ahmed Hassan Awadallah
Jennifer Neville
Nikhil Rao
20
10
0
27 Feb 2024
Leak, Cheat, Repeat: Data Contamination and Evaluation Malpractices in Closed-Source LLMs
Simone Balloccu
Patrícia Schmidtová
Mateusz Lango
Ondrej Dusek
SILM
ELM
PILM
16
152
0
06 Feb 2024
Alignment for Honesty
Yuqing Yang
Ethan Chern
Xipeng Qiu
Graham Neubig
Pengfei Liu
23
27
0
12 Dec 2023
Axiomatic Preference Modeling for Longform Question Answering
Corby Rosset
Guoqing Zheng
Victor C. Dibia
Ahmed Hassan Awadallah
Paul Bennett
SyDa
19
3
0
02 Dec 2023
On the Calibration of Large Language Models and Alignment
Chiwei Zhu
Benfeng Xu
Quan Wang
Yongdong Zhang
Zhendong Mao
61
32
0
22 Nov 2023
SAC3: Reliable Hallucination Detection in Black-Box Language Models via Semantic-aware Cross-check Consistency
Jiaxin Zhang
Zhuohang Li
Kamalika Das
Bradley Malin
Kumar Sricharan
HILM
LRM
19
56
0
03 Nov 2023
Critical Role of Artificially Intelligent Conversational Chatbot
S. A. Mostafa
Md Z. Islam
Mohammad Z. Islam
Fairose Jeehan
Saujanna Jafreen
Raihan U. Islam
AI4MH
21
0
0
31 Oct 2023
Examining the Potential and Pitfalls of ChatGPT in Science and Engineering Problem-Solving
Karen D. Wang
E. Burkholder
Carl E. Wieman
S. Salehi
Nicholas Haber
AI4CE
ELM
27
33
0
12 Oct 2023
Large Language Models can Learn Rules
Zhaocheng Zhu
Yuan Xue
Xinyun Chen
Denny Zhou
Jian Tang
Dale Schuurmans
Hanjun Dai
LRM
ReLM
9
62
0
10 Oct 2023
Chain of Natural Language Inference for Reducing Large Language Model Ungrounded Hallucinations
Deren Lei
Yaxi Li
Mengya Hu
Mingyu Wang
Vincent Yun
Emily Ching
Eslam Kamal
HILM
LRM
24
39
0
06 Oct 2023
Evaluating Hallucinations in Chinese Large Language Models
Qinyuan Cheng
Tianxiang Sun
Wenwei Zhang
Siyin Wang
Xiangyang Liu
...
Junliang He
Mianqiu Huang
Zhangyue Yin
Kai Chen
Xipeng Qiu
HILM
ELM
25
22
0
05 Oct 2023
Dodo: Dynamic Contextual Compression for Decoder-only LMs
Guanghui Qin
Corby Rosset
Ethan C. Chau
Nikhil Rao
Benjamin Van Durme
11
7
0
03 Oct 2023
Large Language Models Cannot Self-Correct Reasoning Yet
Jie Huang
Xinyun Chen
Swaroop Mishra
Huaixiu Steven Zheng
Adams Wei Yu
Xinying Song
Denny Zhou
ReLM
LRM
6
415
0
03 Oct 2023
AutoHall: Automated Hallucination Dataset Generation for Large Language Models
Zouying Cao
Yifei Yang
Hai Zhao
HILM
8
8
0
30 Sep 2023
Quantifying and Attributing the Hallucination of Large Language Models via Association Analysis
Li Du
Yequan Wang
Xingrun Xing
Yiqun Ya
Xiang Li
Xin Jiang
Xuezhi Fang
HILM
15
12
0
11 Sep 2023
Are Emergent Abilities in Large Language Models just In-Context Learning?
Sheng Lu
Irina Bigoulaeva
Rachneet Sachdeva
Harish Tayyar Madabushi
Iryna Gurevych
LRM
ELM
ReLM
49
89
0
04 Sep 2023
Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models
Yue Zhang
Yafu Li
Leyang Cui
Deng Cai
Lemao Liu
...
Longyue Wang
A. Luu
Wei Bi
Freda Shi
Shuming Shi
RALM
LRM
HILM
36
507
0
03 Sep 2023
Leveraging Explainable AI to Analyze Researchers' Aspect-Based Sentiment about ChatGPT
S. Lakhanpal
Ajay Gupta
R. Agrawal
11
0
0
16 Aug 2023
RAVEN: In-Context Learning with Retrieval-Augmented Encoder-Decoder Language Models
Jie Huang
Wei Ping
Peng-Tao Xu
M. Shoeybi
Kevin Chen-Chuan Chang
Bryan Catanzaro
RALM
22
33
0
15 Aug 2023
Through the Lens of Core Competency: Survey on Evaluation of Large Language Models
Ziyu Zhuang
Qiguang Chen
Longxuan Ma
Mingda Li
Yi Han
Yushan Qian
Haopeng Bai
Zixian Feng
Weinan Zhang
Ting Liu
ELM
19
9
0
15 Aug 2023
The Hitchhiker's Guide to Program Analysis: A Journey with Large Language Models
Haonan Li
Yu Hao
Yizhuo Zhai
Zhiyun Qian
LLMAG
17
24
0
01 Aug 2023
Citation: A Key to Building Responsible and Accountable Large Language Models
Jie Huang
Kevin Chen-Chuan Chang
HILM
33
16
0
05 Jul 2023
Emergent autonomous scientific research capabilities of large language models
Daniil A. Boiko
R. MacKnight
Gabe Gomes
ELM
LM&Ro
AI4CE
LLMAG
101
115
0
11 Apr 2023
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck
Varun Chandrasekaran
Ronen Eldan
J. Gehrke
Eric Horvitz
...
Scott M. Lundberg
Harsha Nori
Hamid Palangi
Marco Tulio Ribeiro
Yi Zhang
ELM
AI4MH
AI4CE
ALM
206
2,232
0
22 Mar 2023
Towards Reasoning in Large Language Models: A Survey
Jie Huang
Kevin Chen-Chuan Chang
LM&MA
ELM
LRM
17
576
0
20 Dec 2022
Can Language Models Be Specific? How?
Jie Huang
Kevin Chen-Chuan Chang
Jinjun Xiong
Wen-mei W. Hwu
57
8
0
11 Oct 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022
Language Models as Knowledge Bases?
Fabio Petroni
Tim Rocktäschel
Patrick Lewis
A. Bakhtin
Yuxiang Wu
Alexander H. Miller
Sebastian Riedel
KELM
AI4MH
396
2,576
0
03 Sep 2019