2303.06273
Cited By
Consistency Analysis of ChatGPT
11 March 2023
Myeongjun Jang
Thomas Lukasiewicz
Papers citing "Consistency Analysis of ChatGPT"
28 / 28 papers shown
Title
Consistency in Language Models: Current Landscape, Challenges, and Future Directions
Jekaterina Novikova
Carol Anderson
Borhane Blili-Hamelin
Subhabrata Majumdar
HILM
69
0
0
01 May 2025
Prompt-Reverse Inconsistency: LLM Self-Inconsistency Beyond Generative Randomness and Prompt Paraphrasing
Jihyun Janice Ahn
Wenpeng Yin
SILM
LRM
58
1
0
02 Apr 2025
Large Language Models Often Say One Thing and Do Another
Ruoxi Xu
Hongyu Lin
Xianpei Han
Jia Zheng
Weixiang Zhou
Le Sun
Yingfei Sun
42
1
0
10 Mar 2025
Automated Consistency Analysis of LLMs
Aditya Patwardhan
Vivek Vaidya
Ashish Kundu
50
0
0
10 Feb 2025
Navigating the Unknown: A Chat-Based Collaborative Interface for Personalized Exploratory Tasks
Yingzhe Peng
Xiaoting Qin
Zhiyang Zhang
Jue Zhang
Qingwei Lin
Xu Yang
Dongmei Zhang
Saravan Rajmohan
Qi Zhang
36
3
0
31 Oct 2024
Evaluating Consistencies in LLM responses through a Semantic Clustering of Question Answering
Yanggyu Lee
Jihie Kim
23
1
0
20 Oct 2024
A Prompt Engineering Approach and a Knowledge Graph based Framework for Tackling Legal Implications of Large Language Model Answers
George Hannah
Rita T. Sousa
Ioannis Dasoulas
Claudia d'Amato
AILaw
ELM
39
0
0
19 Oct 2024
MM-R^3: On (In-)Consistency of Multi-modal Large Language Models (MLLMs)
Shih-Han Chou
Shivam Chandhok
James J. Little
Leonid Sigal
35
0
0
07 Oct 2024
Aligning with Logic: Measuring, Evaluating and Improving Logical Preference Consistency in Large Language Models
Yinhong Liu
Zhijiang Guo
Tianya Liang
Ehsan Shareghi
Ivan Vulić
Nigel Collier
79
0
0
03 Oct 2024
Logically Consistent Language Models via Neuro-Symbolic Integration
Diego Calanzone
Stefano Teso
Antonio Vergari
LRM
71
6
0
09 Sep 2024
A Survey on the Real Power of ChatGPT
Ming Liu
Ran Liu
Ye Zhu
Hua Wang
Youyang Qu
Rongsheng Li
Yongpan Sheng
Wray L. Buntine
36
2
0
22 Apr 2024
Scope Ambiguities in Large Language Models
Gaurav Kamath
Sebastian Schuster
Sowmya Vajjala
Siva Reddy
27
2
0
05 Apr 2024
Reasoning Runtime Behavior of a Program with LLM: How Far Are We?
Junkai Chen
Zhiyuan Pan
Xing Hu
Zhenhao Li
Ge Li
Xin Xia
LRM
32
20
0
25 Mar 2024
Better to Ask in English: Cross-Lingual Evaluation of Large Language Models for Healthcare Queries
Yiqiao Jin
Mohit Chandra
Gaurav Verma
Yibo Hu
Munmun De Choudhury
Srijan Kumar
LM&MA
ELM
87
65
0
19 Oct 2023
Framework-Based Qualitative Analysis of Free Responses of Large Language Models: Algorithmic Fidelity
A. Amirova
T. Fteropoulli
Nafiso Ahmed
Martin R. Cowie
Joel Z. Leibo
13
5
0
06 Sep 2023
Do You Trust ChatGPT? -- Perceived Credibility of Human and AI-Generated Content
Martin Huschens
Martin Briesch
Dominik Sobania
Franz Rothlauf
11
12
0
05 Sep 2023
Look Before You Leap: An Exploratory Study of Uncertainty Measurement for Large Language Models
Yuheng Huang
Jiayang Song
Zhijie Wang
Shengming Zhao
Huaming Chen
Felix Juefei-Xu
Lei Ma
28
34
0
16 Jul 2023
Evaluating Superhuman Models with Consistency Checks
Lukas Fluri
Daniel Paleka
Florian Tramèr
ELM
31
41
0
16 Jun 2023
Language models are not naysayers: An analysis of language models on negation benchmarks
Thinh Hung Truong
Timothy Baldwin
Karin Verspoor
Trevor Cohn
22
54
0
14 Jun 2023
Utilizing ChatGPT to Enhance Clinical Trial Enrollment
Georgios Peikos
S. Symeonidis
Pranav Kasela
G. Pasi
LM&MA
14
12
0
03 Jun 2023
A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets
Md Tahmid Rahman Laskar
M Saiful Bari
Mizanur Rahman
Md Amran Hossen Bhuiyan
Shafiq R. Joty
J. Huang
LM&MA
ELM
ALM
36
178
0
29 May 2023
Separating form and meaning: Using self-consistency to quantify task understanding across multiple senses
Xenia Ohmer
Elia Bruni
Dieuwke Hupkes
LRM
10
13
0
19 May 2023
A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation
Xiaowei Huang
Wenjie Ruan
Wei Huang
Gao Jin
Yizhen Dong
...
Sihao Wu
Peipei Xu
Dengyu Wu
André Freitas
Mustafa A. Mustafa
ALM
27
81
0
19 May 2023
In ChatGPT We Trust? Measuring and Characterizing the Reliability of ChatGPT
Xinyue Shen
Z. Chen
Michael Backes
Yang Zhang
19
54
0
18 Apr 2023
ChatGPT Beyond English: Towards a Comprehensive Evaluation of Large Language Models in Multilingual Learning
Viet Dac Lai
Nghia Trung Ngo
Amir Pouran Ben Veyseh
Hieu Man
Franck Dernoncourt
Trung Bui
Thien Huu Nguyen
ELM
LM&MA
25
267
0
12 Apr 2023
Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations
Jaehun Jung
Lianhui Qin
Sean Welleck
Faeze Brahman
Chandra Bhagavatula
Ronan Le Bras
Yejin Choi
ReLM
LRM
218
189
0
24 May 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
303
11,881
0
04 Mar 2022
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
280
3,835
0
18 Apr 2021