2305.14279
Cited By
Two Failures of Self-Consistency in the Multi-Step Reasoning of LLMs
23 May 2023
Angelica Chen
Jason Phang
Alicia Parrish
Vishakh Padmakumar
Chen Zhao
Sam Bowman
Kyunghyun Cho
ReLM
LRM
Papers citing
"Two Failures of Self-Consistency in the Multi-Step Reasoning of LLMs"
22 / 22 papers shown
Title
Consistency in Language Models: Current Landscape, Challenges, and Future Directions
Jekaterina Novikova
Carol Anderson
Borhane Blili-Hamelin
Subhabrata Majumdar
HILM
69
0
0
01 May 2025
Can LLMs Assist Computer Education? an Empirical Case Study of DeepSeek
Dongfu Xiao
Chen Gao
Zhengquan Luo
Chi Liu
Sheng Shen
ELM
59
0
0
01 Apr 2025
BioAgents: Democratizing Bioinformatics Analysis with Multi-Agent Systems
Nikita Mehandru
Amanda K. Hall
Olesya Melnichenko
Yulia Dubinina
Daniel Tsirulnikov
David Bamman
Ahmed Alaa
Scott Saponas
Venkat S. Malladi
36
3
0
10 Jan 2025
Autoformalize Mathematical Statements by Symbolic Equivalence and Semantic Consistency
Zenan Li
Yifan Wu
Zhaoyu Li
Xinming Wei
Xian Zhang
Fan Yang
Xiaoxing Ma
39
3
0
28 Oct 2024
Looking Inward: Language Models Can Learn About Themselves by Introspection
Felix J Binder
James Chua
Tomek Korbak
Henry Sleight
John Hughes
Robert Long
Ethan Perez
Miles Turpin
Owain Evans
KELM
AIFin
LRM
35
12
0
17 Oct 2024
Sensitivity of Generative VLMs to Semantically and Lexically Altered Prompts
Sri Harsha Dumpala
Aman Jaiswal
Chandramouli Shama Sastry
E. Milios
Sageev Oore
Hassan Sajjad
VLM
16
2
0
16 Oct 2024
AutoPenBench: Benchmarking Generative Agents for Penetration Testing
Luca Gioacchini
Marco Mellia
Idilio Drago
Alexander Delsanto
G. Siracusano
Roberto Bifulco
ELM
29
5
0
04 Oct 2024
FLEXTAF: Enhancing Table Reasoning with Flexible Tabular Formats
Xuanliang Zhang
Dingzirui Wang
Longxu Dou
Baoxin Wang
Dayong Wu
Qingfu Zhu
Wanxiang Che
LMTD
ReLM
39
2
0
16 Aug 2024
Examining the Behavior of LLM Architectures Within the Framework of Standardized National Exams in Brazil
Marcelo Sartori Locatelli
Matheus Prado Miranda
Igor Joaquim da Silva Costa
Matheus Torres Prates
Victor Thomé
...
Tomas Lacerda
Adriana Pagano
Eduardo Rios Neto
Wagner Meira Jr.
Virgílio A. F. Almeida
ELM
51
1
0
09 Aug 2024
Decompose and Compare Consistency: Measuring VLMs' Answer Reliability via Task-Decomposition Consistency Comparison
Qian Yang
Weixiang Yan
Aishwarya Agrawal
CoGe
26
4
0
10 Jul 2024
Are LLMs classical or nonmonotonic reasoners? Lessons from generics
Alina Leidinger
R. Rooij
Ekaterina Shutova
LRM
26
3
0
05 Jun 2024
RORA: Robust Free-Text Rationale Evaluation
Zhengping Jiang
Yining Lu
Hanjie Chen
Daniel Khashabi
Benjamin Van Durme
Anqi Liu
43
1
0
28 Feb 2024
Towards an Understanding of Stepwise Inference in Transformers: A Synthetic Graph Navigation Model
Mikail Khona
Maya Okawa
Jan Hula
Rahul Ramesh
Kento Nishi
Robert P. Dick
Ekdeep Singh Lubana
Hidenori Tanaka
38
5
0
12 Feb 2024
DCR-Consistency: Divide-Conquer-Reasoning for Consistency Evaluation and Improvement of Large Language Models
Wendi Cui
Jiaxin Zhang
Zhuohang Li
Lopez Damien
Kamalika Das
Bradley Malin
Kumar Sricharan
17
2
0
04 Jan 2024
Towards Evaluating AI Systems for Moral Status Using Self-Reports
Ethan Perez
Robert Long
ELM
31
8
0
14 Nov 2023
SAC3: Reliable Hallucination Detection in Black-Box Language Models via Semantic-aware Cross-check Consistency
Jiaxin Zhang
Zhuohang Li
Kamalika Das
Bradley Malin
Kumar Sricharan
HILM
LRM
19
56
0
03 Nov 2023
Lyfe Agents: Generative agents for low-cost real-time social interactions
Zhao Kaiya
Michelangelo Naim
J. Kondic
Manuel Cortes
Jiaxin Ge
Shuying Luo
Guangyu Robert Yang
Andrew Ahn
VLM
40
32
0
03 Oct 2023
A Survey on Large Language Model based Autonomous Agents
Lei Wang
Chengbang Ma
Xueyang Feng
Zeyu Zhang
Hao-ran Yang
...
Xu Chen
Yankai Lin
Wayne Xin Zhao
Zhewei Wei
Ji-Rong Wen
LLMAG
AI4CE
LM&Ro
39
1,112
0
22 Aug 2023
ChatDB: Augmenting LLMs with Databases as Their Symbolic Memory
Chenxu Hu
Jie Fu
Chenzhuang Du
Simian Luo
J. Zhao
Hang Zhao
KELM
LLMAG
22
104
0
06 Jun 2023
BeliefBank: Adding Memory to a Pre-Trained Language Model for a Systematic Notion of Belief
Nora Kassner
Oyvind Tafjord
Hinrich Schütze
Peter Clark
KELM
LRM
231
64
0
29 Sep 2021
Measuring and Improving Consistency in Pretrained Language Models
Yanai Elazar
Nora Kassner
Shauli Ravfogel
Abhilasha Ravichander
Eduard H. Hovy
Hinrich Schütze
Yoav Goldberg
HILM
258
346
0
01 Feb 2021
What Makes Good In-Context Examples for GPT-3?
Jiachang Liu
Dinghan Shen
Yizhe Zhang
Bill Dolan
Lawrence Carin
Weizhu Chen
AAML
RALM
275
1,312
0
17 Jan 2021