
Chain of Thoughtlessness? An Analysis of CoT in Planning

8 May 2024


Papers citing "Chain of Thoughtlessness? An Analysis of CoT in Planning"

50 of 102 citing papers shown

LLMs Still Can't Plan; Can LRMs? A Preliminary Evaluation of OpenAI's o1 on PlanBench

Kaya Stechly

Subbarao Kambhampati

LLMAG LRM ELM

405

20 Sep 2024

To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning

International Conference on Learning Representations (ICLR), 2024

637

232

18 Sep 2024

EVINCE: Optimizing Multi-LLM Dialogues Using Conditional Statistics and Information Theory

Edward Y. Chang

AAML

136

26 Aug 2024

Algorithmic Language Models with Neurally Compiled Libraries

Lucas Saldyt

Subbarao Kambhampati

LRM

323

06 Jul 2024

Deciphering the Factors Influencing the Efficacy of Chain-of-Thought: Probability, Memorization, and Noisy Reasoning

265

01 Jul 2024

Cognitive Map for Language Models: Optimal Planning via Verbally Representing the World Model

Minjoon Seo

317

21 Jun 2024

Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning

Chaojie Wang

Yanchen Deng

Zhiyi Lyu

Liang Zeng

Jujie He

Shuicheng Yan

Bo An

LRM ReLM

341

20 Jun 2024

Exploring and Benchmarking the Planning Capabilities of Large Language Models

Hanjun Dai

Dale Schuurmans

Noah Fiedel

Hanie Sedghi

187

18 Jun 2024

Robust Planning with LLM-Modulo Framework: Case Study in Travel Planning

182

31 May 2024

SELF-[IN]CORRECT: LLMs Struggle with Refining Self-Generated Responses

AAAI Conference on Artificial Intelligence (AAAI), 2024

Dongwei Jiang

Jingyu Zhang

Orion Weller

Nathaniel Weir

Benjamin Van Durme

Daniel Khashabi

225

04 Apr 2024

Multi-Conditional Ranking with Large Language Models

Pouya Pezeshkpour

Estevam R. Hruschka

LRM

179

30 Mar 2024

ChainLM: Empowering Large Language Models with Improved Chain-of-Thought Prompting

163

21 Mar 2024

Benchmarking GPT-4 on Algorithmic Problems: A Systematic Evaluation of Prompting Strategies

313

27 Feb 2024

How Interpretable are Reasoning Explanations from Prompting Large Language Models?

328

19 Feb 2024

On the Self-Verification Limitations of Large Language Models on Reasoning and Planning Tasks

Kaya Stechly

Subbarao Kambhampati

ReLM LRM

178

12 Feb 2024

Efficient Tool Use with Chain-of-Abstraction Reasoning

356

30 Jan 2024

Demystifying Chains, Trees, and Graphs of Thoughts

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024

...

1.0K

25 Jan 2024

A Closer Look at the Self-Verification Abilities of Large Language Models in Logical Reasoning

North American Chapter of the Association for Computational Linguistics (NAACL), 2023

Dong Yu

225

14 Nov 2023

KITAB: Evaluating LLMs on Constraint Satisfaction for Information Retrieval

International Conference on Learning Representations (ICLR), 2023

Mert Yuksekgonul

177

24 Oct 2023

Large Language Models Cannot Self-Correct Reasoning Yet

International Conference on Learning Representations (ICLR), 2023

519

696

03 Oct 2023

Invalid Logic, Equivalent Gains: The Bizarreness of Reasoning in Language Model Prompting

125

20 Jul 2023

Boosting Language Models Reasoning with Chain-of-Knowledge Prompting

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Xiang Li

311

105

10 Jun 2023

Deductive Verification of Chain-of-Thought Reasoning

Neural Information Processing Systems (NeurIPS), 2023

496

194

06 Jun 2023

Faith and Fate: Limits of Transformers on Compositionality

Neural Information Processing Systems (NeurIPS), 2023

Xiang Lorraine Li

...

Xiang Ren

Yejin Choi

519

497

29 May 2023

On the Planning Abilities of Large Language Models : A Critical Investigation

Neural Information Processing Systems (NeurIPS), 2023

270

340

25 May 2023

Towards Revealing the Mystery behind Chain of Thought: A Theoretical Perspective

Neural Information Processing Systems (NeurIPS), 2023

649

354

24 May 2023

Improving Factuality and Reasoning in Language Models through Multiagent Debate

International Conference on Machine Learning (ICML), 2023

Yilun Du

Shuang Li

Antonio Torralba

J. Tenenbaum

Igor Mordatch

LLMAG LRM

351

1,182

23 May 2023

CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing

International Conference on Learning Representations (ICLR), 2023

Zhihong Shao

Yujiu Yang

392

584

19 May 2023

Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting

Neural Information Processing Systems (NeurIPS), 2023

533

725

07 May 2023

GPT-4 Technical Report

...

4.6K

20,902

15 Mar 2023

Faithful Chain-of-Thought Reasoning

International Joint Conference on Natural Language Processing (IJCNLP), 2023

Marianna Apidianaki

456

317

31 Jan 2023

Reasoning with Language Model Prompting: A Survey

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Ningyu Zhang

Shumin Deng

Chuanqi Tan

Fei Huang

Huajun Chen

ReLM ELM LRM

707

395

19 Dec 2022

Teaching Algorithmic Reasoning via In-context Learning

254

130

15 Nov 2022

Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

...

518

1,558

17 Oct 2022

Automatic Chain of Thought Prompting in Large Language Models

International Conference on Learning Representations (ICLR), 2022

496

852

07 Oct 2022

Language Models are Multilingual Chain-of-Thought Reasoners

International Conference on Learning Representations (ICLR), 2022

...

587

492

06 Oct 2022

ReAct: Synergizing Reasoning and Acting in Language Models

International Conference on Learning Representations (ICLR), 2022

Dian Yu

2.5K

5,256

06 Oct 2022

Language Models Are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought

International Conference on Learning Representations (ICLR), 2022

Abulhair Saparov

He He

ELM LRM ReLM

850

422

03 Oct 2022

Faithful Reasoning Using Large Language Models

Antonia Creswell

Murray Shanahan

ReLM LRM

193

139

30 Aug 2022

Limitations of Language Models in Arithmetic and Symbolic Induction

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

318

09 Aug 2022

Exploring Length Generalization in Large Language Models

Neural Information Processing Systems (NeurIPS), 2022

348

211

11 Jul 2022

PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change

Neural Information Processing Systems (NeurIPS), 2022

343

329

21 Jun 2022

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

...

697

2,150

09 Jun 2022

Large Language Models are Zero-Shot Reasoners

Neural Information Processing Systems (NeurIPS), 2022

1.4K

6,087

24 May 2022

Least-to-Most Prompting Enables Complex Reasoning in Large Language Models

International Conference on Learning Representations (ICLR), 2022

...

659

1,483

21 May 2022

Self-Consistency Improves Chain of Thought Reasoning in Language Models

International Conference on Learning Representations (ICLR), 2022

2.7K

5,537

21 Mar 2022

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

Neural Information Processing Systems (NeurIPS), 2022

2.3K

14,449

28 Jan 2022

Show Your Work: Scratchpads for Intermediate Computation with Language Models

Henryk Michalewski

...

544

920

30 Nov 2021

Training Verifiers to Solve Math Word Problems

...

1.1K

6,810

27 Oct 2021

A Diverse Corpus for Evaluating and Developing English Math Word Problem Solvers

Annual Meeting of the Association for Computational Linguistics (ACL), 2020

Shen-Yun Miao

Chao-Chun Liang

Keh-Yih Su

275

418

30 Jun 2021