Policy Guided Tree Search for Enhanced LLM Reasoning

4 February 2025

Yang Li

LRM

Papers citing "Policy Guided Tree Search for Enhanced LLM Reasoning"

50 / 61 papers shown

Decoupling Understanding from Reasoning via Problem Space Mapping for Small-Scale Model Reasoning

173

07 Aug 2025

Large Language Monkeys: Scaling Inference Compute with Repeated Sampling

1.2K

689

03 Jan 2025

Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs

...

743

463

30 Dec 2024

Monte Carlo Tree Search based Space Transfer for Black-box Optimization

Neural Information Processing Systems (NeurIPS), 2024

349

10 Dec 2024

Interpretable Contrastive Monte Carlo Tree Search Reasoning

Aiwei Liu

Xuming Hu

Lijie Wen

LRM

652

02 Oct 2024

Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers

426

136

12 Aug 2024

Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

811

1,621

06 Aug 2024

Solving for X and Beyond: Can Large Language Models Solve Complex Math Problems with More-Than-Two Unknowns?

288

06 Jul 2024

Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning

Chaojie Wang

Yanchen Deng

Zhiyi Lyu

Liang Zeng

Jujie He

Shuicheng Yan

Bo An

LRM ReLM

393

109

20 Jun 2024

Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models

Neural Information Processing Systems (NeurIPS), 2024

Ling Yang

Joseph E. Gonzalez

Bin Cui

LLMAG LM&Ro LRM KELM

383

06 Jun 2024

AutoManual: Generating Instruction Manuals by LLM Agents via Interactive Environmental Learning

313

25 May 2024

Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning

527

219

01 May 2024

Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking

819

261

14 Mar 2024

Can Large Language Models Reason and Plan?

Subbarao Kambhampati

LRM

321

139

07 Mar 2024

GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements

Alex Havrilla

Sharath Raparthy

Christoforus Nalmpantis

341

104

13 Feb 2024

On the Self-Verification Limitations of Large Language Models on Reasoning and Planning Tasks

Kaya Stechly

Subbarao Kambhampati

ReLM LRM

265

121

12 Feb 2024

Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Peiyi Wang

Lei Li

Zhihong Shao

R. X. Xu

Zhifang Sui

592

798

14 Dec 2023

Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

...

Jascha Narain Sohl-Dickstein

Noah Fiedel

ALM LRM ReLM SyDa

687

276

11 Dec 2023

Chain of Code: Reasoning with a Language Model-Augmented Code Emulator

Dorsa Sadigh

380

149

07 Dec 2023

GPQA: A Graduate-Level Google-Proof Q&A Benchmark

589

2,282

20 Nov 2023

A Closer Look at the Self-Verification Abilities of Large Language Models in Logical Reasoning

North American Chapter of the Association for Computational Linguistics (NAACL), 2023

Dong Yu

277

14 Nov 2023

Mistral 7B

Albert Q. Jiang

Alexandre Sablayrolles

A. Mensch

Chris Bamford

Devendra Singh Chaplot

...

523

3,278

10 Oct 2023

Large Language Models Cannot Self-Correct Reasoning Yet

International Conference on Learning Representations (ICLR), 2023

742

819

03 Oct 2023

Alphazero-like Tree-Search can Guide Large Language Model Decoding and Training

International Conference on Machine Learning (ICML), 2023

Muning Wen

410

321

29 Sep 2023

Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models

International Conference on Machine Learning (ICML), 2023

Ming Jin

486

105

20 Aug 2023

Graph of Thoughts: Solving Elaborate Problems with Large Language Models

AAAI Conference on Artificial Intelligence (AAAI), 2023

...

707

1,240

18 Aug 2023

Scaling Relationship on Learning Mathematical Reasoning with Large Language Models

Zheng Yuan

Hongyi Yuan

Cheng Li

Guanting Dong

Keming Lu

Chuanqi Tan

Chang Zhou

Jingren Zhou

LRM ALM

428

311

03 Aug 2023

Skeleton-of-Thought: Prompting LLMs for Efficient Parallel Generation

343

101

28 Jul 2023

Let's Verify Step by Step

International Conference on Learning Representations (ICLR), 2023

1.8K

2,869

31 May 2023

Reasoning with Language Model is Planning with World Model

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

618

937

24 May 2023

GRACE: Discriminator-Guided Chain-of-Thought Reasoning

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

461

24 May 2023

Tree of Thoughts: Deliberate Problem Solving with Large Language Models

Neural Information Processing Systems (NeurIPS), 2023

Dian Yu

764

3,713

17 May 2023

Verify-and-Edit: A Knowledge-Enhanced Chain-of-Thought Framework

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

386

208

05 May 2023

Self-Evaluation Guided Beam Search for Reasoning

Neural Information Processing Systems (NeurIPS), 2023

660

266

01 May 2023

GPT-4 Technical Report

...

5.3K

23,506

15 Mar 2023

Towards Reasoning in Large Language Models: A Survey

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Jie Huang

Kevin Chen-Chuan Chang

LM&MA ELM LRM

1.3K

872

20 Dec 2022

Solving math word problems with process- and outcome-based feedback

428

640

25 Nov 2022

Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks

1.7K

1,249

22 Nov 2022

Monte Carlo Tree Descent for Black-Box Optimization

Neural Information Processing Systems (NeurIPS), 2022

Yaoguang Zhai

Sicun Gao

127

01 Nov 2022

Automatic Chain of Thought Prompting in Large Language Models

International Conference on Learning Representations (ICLR), 2022

660

932

07 Oct 2022

Language Models Are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought

International Conference on Learning Representations (ICLR), 2022

Abulhair Saparov

He He

ELM LRM ReLM

1.1K

459

03 Oct 2022

Recipe for a General, Powerful, Scalable Graph Transformer

Neural Information Processing Systems (NeurIPS), 2022

Ladislav Rampášek

Mikhail Galkin

Vijay Prakash Dwivedi

Anh Tuan Luu

Guy Wolf

Dominique Beaini

704

931

25 May 2022

Large Language Models are Zero-Shot Reasoners

Neural Information Processing Systems (NeurIPS), 2022

1.7K

6,849

24 May 2022

Least-to-Most Prompting Enables Complex Reasoning in Large Language Models

International Conference on Learning Representations (ICLR), 2022

...

927

1,636

21 May 2022

Self-Consistency Improves Chain of Thought Reasoning in Language Models

International Conference on Learning Representations (ICLR), 2022

3.7K

6,303

21 Mar 2022

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

Neural Information Processing Systems (NeurIPS), 2022

2.8K

17,183

28 Jan 2022

Representing Long-Range Context for Graph Neural Networks with Global Attention

Neural Information Processing Systems (NeurIPS), 2022

384

400

21 Jan 2022

Understanding over-squashing and bottlenecks on graphs via curvature

International Conference on Learning Representations (ICLR), 2021

Jake Topping

Francesco Di Giovanni

B. Chamberlain

Xiaowen Dong

M. Bronstein

821

617

29 Nov 2021

Training Verifiers to Solve Math Word Problems

...

1.6K

8,043

27 Oct 2021

Graph Neural Networks with Learnable Structural and Positional Representations

Vijay Prakash Dwivedi

769

454

15 Oct 2021