Policy Guided Tree Search for Enhanced LLM Reasoning

4 February 2025
Yang Li
    LRM

Papers citing "Policy Guided Tree Search for Enhanced LLM Reasoning"

50 / 61 papers shown
Decoupling Understanding from Reasoning via Problem Space Mapping for Small-Scale Model Reasoning
Li Wang
Changhao Zhang
Zengqi Xiu
Kai Lu
Xin Yu
Kui Zhang
Wenjun Wu
LRM
173
0
0
07 Aug 2025
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
Bradley Brown
Jordan Juravsky
Ryan Ehrlich
Ronald Clark
Quoc V. Le
Christopher Ré
Azalia Mirhoseini
ALM, LRM
1.2K
689
0
03 Jan 2025
Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs
Xingyu Chen
Jiahao Xu
Tian Liang
Zhiwei He
Jianhui Pang
...
Zizhuo Zhang
Rui Wang
Zhaopeng Tu
Haitao Mi
Dong Yu
LRM, ReLM
743
463
0
30 Dec 2024
Monte Carlo Tree Search based Space Transfer for Black-box Optimization
Neural Information Processing Systems (NeurIPS), 2024
Shukuan Wang
Ke Xue
Lei Song
Xiaobin Huang
Chao Qian
349
8
0
10 Dec 2024
Interpretable Contrastive Monte Carlo Tree Search Reasoning
Zitian Gao
Boye Niu
Xuzheng He
Haotian Xu
Hongzhang Liu
Aiwei Liu
Xuming Hu
Lijie Wen
LRM
652
65
0
02 Oct 2024
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers
Zhenting Qi
Mingyuan Ma
Jiahang Xu
Li Zhang
Fan Yang
Mao Yang
ReLM, LRM
426
136
0
12 Aug 2024
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
Charlie Snell
Jaehoon Lee
Kelvin Xu
Aviral Kumar
LRM
811
1,621
0
06 Aug 2024
Solving for X and Beyond: Can Large Language Models Solve Complex Math Problems with More-Than-Two Unknowns?
Kuei-Chun Kao
Ruochen Wang
Cho-Jui Hsieh
ELM, LRM
288
7
0
06 Jul 2024
Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning
Chaojie Wang
Yanchen Deng
Zhiyi Lyu
Liang Zeng
Jujie He
Shuicheng Yan
Bo An
LRM, ReLM
393
109
0
20 Jun 2024
Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models
Neural Information Processing Systems (NeurIPS), 2024
Ling Yang
Zhaochen Yu
Tianjun Zhang
Shiyi Cao
Minkai Xu
Wentao Zhang
Joseph E. Gonzalez
Bin Cui
LLMAG, LM&Ro, LRM, KELM
383
94
0
06 Jun 2024
AutoManual: Generating Instruction Manuals by LLM Agents via Interactive Environmental Learning
Minghao Chen
Yihang Li
Yanting Yang
Shiyu Yu
Binbin Lin
Xiaofei He
LLMAG
313
0
0
25 May 2024
Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning
Yuxi Xie
Anirudh Goyal
Wenyue Zheng
Min-Yen Kan
Timothy Lillicrap
Kenji Kawaguchi
Michael Shieh
ReLM, LRM
527
219
0
01 May 2024
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
E. Zelikman
Georges Harik
Yijia Shao
Varuna Jayasiri
Nick Haber
Noah D. Goodman
LLMAG, ReLM, LRM
819
261
0
14 Mar 2024
Can Large Language Models Reason and Plan?
Subbarao Kambhampati
LRM
321
139
0
07 Mar 2024
GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements
Alex Havrilla
Sharath Raparthy
Christoforus Nalmpantis
Jane Dwivedi-Yu
Maksym Zhuravinskyi
Eric Hambro
Roberta Raileanu
ReLM, LRM
341
104
0
13 Feb 2024
On the Self-Verification Limitations of Large Language Models on Reasoning and Planning Tasks
Kaya Stechly
Karthik Valmeekam
Subbarao Kambhampati
ReLM, LRM
265
121
0
12 Feb 2024
Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Peiyi Wang
Lei Li
Zhihong Shao
R. X. Xu
Damai Dai
Yifei Li
Deli Chen
Y. Wu
Zhifang Sui
AIMat, LRM, ALM
592
798
0
14 Dec 2023
Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models
Avi Singh
John D. Co-Reyes
Rishabh Agarwal
Ankesh Anand
Piyush Patil
...
Yamini Bansal
Ethan Dyer
Behnam Neyshabur
Jascha Narain Sohl-Dickstein
Noah Fiedel
ALM, LRM, ReLM, SyDa
687
276
0
11 Dec 2023
Chain of Code: Reasoning with a Language Model-Augmented Code Emulator
Chengshu Li
Jacky Liang
Andy Zeng
Xinyun Chen
Karol Hausman
Dorsa Sadigh
Sergey Levine
Fei-Fei Li
Fei Xia
Brian Ichter
LLMAG, LRM
380
149
0
07 Dec 2023
GPQA: A Graduate-Level Google-Proof Q&A Benchmark
David Rein
Betty Li Hou
Asa Cooper Stickland
Jackson Petty
Richard Yuanzhe Pang
Julien Dirani
Julian Michael
Samuel R. Bowman
AI4MH, ELM
589
2,282
0
20 Nov 2023
A Closer Look at the Self-Verification Abilities of Large Language Models in Logical Reasoning
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Ruixin Hong
Hongming Zhang
Xinyu Pang
Dong Yu
Changshui Zhang
LRM
277
46
0
14 Nov 2023
Mistral 7B
Albert Q. Jiang
Alexandre Sablayrolles
A. Mensch
Chris Bamford
Devendra Singh Chaplot
...
Teven Le Scao
Thibaut Lavril
Thomas Wang
Timothée Lacroix
William El Sayed
MoE, LRM
523
3,278
0
10 Oct 2023
Large Language Models Cannot Self-Correct Reasoning Yet
International Conference on Learning Representations (ICLR), 2023
Jie Huang
Xinyun Chen
Swaroop Mishra
Huaixiu Steven Zheng
Adams Wei Yu
Xinying Song
Denny Zhou
ReLM, LRM
742
819
0
03 Oct 2023
Alphazero-like Tree-Search can Guide Large Language Model Decoding and Training
International Conference on Machine Learning (ICML), 2023
Xidong Feng
Bo Liu
Muning Wen
Alexander Shmakov
Ying Wen
Weinan Zhang
Jun Wang
LRM, AI4CE
410
321
0
29 Sep 2023
Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models
International Conference on Machine Learning (ICML), 2023
Bilgehan Sel
Ahmad S. Al-Tawaha
Vanshaj Khattar
R. Jia
Ming Jin
LM&Ro, LRM
486
105
0
20 Aug 2023
Graph of Thoughts: Solving Elaborate Problems with Large Language Models
AAAI Conference on Artificial Intelligence (AAAI), 2023
Maciej Besta
Nils Blach
Aleš Kubíček
Robert Gerstenberger
Michal Podstawski
...
Joanna Gajda
Tomasz Lehmann
H. Niewiadomski
Piotr Nyczyk
Torsten Hoefler
LRM, AI4CE, LM&Ro
707
1,240
0
18 Aug 2023
Scaling Relationship on Learning Mathematical Reasoning with Large Language Models
Zheng Yuan
Hongyi Yuan
Cheng Li
Guanting Dong
Keming Lu
Chuanqi Tan
Chang Zhou
Jingren Zhou
LRM, ALM
428
311
0
03 Aug 2023
Skeleton-of-Thought: Prompting LLMs for Efficient Parallel Generation
Xuefei Ning
Zinan Lin
Zixuan Zhou
Zifu Wang
Huazhong Yang
Yu Wang
ReLM, LRM
343
101
0
28 Jul 2023
Let's Verify Step by Step
International Conference on Learning Representations (ICLR), 2023
Hunter Lightman
V. Kosaraju
Yura Burda
Harrison Edwards
Bowen Baker
Teddy Lee
Jan Leike
John Schulman
Ilya Sutskever
K. Cobbe
ALM, OffRL, LRM
1.8K
2,869
0
31 May 2023
Reasoning with Language Model is Planning with World Model
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Shibo Hao
Yi Gu
Haodi Ma
Joshua Jiahua Hong
Zhen Wang
D. Wang
Zhiting Hu
ReLM, LRM, LLMAG
618
937
0
24 May 2023
GRACE: Discriminator-Guided Chain-of-Thought Reasoning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Muhammad Khalifa
Lajanugen Logeswaran
Moontae Lee
Ho Hin Lee
Lu Wang
LRM
461
52
0
24 May 2023
Tree of Thoughts: Deliberate Problem Solving with Large Language Models
Neural Information Processing Systems (NeurIPS), 2023
Shunyu Yao
Dian Yu
Jeffrey Zhao
Izhak Shafran
Thomas Griffiths
Yuan Cao
Karthik Narasimhan
LM&Ro, LRM, AI4CE
764
3,713
0
17 May 2023
Verify-and-Edit: A Knowledge-Enhanced Chain-of-Thought Framework
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Ruochen Zhao
Xingxuan Li
Shafiq Joty
Chengwei Qin
Lidong Bing
LRM, KELM
386
208
0
05 May 2023
Self-Evaluation Guided Beam Search for Reasoning
Neural Information Processing Systems (NeurIPS), 2023
Yuxi Xie
Kenji Kawaguchi
Yiran Zhao
Xu Zhao
Min-Yen Kan
Junxian He
Qizhe Xie
LRM
660
266
0
01 May 2023
GPT-4 Technical Report
OpenAI
Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAG, MLLM
5.3K
23,506
0
15 Mar 2023
Towards Reasoning in Large Language Models: A Survey
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Jie Huang
Kevin Chen-Chuan Chang
LM&MA, ELM, LRM
1.3K
872
0
20 Dec 2022
Solving math word problems with process- and outcome-based feedback
J. Uesato
Nate Kushman
Ramana Kumar
Francis Song
Noah Y. Siegel
L. Wang
Antonia Creswell
G. Irving
I. Higgins
FaML, ReLM, AIMat, LRM
428
640
0
25 Nov 2022
Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks
Wenhu Chen
Xueguang Ma
Xinyi Wang
William W. Cohen
ReLM, ReCod, LRM
1.7K
1,249
0
22 Nov 2022
Monte Carlo Tree Descent for Black-Box Optimization
Neural Information Processing Systems (NeurIPS), 2022
Yaoguang Zhai
Sicun Gao
127
5
0
01 Nov 2022
Automatic Chain of Thought Prompting in Large Language Models
International Conference on Learning Representations (ICLR), 2022
Zhuosheng Zhang
Aston Zhang
Mu Li
Alexander J. Smola
ReLM, LRM
660
932
0
07 Oct 2022
Language Models Are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought
International Conference on Learning Representations (ICLR), 2022
Abulhair Saparov
He He
ELM, LRM, ReLM
1.1K
459
0
03 Oct 2022
Recipe for a General, Powerful, Scalable Graph Transformer
Neural Information Processing Systems (NeurIPS), 2022
Ladislav Rampášek
Mikhail Galkin
Vijay Prakash Dwivedi
Anh Tuan Luu
Guy Wolf
Dominique Beaini
704
931
0
25 May 2022
Large Language Models are Zero-Shot Reasoners
Neural Information Processing Systems (NeurIPS), 2022
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM, LRM
1.7K
6,849
0
24 May 2022
Least-to-Most Prompting Enables Complex Reasoning in Large Language Models
International Conference on Learning Representations (ICLR), 2022
Denny Zhou
Nathanael Scharli
Le Hou
Jason W. Wei
Nathan Scales
...
Dale Schuurmans
Claire Cui
Olivier Bousquet
Quoc Le
Ed H. Chi
RALM, LRM, AI4CE
927
1,636
0
21 May 2022
Self-Consistency Improves Chain of Thought Reasoning in Language Models
International Conference on Learning Representations (ICLR), 2022
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM, BDL, LRM, AI4CE
3.7K
6,303
0
21 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Neural Information Processing Systems (NeurIPS), 2022
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro, LRM, AI4CE, ReLM
2.8K
17,183
0
28 Jan 2022
Representing Long-Range Context for Graph Neural Networks with Global Attention
Neural Information Processing Systems (NeurIPS), 2022
Zhanghao Wu
Paras Jain
Matthew A. Wright
Azalia Mirhoseini
Joseph E. Gonzalez
Ion Stoica
GNN
384
400
0
21 Jan 2022
Understanding over-squashing and bottlenecks on graphs via curvature
International Conference on Learning Representations (ICLR), 2021
Jake Topping
Francesco Di Giovanni
B. Chamberlain
Xiaowen Dong
M. Bronstein
821
617
0
29 Nov 2021
Training Verifiers to Solve Math Word Problems
K. Cobbe
V. Kosaraju
Mohammad Bavarian
Mark Chen
Heewoo Jun
...
Jerry Tworek
Jacob Hilton
Reiichiro Nakano
Christopher Hesse
John Schulman
ReLM, OffRL, LRM
1.6K
8,043
0
27 Oct 2021
Graph Neural Networks with Learnable Structural and Positional Representations
Vijay Prakash Dwivedi
Anh Tuan Luu
T. Laurent
Yoshua Bengio
Xavier Bresson
GNN
769
454
0
15 Oct 2021