ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2412.14135
  4. Cited By
Scaling of Search and Learning: A Roadmap to Reproduce o1 from
  Reinforcement Learning Perspective

Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective

18 December 2024
Zhiyuan Zeng
Qinyuan Cheng
Zhangyue Yin
Bo Wang
Shimin Li
Yunhua Zhou
Qipeng Guo
Xuanjing Huang
Xipeng Qiu
    ELM
    AI4TS
    LRM
ArXivPDFHTML

Papers citing "Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective"

6 / 6 papers shown
Title
Knowledge Augmented Complex Problem Solving with Large Language Models: A Survey
Knowledge Augmented Complex Problem Solving with Large Language Models: A Survey
Da Zheng
Lun Du
Junwei Su
Yuchen Tian
Yuqi Zhu
Jintian Zhang
Lanning Wei
Ningyu Zhang
H. Chen
LRM
41
0
0
06 May 2025
MARFT: Multi-Agent Reinforcement Fine-Tuning
MARFT: Multi-Agent Reinforcement Fine-Tuning
Junwei Liao
Muning Wen
J. Wang
W. Zhang
OffRL
21
0
0
21 Apr 2025
InftyThink: Breaking the Length Limits of Long-Context Reasoning in Large Language Models
Yuchen Yan
Yongliang Shen
Y. Liu
Jin Jiang
M. Zhang
Jian Shao
Yueting Zhuang
LRM
ReLM
53
3
0
09 Mar 2025
An Empirical Study on Eliciting and Improving R1-like Reasoning Models
Z. Chen
Yingqian Min
Beichen Zhang
Jie Chen
Jinhao Jiang
...
Xu Miao
Y. Lu
Lei Fang
Zhongyuan Wang
Ji-Rong Wen
ReLM
OffRL
LRM
73
14
0
06 Mar 2025
Iterative Deepening Sampling for Large Language Models
Iterative Deepening Sampling for Large Language Models
Weizhe Chen
Sven Koenig
B. Dilkina
LRM
ReLM
86
0
0
08 Feb 2025
Boosting Multimodal Reasoning with MCTS-Automated Structured Thinking
Boosting Multimodal Reasoning with MCTS-Automated Structured Thinking
Jinyang Wu
Mingkuan Feng
Shuai Zhang
Ruihan Jin
Feihu Che
Zengqi Wen
J. Tao
LRM
55
7
0
04 Feb 2025
1