ResearchTrend.AI
arXiv: 2405.03553
AlphaMath Almost Zero: Process Supervision without Process

6 May 2024
Guoxin Chen
Minpeng Liao
Chengxi Li
Kai Fan
    AIMat
    LRM

Papers citing "AlphaMath Almost Zero: Process Supervision without Process"

18 / 18 papers shown
Accelerating Large Language Model Reasoning via Speculative Search
Zhihai Wang
Jie Wang
Jilai Pan
Xilin Xia
Huiling Zhen
M. Yuan
Jianye Hao
Feng Wu
ReLM
LRM
54
0
0
03 May 2025
Weight Ensembling Improves Reasoning in Language Models
Xingyu Dang
Christina Baek
Kaiyue Wen
Zico Kolter
Aditi Raghunathan
MoMe
LRM
60
1
0
14 Apr 2025
InftyThink: Breaking the Length Limits of Long-Context Reasoning in Large Language Models
Yuchen Yan
Yongliang Shen
Y. Liu
Jin Jiang
M. Zhang
Jian Shao
Yueting Zhuang
LRM
ReLM
53
3
0
09 Mar 2025
Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?
Yancheng He
Shilong Li
J. Liu
Weixun Wang
Xingyuan Bu
...
Zhongyuan Peng
Z. Zhang
Zhicheng Zheng
Wenbo Su
Bo Zheng
ELM
LRM
65
6
0
26 Feb 2025
Bag of Tricks for Inference-time Computation of LLM Reasoning
Fan Liu
Wenshuo Chao
Naiqiang Tan
Hao Liu
OffRL
LRM
69
3
0
11 Feb 2025
Iterative Deepening Sampling for Large Language Models
Weizhe Chen
Sven Koenig
B. Dilkina
LRM
ReLM
86
0
0
08 Feb 2025
COS(M+O)S: Curiosity and RL-Enhanced MCTS for Exploring Story Space via Language Models
Tobias Materzok
LRM
65
0
0
28 Jan 2025
Domaino1s: Guiding LLM Reasoning for Explainable Answers in High-Stakes Domains
Xu Chu
Zhijie Tan
Hanlin Xue
Guanyu Wang
Tong Mo
Weiping Li
ELM
LRM
51
1
0
24 Jan 2025
MONA: Myopic Optimization with Non-myopic Approval Can Mitigate Multi-step Reward Hacking
Sebastian Farquhar
Vikrant Varma
David Lindner
David Elson
Caleb Biddulph
Ian Goodfellow
Rohin Shah
74
1
0
22 Jan 2025
Enhancing Reasoning through Process Supervision with Monte Carlo Tree Search
Shuangtao Li
Shuaihao Dong
Kexin Luan
Xinhan Di
Chaofan Ding
LRM
39
1
0
02 Jan 2025
Markov Chain of Thought for Efficient Mathematical Reasoning
Wen Yang
Kai Fan
Minpeng Liao
LRM
37
4
0
23 Oct 2024
Learning Evolving Tools for Large Language Models
Guoxin Chen
Zhong Zhang
Xin Cong
Fangda Guo
Yesai Wu
Yankai Lin
Wenzheng Feng
Yasheng Wang
KELM
52
1
0
09 Oct 2024
MARIO Eval: Evaluate Your Math LLM with your Math LLM--A mathematical dataset evaluation toolkit
Boning Zhang
Chengxi Li
Kai Fan
ELM
35
10
0
22 Apr 2024
MathGenie: Generating Synthetic Data with Question Back-translation for Enhancing Mathematical Reasoning of LLMs
Zimu Lu
Aojun Zhou
Houxing Ren
Ke Wang
Weikang Shi
Junting Pan
Mingjie Zhan
Hongsheng Li
SyDa
LRM
45
42
0
26 Feb 2024
Don't Forget Your Reward Values: Language Model Alignment via Value-based Calibration
Xin Mao
Fengming Li
Huimin Xu
Wei Zhang
A. Luu
ALM
31
6
0
25 Feb 2024
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
297
3,163
0
21 Mar 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022