Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
3 January 2025
Bradley Brown
Jordan Juravsky
Ryan Ehrlich
Ronald Clark
Quoc V. Le
Christopher Ré
Azalia Mirhoseini
ALM, LRM
ArXiv (abs) | PDF | HTML | HuggingFace (13 upvotes) | GitHub (112★)

Papers citing "Large Language Monkeys: Scaling Inference Compute with Repeated Sampling"

50 / 420 papers shown
SealQA: Raising the Bar for Reasoning in Search-Augmented Language Models
Thinh Pham
Nguyen Nguyen
Pratibha Zunjare
Weiyuan Chen
Yu-Min Tseng
Tu Vu
ALM, RALM, ELM, ReLM, LRM
401
34
0
10 Apr 2026
RoBoN: Routed Online Best-of-n for Test-Time Scaling with Multiple LLMs
Jonathan Geuter
Gregor Kornhardt
40
0
0
05 Dec 2025
When Does Verification Pay Off? A Closer Look at LLMs as Solution Verifiers
Jack Lu
Ryan Teehan
Jinran Jin
Mengye Ren
LRM
186
4
0
02 Dec 2025
Think in Parallel, Answer as One: Logit Averaging for Open-Ended Reasoning
Haonan Wang
Chao Du
Kenji Kawaguchi
Tianyu Pang
MoMe, ReLM, LRM
449
2
0
02 Dec 2025
SPARK: Stepwise Process-Aware Rewards for Reference-Free Reinforcement Learning
Salman Rahman
Sruthi Gorantla
Arpit Gupta
Swastik Roy
Nanyun Peng
Yang Liu
OffRL, LRM
194
0
0
02 Dec 2025
Auxiliary-Hyperparameter-Free Sampling: Entropy Equilibrium for Text Generation
Xiaodong Cai
Hai Lin
Shaoxiong Zhan
Weiqi Luo
Hong-Gee Kim
Hongyan Hao
Yu Yang
Hai-Tao Zheng
114
0
0
30 Nov 2025
Rethinking Test Time Scaling for Flow-Matching Generative Models
Qingtao Yu
Changlin Song
Minghao Sun
Zhengyang Yu
Vinay Kumar Verma
Soumya Roy
Sumit Negi
Hongdong Li
Dylan Campbell
126
1
0
27 Nov 2025
Boosting Reasoning in Large Multimodal Models via Activation Replay
Yun Xing
Xiaobin Hu
Qingdong He
Jiangning Zhang
Shuicheng Yan
Shijian Lu
Yu-Gang Jiang
OffRL, LRM
332
1
0
25 Nov 2025
Differential Smoothing Mitigates Sharpening and Improves LLM Reasoning
Jingchu Gai
Guanning Zeng
Huaqing Zhang
Aditi Raghunathan
178
5
0
25 Nov 2025
Majority of the Bests: Improving Best-of-N via Bootstrapping
Amin Rakhsha
Kanika Madan
Tianyu Zhang
Amir-massoud Farahmand
Amir Khasahmadi
195
3
0
23 Nov 2025
Budget-Aware Tool-Use Enables Effective Agent Scaling
Tengxiao Liu
Zifeng Wang
Jin Miao
I-Hung Hsu
Jun Yan
...
Samira Daruki
Yi Liang
William Y. Wang
Tomas Pfister
Chen-Yu Lee
344
13
0
21 Nov 2025
TRIM: Scalable 3D Gaussian Diffusion Inference with Temporal and Spatial Trimming
Zeyuan Yin
Xiaoming Liu
3DGS
144
3
0
20 Nov 2025
AccelOpt: A Self-Improving LLM Agentic System for AI Accelerator Kernel Optimization
Genghan Zhang
Shaowei Zhu
Anjiang Wei
Zhenyu Song
Allen Nie
Zhen Jia
Nandita Vijaykumar
Yida Wang
K. Olukotun
156
3
0
19 Nov 2025
On the Entropy Calibration of Language Models
Steven Cao
Gregory Valiant
Percy Liang
160
1
0
15 Nov 2025
Place Matters: Comparing LLM Hallucination Rates for Place-Based Legal Queries
Damian Curran
Vanessa Sporne
Lea Frermann
Jeannie Paterson
AILaw, ELM
436
0
0
10 Nov 2025
Reusing Pre-Training Data at Test Time is a Compute Multiplier
Alex Fang
Thomas Voice
Ruoming Pang
Ludwig Schmidt
Tom Gunter
137
0
0
06 Nov 2025
Test-time Scaling of LLMs: A Survey from A Subproblem Structure Perspective
Zhuoyi Yang
Xu Guo
Tong Zhang
Huijuan Xu
Boyang Albert Li
LRM
208
2
0
01 Nov 2025
SELF-REDRAFT: Eliciting Intrinsic Exploration-Exploitation Balance in Test-Time Scaling for Code Generation
Yixiang Chen
Tianshi Zheng
Shijue Huang
Zhitao He
Yi R. Fung
167
1
0
31 Oct 2025
Test-Time Alignment of LLMs via Sampling-Based Optimal Control in pre-logit space
Sekitoshi Kanai
Tsukasa Yoshida
Hiroshi Takahashi
Haru Kuroki
Kazumune Hashimoto
148
0
0
30 Oct 2025
e1: Learning Adaptive Control of Reasoning Effort
Michael Kleinman
Matthew Trager
Alessandro Achille
Wei Xia
Stefano Soatto
LRM
276
3
0
30 Oct 2025
Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning
Yihe Deng
I-Hung Hsu
Jun Yan
Zifeng Wang
Rujun Han
Gufeng Zhang
Yanfei Chen
Wei Wang
Tomas Pfister
Chen-Yu Lee
LRM
236
4
0
29 Oct 2025
The Best of N Worlds: Aligning Reinforcement Learning with Best-of-N Sampling via max@k Optimisation
Farid Bagirov
Mikhail Arkhipov
Ksenia Sycheva
Evgeniy Glukhov
Egor Bogomolov
144
2
0
27 Oct 2025
String Seed of Thought: Prompting LLMs for Distribution-Faithful and Diverse Generation
Kou Misaki
Takuya Akiba
LRM
278
1
0
24 Oct 2025
Boosting Accuracy and Efficiency of Budget Forcing in LLMs via Reinforcement Learning for Mathematical Reasoning
Ravindra Aribowo Tarunokusumo
Rafael Fernandes Cunha
OffRL, ReLM, LRM
180
0
0
24 Oct 2025
Language Ranker: A Lightweight Ranking framework for LLM Decoding
Chenheng Zhang
Tianqi Du
Jizhe Zhang
Mingqing Xiao
Yifei Wang
Yisen Wang
Zhouchen Lin
ALM
228
1
0
23 Oct 2025
Enhancing Reasoning Skills in Small Persian Medical Language Models Can Outperform Large-Scale Data Training
Mehrdad Ghassabi
Sadra Hakim
Hamidreza Baradaran Kashani
Pedram Rostami
ReLM, LRM
406
0
0
22 Oct 2025
Test-time Verification via Optimal Transport: Coverage, ROC, & Sub-optimality
Arpan Mukherjee
Marcello Bullo
Debabrota Basu
Deniz Gündüz
164
1
0
21 Oct 2025
Scaling Laws Meet Model Architecture: Toward Inference-Efficient LLMs
S. Bian
Tao Yu
Shivaram Venkataraman
Youngsuk Park
171
1
0
21 Oct 2025
3D Optimization for AI Inference Scaling: Balancing Accuracy, Cost, and Latency
Minseok Jung
Abhas Ricky
Muhammad Rameez Chatni
248
0
0
21 Oct 2025
PlanU: Large Language Model Reasoning through Planning under Uncertainty
Ziwei Deng
Mian Deng
Chenjing Liang
Zeming Gao
Chennan Ma
Chenxing Lin
Haipeng Zhang
Songzhu Mei
Cheng-Yu Wang
Siqi Shen
234
0
0
21 Oct 2025
Online In-Context Distillation for Low-Resource Vision Language Models
Zhiqi Kang
Rahaf Aljundi
Vaggelis Dorovatas
Karteek Alahari
VLM
161
1
0
20 Oct 2025
AI for Distributed Systems Design: Scalable Cloud Optimization Through Repeated LLMs Sampling And Simulators
Jacopo Tagliabue
66
1
0
20 Oct 2025
Inference-Time Compute Scaling For Flow Matching
Adam Stecklov
Noah El Rimawi-Fine
Mathieu Blanchette
196
0
0
20 Oct 2025
SolverLLM: Leveraging Test-Time Scaling for Optimization Problem via LLM-Guided Search
Dong Li
Xujiang Zhao
Linlin Yu
Yanchi Liu
Wei Cheng
Zhengzhang Chen
Zhong Chen
Feng Chen
Chen Zhao
H. Chen
LRM
241
2
0
19 Oct 2025
What Limits Agentic Systems Efficiency?
S. Bian
Minghao Yan
Anand Jayarajan
Gennady Pekhimenko
Shivaram Venkataraman
LLMAG, LRM
221
1
0
18 Oct 2025
The Road Less Traveled: Enhancing Exploration in LLMs via Sequential Sampling
Shijia Kang
Muhan Zhang
LRM
149
0
0
17 Oct 2025
CarBoN: Calibrated Best-of-N Sampling Improves Test-time Reasoning
Yung-Chen Tang
Pin-Yu Chen
Andrea Cavallaro
LRM
152
1
0
17 Oct 2025
Budget-aware Test-time Scaling via Discriminative Verification
Kyle Montgomery
Sijun Tan
Yuqi Chen
Siyuan Zhuang
Tianjun Zhang
Raluca A. Popa
Chenguang Wang
172
0
0
16 Oct 2025
Scaling Test-Time Compute to Achieve IOI Gold Medal with Open-Weight Models
Mehrzad Samadi
Aleksander Ficek
Sean Narenthiran
Siddhartha Jain
Wasi Uddin Ahmad
Somshubra Majumdar
Vahid Noroozi
Boris Ginsburg
LRM
139
4
0
16 Oct 2025
GroundedPRM: Tree-Guided and Fidelity-Aware Process Reward Modeling for Step-Level Reasoning
Y. Zhang
Yu-Huan Wu
Haowei Zhang
Weiguo Li
Haokun Chen
Jingpei Wu
Guohao Li
Zhen Han
Volker Tresp
LRM
150
6
0
16 Oct 2025
Rewiring Experts on the Fly: Continuous Rerouting for Better Online Adaptation in Mixture-of-Expert models
Guinan Su
Yanwu Yang
Li Shen
Lu Yin
Shiwei Liu
Jonas Geiping
MoE, KELM
244
2
0
16 Oct 2025
AutoRubric-R1V: Rubric-Based Generative Rewards for Faithful Multimodal Reasoning
Mengzhao Jia
Zhihan Zhang
Ignacio Cases
Zheyuan Liu
Meng Jiang
Peng Qi
LRM
196
2
0
16 Oct 2025
A Survey on Parallel Reasoning
Z. Wang
Boye Niu
Zipeng Gao
Zhi Zheng
Tong Xu
...
Yilong Chen
Chen Zhu
Hua Wu
Haifeng Wang
Enhong Chen
ReLM, LRM
222
5
0
14 Oct 2025
Not All Bits Are Equal: Scale-Dependent Memory Optimization Strategies for Reasoning Models
Junhyuck Kim
Ethan Ewer
Taehong Moon
Jongho Park
Dimitris Papailiopoulos
MQ, LRM
207
1
0
13 Oct 2025
MatryoshkaThinking: Recursive Test-Time Scaling Enables Efficient Reasoning
Hongwei Chen
Yishu Lei
Dan Zhang
Bo Ke
Danxiang Zhu
...
Shikun Feng
Jingzhou He
Yu Sun
Hua Wu
Haifeng Wang
ReLM, LRM
173
1
0
11 Oct 2025
LiveOIBench: Can Large Language Models Outperform Human Contestants in Informatics Olympiads?
Kaijian Zou
Aaron Xiong
Yunxiang Zhang
Frederick Zhang
Yueqi Ren
Jirong Yang
Ayoung Lee
Shitanshu Bhushan
Lu Wang
ReLM, ALM, ELM, LRM
540
1
0
10 Oct 2025
Logit Arithmetic Elicits Long Reasoning Capabilities Without Training
Y. Zhang
Muhammad Khalifa
Lechen Zhang
Xin Liu
Ayoung Lee
Xinliang Frederick Zhang
Farima Fatahi Bayat
L. Wang
RALM, LRM
135
4
0
10 Oct 2025
First Try Matters: Revisiting the Role of Reflection in Reasoning Models
Liwei Kang
Yue Deng
Yao Xiao
Zhanfeng Mo
Wee Sun Lee
Lidong Bing
LRM
152
10
0
09 Oct 2025
ToolExpander: Extending the Frontiers of Tool-Using Reinforcement Learning to Weak LLMs
Fu Chen
Peng Wang
X. Li
Wen Li
Shichi Lei
Dongdong Xiang
141
5
0
09 Oct 2025
DeepPrune: Parallel Scaling without Inter-trace Redundancy
Shangqing Tu
Yaxuan Li
Yushi Bai
Lei Hou
Juanzi Li
ReLM, LRM
157
2
0
09 Oct 2025