2407.21787
Cited By
v1
v2
v3 (latest)
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
3 January 2025
Bradley Brown
Jordan Juravsky
Ryan Ehrlich
Ronald Clark
Quoc V. Le
Christopher Ré
Azalia Mirhoseini
ALM
LRM
ArXiv (abs)
PDF
HTML
HuggingFace (13 upvotes)
Github (112★)
Papers citing
"Large Language Monkeys: Scaling Inference Compute with Repeated Sampling"
50 / 420 papers shown
SealQA: Raising the Bar for Reasoning in Search-Augmented Language Models
Thinh Pham
Nguyen Nguyen
Pratibha Zunjare
Weiyuan Chen
Yu-Min Tseng
Tu Vu
ALM
RALM
ELM
ReLM
LRM
401
34
0
10 Apr 2026
RoBoN: Routed Online Best-of-n for Test-Time Scaling with Multiple LLMs
Jonathan Geuter
Gregor Kornhardt
40
0
0
05 Dec 2025
When Does Verification Pay Off? A Closer Look at LLMs as Solution Verifiers
Jack Lu
Ryan Teehan
Jinran Jin
Mengye Ren
LRM
186
4
0
02 Dec 2025
Think in Parallel, Answer as One: Logit Averaging for Open-Ended Reasoning
Haonan Wang
Chao Du
Kenji Kawaguchi
Tianyu Pang
MoMe
ReLM
LRM
449
2
0
02 Dec 2025
SPARK: Stepwise Process-Aware Rewards for Reference-Free Reinforcement Learning
Salman Rahman
Sruthi Gorantla
Arpit Gupta
Swastik Roy
Nanyun Peng
Yang Liu
OffRL
LRM
194
0
0
02 Dec 2025
Auxiliary-Hyperparameter-Free Sampling: Entropy Equilibrium for Text Generation
Xiaodong Cai
Hai Lin
Shaoxiong Zhan
Weiqi Luo
Hong-Gee Kim
Hongyan Hao
Yu Yang
Hai-Tao Zheng
114
0
0
30 Nov 2025
Rethinking Test Time Scaling for Flow-Matching Generative Models
Qingtao Yu
Changlin Song
Minghao Sun
Zhengyang Yu
Vinay Kumar Verma
Soumya Roy
Sumit Negi
Hongdong Li
Dylan Campbell
126
1
0
27 Nov 2025
Boosting Reasoning in Large Multimodal Models via Activation Replay
Yun Xing
Xiaobin Hu
Qingdong He
Jiangning Zhang
Shuicheng Yan
Shijian Lu
Yu-Gang Jiang
OffRL
LRM
332
1
0
25 Nov 2025
Differential Smoothing Mitigates Sharpening and Improves LLM Reasoning
Jingchu Gai
Guanning Zeng
Huaqing Zhang
Aditi Raghunathan
178
5
0
25 Nov 2025
Majority of the Bests: Improving Best-of-N via Bootstrapping
Amin Rakhsha
Kanika Madan
Tianyu Zhang
Amir-massoud Farahmand
Amir Khasahmadi
195
3
0
23 Nov 2025
Budget-Aware Tool-Use Enables Effective Agent Scaling
Tengxiao Liu
Zifeng Wang
Jin Miao
I-Hung Hsu
Jun Yan
...
Samira Daruki
Yi Liang
William Y. Wang
Tomas Pfister
Chen-Yu Lee
344
13
0
21 Nov 2025
TRIM: Scalable 3D Gaussian Diffusion Inference with Temporal and Spatial Trimming
Zeyuan Yin
Xiaoming Liu
3DGS
144
3
0
20 Nov 2025
AccelOpt: A Self-Improving LLM Agentic System for AI Accelerator Kernel Optimization
Genghan Zhang
Shaowei Zhu
Anjiang Wei
Zhenyu Song
Allen Nie
Zhen Jia
Nandita Vijaykumar
Yida Wang
K. Olukotun
156
3
0
19 Nov 2025
On the Entropy Calibration of Language Models
Steven Cao
Gregory Valiant
Percy Liang
160
1
0
15 Nov 2025
Place Matters: Comparing LLM Hallucination Rates for Place-Based Legal Queries
Damian Curran
Vanessa Sporne
Lea Frermann
Jeannie Paterson
AILaw
ELM
436
0
0
10 Nov 2025
Reusing Pre-Training Data at Test Time is a Compute Multiplier
Alex Fang
Thomas Voice
Ruoming Pang
Ludwig Schmidt
Tom Gunter
137
0
0
06 Nov 2025
Test-time Scaling of LLMs: A Survey from A Subproblem Structure Perspective
Zhuoyi Yang
Xu Guo
Tong Zhang
Huijuan Xu
Boyang Albert Li
LRM
208
2
0
01 Nov 2025
SELF-REDRAFT: Eliciting Intrinsic Exploration-Exploitation Balance in Test-Time Scaling for Code Generation
Yixiang Chen
Tianshi Zheng
Shijue Huang
Zhitao He
Yi R. Fung
167
1
0
31 Oct 2025
Test-Time Alignment of LLMs via Sampling-Based Optimal Control in pre-logit space
Sekitoshi Kanai
Tsukasa Yoshida
Hiroshi Takahashi
Haru Kuroki
Kazumune Hashimoto
148
0
0
30 Oct 2025
e1: Learning Adaptive Control of Reasoning Effort
Michael Kleinman
Matthew Trager
Alessandro Achille
Wei Xia
Stefano Soatto
LRM
276
3
0
30 Oct 2025
Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning
Yihe Deng
I-Hung Hsu
Jun Yan
Zifeng Wang
Rujun Han
Gufeng Zhang
Yanfei Chen
Wei Wang
Tomas Pfister
Chen-Yu Lee
LRM
236
4
0
29 Oct 2025
The Best of N Worlds: Aligning Reinforcement Learning with Best-of-N Sampling via max@k Optimisation
Farid Bagirov
Mikhail Arkhipov
Ksenia Sycheva
Evgeniy Glukhov
Egor Bogomolov
144
2
0
27 Oct 2025
String Seed of Thought: Prompting LLMs for Distribution-Faithful and Diverse Generation
Kou Misaki
Takuya Akiba
LRM
278
1
0
24 Oct 2025
Boosting Accuracy and Efficiency of Budget Forcing in LLMs via Reinforcement Learning for Mathematical Reasoning
Ravindra Aribowo Tarunokusumo
Rafael Fernandes Cunha
OffRL
ReLM
LRM
180
0
0
24 Oct 2025
Language Ranker: A Lightweight Ranking framework for LLM Decoding
Chenheng Zhang
Tianqi Du
Jizhe Zhang
Mingqing Xiao
Yifei Wang
Yisen Wang
Zhouchen Lin
ALM
228
1
0
23 Oct 2025
Enhancing Reasoning Skills in Small Persian Medical Language Models Can Outperform Large-Scale Data Training
Mehrdad Ghassabi
Sadra Hakim
Hamidreza Baradaran Kashani
Pedram Rostami
ReLM
LRM
406
0
0
22 Oct 2025
Test-time Verification via Optimal Transport: Coverage, ROC, & Sub-optimality
Arpan Mukherjee
Marcello Bullo
Debabrota Basu
Deniz Gündüz
164
1
0
21 Oct 2025
Scaling Laws Meet Model Architecture: Toward Inference-Efficient LLMs
S. Bian
Tao Yu
Shivaram Venkataraman
Youngsuk Park
171
1
0
21 Oct 2025
3D Optimization for AI Inference Scaling: Balancing Accuracy, Cost, and Latency
Minseok Jung
Abhas Ricky
Muhammad Rameez Chatni
248
0
0
21 Oct 2025
PlanU: Large Language Model Reasoning through Planning under Uncertainty
Ziwei Deng
Mian Deng
Chenjing Liang
Zeming Gao
Chennan Ma
Chenxing Lin
Haipeng Zhang
Songzhu Mei
Cheng-Yu Wang
Siqi Shen
234
0
0
21 Oct 2025
Online In-Context Distillation for Low-Resource Vision Language Models
Zhiqi Kang
Rahaf Aljundi
Vaggelis Dorovatas
Karteek Alahari
VLM
161
1
0
20 Oct 2025
AI for Distributed Systems Design: Scalable Cloud Optimization Through Repeated LLMs Sampling And Simulators
Jacopo Tagliabue
66
1
0
20 Oct 2025
Inference-Time Compute Scaling For Flow Matching
Adam Stecklov
Noah El Rimawi-Fine
Mathieu Blanchette
196
0
0
20 Oct 2025
SolverLLM: Leveraging Test-Time Scaling for Optimization Problem via LLM-Guided Search
Dong Li
Xujiang Zhao
Linlin Yu
Yanchi Liu
Wei Cheng
Zhengzhang Chen
Zhong Chen
Feng Chen
Chen Zhao
H. Chen
LRM
241
2
0
19 Oct 2025
What Limits Agentic Systems Efficiency?
S. Bian
Minghao Yan
Anand Jayarajan
Gennady Pekhimenko
Shivaram Venkataraman
LLMAG
LRM
221
1
0
18 Oct 2025
The Road Less Traveled: Enhancing Exploration in LLMs via Sequential Sampling
Shijia Kang
Muhan Zhang
LRM
149
0
0
17 Oct 2025
CarBoN: Calibrated Best-of-N Sampling Improves Test-time Reasoning
Yung-Chen Tang
Pin-Yu Chen
Andrea Cavallaro
LRM
152
1
0
17 Oct 2025
Budget-aware Test-time Scaling via Discriminative Verification
Kyle Montgomery
Sijun Tan
Yuqi Chen
Siyuan Zhuang
Tianjun Zhang
Raluca A. Popa
Chenguang Wang
172
0
0
16 Oct 2025
Scaling Test-Time Compute to Achieve IOI Gold Medal with Open-Weight Models
Mehrzad Samadi
Aleksander Ficek
Sean Narenthiran
Siddhartha Jain
Wasi Uddin Ahmad
Somshubra Majumdar
Vahid Noroozi
Boris Ginsburg
LRM
139
4
0
16 Oct 2025
GroundedPRM: Tree-Guided and Fidelity-Aware Process Reward Modeling for Step-Level Reasoning
Y. Zhang
Yu-Huan Wu
Haowei Zhang
Weiguo Li
Haokun Chen
Jingpei Wu
Guohao Li
Zhen Han
Volker Tresp
LRM
150
6
0
16 Oct 2025
Rewiring Experts on the Fly: Continuous Rerouting for Better Online Adaptation in Mixture-of-Expert models
Guinan Su
Yanwu Yang
Li Shen
Lu Yin
Shiwei Liu
Jonas Geiping
MoE
KELM
244
2
0
16 Oct 2025
AutoRubric-R1V: Rubric-Based Generative Rewards for Faithful Multimodal Reasoning
Mengzhao Jia
Zhihan Zhang
Ignacio Cases
Zheyuan Liu
Meng Jiang
Peng Qi
LRM
196
2
0
16 Oct 2025
A Survey on Parallel Reasoning
Z. Wang
Boye Niu
Zipeng Gao
Zhi Zheng
Tong Xu
...
Yilong Chen
Chen Zhu
Hua Wu
Haifeng Wang
Enhong Chen
ReLM
LRM
222
5
0
14 Oct 2025
Not All Bits Are Equal: Scale-Dependent Memory Optimization Strategies for Reasoning Models
Junhyuck Kim
Ethan Ewer
Taehong Moon
Jongho Park
Dimitris Papailiopoulos
MQ
LRM
207
1
0
13 Oct 2025
MatryoshkaThinking: Recursive Test-Time Scaling Enables Efficient Reasoning
Hongwei Chen
Yishu Lei
Dan Zhang
Bo Ke
Danxiang Zhu
...
Shikun Feng
Jingzhou He
Yu Sun
Hua Wu
Haifeng Wang
ReLM
LRM
173
1
0
11 Oct 2025
LiveOIBench: Can Large Language Models Outperform Human Contestants in Informatics Olympiads?
Kaijian Zou
Aaron Xiong
Yunxiang Zhang
Frederick Zhang
Yueqi Ren
Jirong Yang
Ayoung Lee
Shitanshu Bhushan
Lu Wang
ReLM
ALM
ELM
LRM
540
1
0
10 Oct 2025
Logit Arithmetic Elicits Long Reasoning Capabilities Without Training
Y. Zhang
Muhammad Khalifa
Lechen Zhang
Xin Liu
Ayoung Lee
Xinliang Frederick Zhang
Farima Fatahi Bayat
L. Wang
RALM
LRM
135
4
0
10 Oct 2025
First Try Matters: Revisiting the Role of Reflection in Reasoning Models
Liwei Kang
Yue Deng
Yao Xiao
Zhanfeng Mo
Wee Sun Lee
Lidong Bing
LRM
152
10
0
09 Oct 2025
ToolExpander: Extending the Frontiers of Tool-Using Reinforcement Learning to Weak LLMs
Fu Chen
Peng Wang
X. Li
Wen Li
Shichi Lei
Dongdong Xiang
141
5
0
09 Oct 2025
DeepPrune: Parallel Scaling without Inter-trace Redundancy
Shangqing Tu
Yaxuan Li
Yushi Bai
Lei Hou
Juanzi Li
ReLM
LRM
157
2
0
09 Oct 2025