2505.19634
Cited By
Faster and Better LLMs via Latency-Aware Test-Time Scaling
26 May 2025
Zili Wang
Tianyu Zhang
Haoli Bai
Lu Hou
Xianzhi Yu
Wulong Liu
Shiming Xiang
Lei Zhu
LRM
Papers citing "Faster and Better LLMs via Latency-Aware Test-Time Scaling"
7 / 7 papers shown
TPS-Bench: Evaluating AI Agents' Tool Planning & Scheduling Abilities in Compounding Tasks
Hanwen Xu
Xuyao Huang
Yuzhe Liu
Kai Yu
Zhijie Deng
LLMAG
135
1
0
03 Nov 2025
Heimdall: test-time scaling on the generative verification
Wenlei Shi
Xing Jin
LRM
423
20
0
14 Apr 2025
Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models
Ruikang Liu
Yuxuan Sun
Manyi Zhang
Haoli Bai
Xianzhi Yu
Tiezheng Yu
C. Yuan
Lu Hou
MQ
LRM
428
28
0
07 Apr 2025
EAGLE-3: Scaling up Inference Acceleration of Large Language Models via Training-Time Test
Yuhui Li
Fangyun Wei
Chao Zhang
Hongyang R. Zhang
601
95
0
03 Mar 2025
Towards Thinking-Optimal Scaling of Test-Time Compute for LLM Reasoning
Wenkai Yang
Shuming Ma
Yankai Lin
Furu Wei
LRM
496
92
0
25 Feb 2025
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
Bradley Brown
Jordan Juravsky
Ryan Ehrlich
Ronald Clark
Quoc V. Le
Christopher Ré
Azalia Mirhoseini
ALM
LRM
943
571
0
03 Jan 2025
EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty
International Conference on Machine Learning (ICML), 2024
Yuhui Li
Fangyun Wei
Chao Zhang
Hongyang R. Zhang
590
319
0
26 Jan 2024