2505.21600
Cited By
R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing
27 May 2025
Tianyu Fu
Yi Ge
Yichen You
Enshu Liu
Zhihang Yuan
Guohao Dai
Shengen Yan
Huazhong Yang
Yu Wang
MoE
LRM
HuggingFace (71 upvotes)
Papers citing
"R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing"
17 / 17 papers shown
Title
Semantic Energy: Detecting LLM Hallucination Beyond Entropy
H. Ma
Jiadong Pan
Jing Liu
Yan Chen
Joey Tianyi Zhou
Guangyu Wang
Qinghua Hu
Hua Wu
Changqing Zhang
Haifeng Wang
24
1
0
20 Aug 2025
Don't Overthink It: A Survey of Efficient R1-style Large Reasoning Models
Linan Yue
Yichao Du
Yizhi Wang
W. Gao
Fangzhou Yao
...
Ye Liu
Ziyu Xu
Qi Liu
Shimin Di
Johan Sulaeman
LRM
48
2
0
04 Aug 2025
PAROAttention: Pattern-Aware ReOrdering for Efficient Sparse and Quantized Attention in Visual Generation Models
Tianchen Zhao
Ke Hong
Xinhao Yang
Xuefeng Xiao
Huixia Li
...
Ruiqi Xie
Siqi Chen
Hongyu Zhu
Xicheng Zhang
Yu Wang
MQ
VGen
99
1
0
19 Jun 2025
PM-KVQ: Progressive Mixed-precision KV Cache Quantization for Long-CoT LLMs
Tengxuan Liu
Shiyao Li
Jiayi Yang
Tianchen Zhao
Feng Zhou
Xiaohui Song
Guohao Dai
Shengen Yan
Huazhong Yang
Yu Wang
MQ
84
2
0
24 May 2025
SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-Bit Training
Jintao Zhang
Jia Wei
Pengle Zhang
Xiaoming Xu
Haofeng Huang
Haoxu Wang
Kai Jiang
Jun Zhu
Jianfei Chen
MQ
106
15
0
16 May 2025
SplitReason: Learning To Offload Reasoning
Yash Akhauri
Anthony Fei
Chi-chih Chang
Ahmed F. AbouElhamayed
Yueying Li
Mohamed S. Abdelfattah
OffRL
ReLM
LRM
147
1
0
23 Apr 2025
Speculative Thinking: Enhancing Small-Model Reasoning with Large Model Guidance at Inference Time
Wang Yang
Xiang Yue
Vipin Chaudhary
Xiaotian Han
ReLM
LRM
204
18
0
12 Apr 2025
A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond
Xiaoye Qu
Yafu Li
Zhaochen Su
Weigao Sun
Jianhao Yan
...
Chaochao Lu
Yue Zhang
Xian-Sheng Hua
Bowen Zhou
Yu Cheng
ReLM
OffRL
LRM
334
68
0
27 Mar 2025
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models
Yang Sui
Yu-Neng Chuang
Guanchu Wang
Jiamu Zhang
Tianyi Zhang
...
Andrew Wen
Shaochen Zhong
Hanjie Chen
Helen Zhou
OffRL
ReLM
LRM
283
164
0
20 Mar 2025
Predicting Team Performance from Communications in Simulated Search-and-Rescue
Ali Jalal-Kamali
Nikolos Gurney
David Pynadath
AI4TS
241
0
0
05 Mar 2025
EAGLE-3: Scaling up Inference Acceleration of Large Language Models via Training-Time Test
Yuhui Li
Fangyun Wei
Chao Zhang
Hongyang R. Zhang
370
38
0
03 Mar 2025
MoBA: Mixture of Block Attention for Long-Context LLMs
Enzhe Lu
Z. L. Jiang
Qingbin Liu
Yulun Du
Tao Jiang
...
N. Zhang
Zhilin Yang
Xinyu Zhou
Mingxing Zhang
J. Qiu
159
47
0
18 Feb 2025
MixLLM: Dynamic Routing in Mixed Large Language Models
Xinyuan Wang
Yanchi Liu
Wei Cheng
Xujiang Zhao
Zhe Chen
Wenchao Yu
Yanjie Fu
Haifeng Chen
188
18
0
09 Feb 2025
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek-AI
Daya Guo
Dejian Yang
Haowei Zhang
Junxiao Song
...
Shiyu Wang
S. Yu
Shunfeng Zhou
Shuting Pan
S.S. Li
ReLM
VLM
OffRL
AI4TS
LRM
582
3,403
0
22 Jan 2025
SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration
Jintao Zhang
Jia Wei
Pengle Zhang
Jun-Jie Zhu
Jun Zhu
Jianfei Chen
VLM
MQ
288
52
0
03 Oct 2024
RouteLLM: Learning to Route LLMs with Preference Data
Isaac Ong
Amjad Almahairi
Vincent Wu
Wei-Lin Chiang
Tianhao Wu
Joseph E. Gonzalez
M. W. Kadous
Ion Stoica
242
144
0
26 Jun 2024
EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty
Yuhui Li
Fangyun Wei
Chao Zhang
Hongyang R. Zhang
304
213
0
26 Jan 2024