arXiv:2506.08388 (versions: v1, v2, v3; v3 is the latest)

Reinforcement Learning Teachers of Test Time Scaling

10 June 2025
Edoardo Cetin, Tianyu Zhao, Yujin Tang
OffRL, ReLM, LRM
ArXiv (abs) | PDF | HTML | HuggingFace (2 upvotes) | GitHub (347★)

Papers citing "Reinforcement Learning Teachers of Test Time Scaling"

3 / 3 papers shown
Learning to Orchestrate Agents in Natural Language with the Conductor
Stefan Nielsen, Edoardo Cetin, Peter Schwendeman, Qi Sun, Jinglue Xu, Yujin Tang
LLMAG
111
1
0
04 Dec 2025
RAVR: Reference-Answer-guided Variational Reasoning for Large Language Models
T. Lin, Xi Zhao, Xingyao Zhang, Rujiao Long, Yi Xu, Zhuoren Jiang, Wenbo Su, B. Zheng
LRM
122
1
0
29 Oct 2025
Variational Reasoning for Language Models
Xiangxin Zhou, Zichen Liu, Haonan Wang, Chao Du, Min Lin, Chongxuan Li, Liang Wang, Tianyu Pang
OffRL, LRM
213
0
0
26 Sep 2025