2506.14731
Cited By
v1
v2 (latest)
Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs
17 June 2025
Ling Team
Bin Hu
Cai Chen
Deng Zhao
Ding Liu
Dingnan Jin
Feng Zhu
Hao Dai
Hongzhi Luan
Jia Guo
Jiaming Liu
J. Wu
Jun Mei
Jun Zhou
Junbo Zhao
Junwu Xiong
Kaihong Zhang
Kuan Xu
Lei Liang
Liang Jiang
Liangcheng Fu
Longfei Zheng
Qiang Gao
Qing Cui
Quan Wan
Shaomian Zheng
Shuaicheng Li
Tongkai Yang
Wang Ren
X. Yan
Xiaopei Wan
Xiaoyun Feng
Xin Zhao
Xinxing Yang
Xinyu Kong
Xuemin Yang
Yang Li
Y. Wu
Y. Liu
Zhankai Xu
Zhenduo Zhang
Zhenglei Zhou
Zhenyu Huang
Zhiqiang Zhang
Zihao Wang
Zujie Wen
OffRL
MoE
ALM
LRM
HuggingFace (9 upvotes)
Papers citing "Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs"
5 / 5 papers shown
Towards Stable and Effective Reinforcement Learning for Mixture-of-Experts
Di Zhang
Xun Wu
Shaohan Huang
Y. Hao
Li Dong
Zewen Chi
Lei Sha
Furu Wei
MoE
152
0
0
27 Oct 2025
Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model
Ling Team
Anqi Shen
B. Li
Bin Hu
Bin Jing
...
Z. Pan
Longxiang Zhang
Zhenzhong Lan
Zhiqiang Ding
Zhiqiang Zhang
ALM
ReLM
LRM
263
5
0
21 Oct 2025
PromptCoT 2.0: Scaling Prompt Synthesis for Large Language Model Reasoning
Xueliang Zhao
Wei Wu
Jian Guan
Zhuocheng Gong
Lingpeng Kong
ReLM
OffRL
LRM
AI4TS
176
1
0
24 Sep 2025
Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization
Zhenpeng Su
Leiyu Pan
Xue Bai
Dening Liu
Guanting Dong
J. Huang
Wenping Hu
Fuzheng Zhang
Kun Gai
Guorui Zhou
ReLM
LRM
137
14
0
11 Aug 2025
Agentar-Fin-R1: Enhancing Financial Intelligence through Domain Expertise, Training Efficiency, and Advanced Reasoning
Yanjun Zheng
Xiyang Du
Longfei Liao
Xiaoke Zhao
Zhaowen Zhou
...
Xiang Qi
Zhe Li
Zhiqiang Zhang
Wei Wang
Peng Zhang
AIFin
LRM
343
0
0
22 Jul 2025