arXiv:2506.15068
Semantically-Aware Rewards for Open-Ended R1 Training in Free-Form Generation
18 June 2025
Zongxia Li, Yapei Chang, Yuhang Zhou, Xiyang Wu, Zichao Liang, Yoo Yeon Sung, Jordan L. Boyd-Graber
Papers citing "Semantically-Aware Rewards for Open-Ended R1 Training in Free-Form Generation"
5 / 5 papers shown
Guided Self-Evolving LLMs with Minimal Human Supervision
Wenhao Yu, Zhenwen Liang, Chengsong Huang, Kishan Panaganti, Tianqing Fang, Haitao Mi, Dong Yu
SyDa, ReLM, LRM
361
5
0
02 Dec 2025
SafeGRPO: Self-Rewarded Multimodal Safety Alignment via Rule-Governed Policy Optimization
Xuankun Rong, Wenke Huang, Tingfeng Wang, Daiguo Zhou, Bo Du, Mang Ye
LRM
237
0
0
17 Nov 2025
Self-Rewarding Vision-Language Model via Reasoning Decomposition
Zongxia Li, Wenhao Yu, Chengsong Huang, Rui Liu, Zhenwen Liang, ..., Jingxi Che, Dian Yu, Jordan L. Boyd-Graber, Haitao Mi, Dong Yu
ReLM, VLM, LRM
149
42
0
27 Aug 2025
R-Zero: Self-Evolving Reasoning LLM from Zero Data
Chengsong Huang, Wenhao Yu, Xiaoyang Wang, H. Zhang, Zongxia Li, Ruosen Li, J. Huang, Haitao Mi, Dong Yu
ReLM, SyDa, LRM
240
52
0
07 Aug 2025
Compositional Coordination for Multi-Robot Teams with Large Language Models
Zhehui Huang, Guangyao Shi, Yuwei Wu, Vijay Kumar, Gaurav Sukhatme
LM&Ro
433
0
0
21 Jul 2025