arXiv:2011.02669
Learning to Utilize Shaping Rewards: A New Approach of Reward Shaping
5 November 2020
Yujing Hu
Weixun Wang
Hangtian Jia
Yixiang Wang
Yingfeng Chen
Jianye Hao
Feng Wu
Changjie Fan
OffRL
Papers citing
"Learning to Utilize Shaping Rewards: A New Approach of Reward Shaping"
50 / 94 papers shown
Sharpness-Guided Group Relative Policy Optimization via Probability Shaping
Tue Le
Nghi D.Q.Bui
Linh Ngo Van
267
0
0
29 Oct 2025
Balancing Specialization and Centralization: A Multi-Agent Reinforcement Learning Benchmark for Sequential Industrial Control
Tom Maus
Asma Atamna
Tobias Glasmachers
OffRL
102
0
0
23 Oct 2025
Fine-tuning Flow Matching Generative Models with Intermediate Feedback
Jiajun Fan
Chaoran Cheng
Shuaike Shen
Xiangxin Zhou
Ge Liu
EGVM
254
3
0
20 Oct 2025
Finite-time Convergence Analysis of Actor-Critic with Evolving Reward
Rui Hu
Yu Chen
Longbo Huang
177
0
0
14 Oct 2025
Part II: ROLL Flash -- Accelerating RLVR and Agentic Training with Asynchrony
H. Lu
Zichen Liu
Shaopan Xiong
Yancheng He
W. Gao
...
Wei Wang
Wenbo Su
Jiamang Wang
Lin Qu
Bo Zheng
OffRL
124
2
0
13 Oct 2025
Towards Safe Reasoning in Large Reasoning Models via Corrective Intervention
Yichi Zhang
Yue Ding
Jingwen Yang
Tianwei Luo
Dongbai Li
Ranjie Duan
Qiang Liu
Hang Su
Yinpeng Dong
Jun Zhu
LRM
195
3
0
29 Sep 2025
Preference-Guided Learning for Sparse-Reward Multi-Agent Reinforcement Learning
Viet The Bui
Tien Mai
Hong Thanh Nguyen
OffRL
287
0
0
26 Sep 2025
Orchestrate, Generate, Reflect: A VLM-Based Multi-Agent Collaboration Framework for Automated Driving Policy Learning
Zengqi Peng
Yusen Xie
Yubin Wang
Rui Yang
Qifeng Chen
Jun Ma
146
0
0
21 Sep 2025
Tree-Guided Diffusion Planner
Hyeonseong Jeon
Cheolhong Min
Jaesik Park
287
1
0
29 Aug 2025
Stabilizing Long-term Multi-turn Reinforcement Learning with Gated Rewards
Zetian Sun
Dongfang Li
Zhuoen Chen
Yuhuai Qin
Baotian Hu
OffRL
152
2
0
14 Aug 2025
Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning
Zihe Liu
Jiashun Liu
Yancheng He
Weixun Wang
Jiaheng Liu
...
Siran Yang
Jiamang Wang
Yuchi Xu
Bo Zheng
B. Zheng
OffRL
197
41
0
11 Aug 2025
Self-Adapting Language Models
Adam Zweiger
Jyothish Pari
Han Guo
Ekin Akyürek
Yoon Kim
Pulkit Agrawal
KELM
LRM
688
33
0
12 Jun 2025
AURA: Autonomous Upskilling with Retrieval-Augmented Agents
Alvin Zhu
Yusuke Tanaka
Andrew Goldberg
Dennis W. Hong
398
2
0
03 Jun 2025
Distributionally Robust Deep Q-Learning
Chung I Lu
Julian Sester
Aijia Zhang
OOD
382
2
0
25 May 2025
CCL: Collaborative Curriculum Learning for Sparse-Reward Multi-Agent Reinforcement Learning via Co-evolutionary Task Evolution
International Conference on Intelligent Computing (ICIC), 2025
Yufei Lin
Chengwei Ye
Ning Yang
Kangsheng Wang
Linuo Xu
Shuyan Liu
Zeyu Zhang
316
3
0
08 May 2025
Learning Explainable Dense Reward Shapes via Bayesian Optimization
Ryan Koo
Ian Yang
Vipul Raheja
Mingyi Hong
Kwang-Sung Jun
Dongyeop Kang
318
2
0
22 Apr 2025
Post-Convergence Sim-to-Real Policy Transfer: A Principled Alternative to Cherry-Picking
Dylan Khor
Bowen Weng
377
1
0
21 Apr 2025
Towards Fully Automated Decision-Making Systems for Greenhouse Control: Challenges and Opportunities
Yongshuai Liu
Taeyeong Choi
Xin Liu
AI4CE
342
0
0
27 Mar 2025
KEA: Keeping Exploration Alive by Proactively Coordinating Exploration Strategies
Shih-Min Yang
Martin Magnusson
J. A. Stork
Todor Stoyanov
334
1
0
23 Mar 2025
Towards Better Alignment: Training Diffusion Models with Reinforcement Learning Against Sparse Rewards
Computer Vision and Pattern Recognition (CVPR), 2025
Zijing Hu
Tai-wei Chang
Long Chen
Kun Kuang
Jiahui Li
Kaifeng Gao
Jun Xiao
X. Wang
Wenwu Zhu
EGVM
717
25
0
14 Mar 2025
Curiosity-Driven Imagination: Discovering Plan Operators and Learning Associated Policies for Open-World Adaptation
IEEE International Conference on Robotics and Automation (ICRA), 2025
Pierrick Lorang
Hong Lu
Matthias Scheutz
373
3
0
06 Mar 2025
Closing the Intent-to-Behavior Gap via Fulfillment Priority Logic
B. Mabsout
Abdelrahman AbdelGawad
R. Mancuso
541
2
0
04 Mar 2025
Reinforcement learning Based Automated Design of Differential Evolution Algorithm for Black-box Optimization
Xu Yang
Rui Wang
Kaiwen Li
Ling Wang
257
0
0
22 Jan 2025
Blockchain-assisted Demonstration Cloning for Multi-Agent Deep Reinforcement Learning
IEEE Internet of Things Journal (IEEE IoT J.), 2024
Ahmed Alagha
Jamal Bentahar
Hadi Otrok
Shakti Singh
R. Mizouni
329
3
0
19 Jan 2025
Latent Reward: LLM-Empowered Credit Assignment in Episodic Reinforcement Learning
AAAI Conference on Artificial Intelligence (AAAI), 2024
Yun Qu
Yuhang Jiang
Boyuan Wang
Yixiu Mao
Cheems Wang
Yu Xie
Xiangyang Ji
555
24
0
10 Jan 2025
Fairness in Reinforcement Learning with Bisimulation Metrics
S. Rezaei-Shoshtari
Hanna Yurchyk
Scott Fujimoto
Doina Precup
David Meger
547
0
0
03 Jan 2025
Bootstrapped Reward Shaping
AAAI Conference on Artificial Intelligence (AAAI), 2025
Jacob Adamczyk
Volodymyr Makarenko
Stas Tiomkin
R. Kulkarni
OffRL
279
6
0
02 Jan 2025
Comprehensive Overview of Reward Engineering and Shaping in Advancing Reinforcement Learning Applications
IEEE Access (IEEE Access), 2024
Sinan Ibrahim
Mostafa Mostafa
Ali Jnadi
Hadi Salloum
Pavel Osinenko
OffRL
431
78
0
31 Dec 2024
Efficient Model-Based Reinforcement Learning Through Optimistic Thompson Sampling
International Conference on Learning Representations (ICLR), 2024
Jasmine Bayrooti
Carl Henrik Ek
Amanda Prorok
525
4
0
07 Oct 2024
ETGL-DDPG: A Deep Deterministic Policy Gradient Algorithm for Sparse Reward Continuous Control
Ehsan Futuhi
Shayan Karimi
Chao Gao
Martin Müller
448
4
0
07 Oct 2024
Enhancing Inverse Reinforcement Learning through Encoding Dynamic Information in Reward Shaping
S. Zhan
Qingyuan Wu
Yixuan Wang
Ruochen Jiao
Chao Huang
Qi Zhu
352
3
0
04 Oct 2024
ELO-Rated Sequence Rewards: Advancing Reinforcement Learning Models
Qi Ju
Falin Hei
Zhemei Fang
Yunfeng Luo
526
1
0
05 Sep 2024
Highly Efficient Self-Adaptive Reward Shaping for Reinforcement Learning
International Conference on Learning Representations (ICLR), 2024
Haozhe Ma
Zhengding Luo
Thanh Vinh Vo
Kuankuan Sima
Tze-Yun Leong
824
20
0
06 Aug 2024
Principal-Agent Reinforcement Learning
Dima Ivanov
Paul Dutting
Inbal Talgam-Cohen
Tonghan Wang
David C. Parkes
240
0
0
25 Jul 2024
Automatic Environment Shaping is the Next Frontier in RL
Younghyo Park
G. Margolis
Pulkit Agrawal
OffRL
428
6
0
23 Jul 2024
Aquatic Navigation: A Challenging Benchmark for Deep Reinforcement Learning
Davide Corsi
Davide Camponogara
Alessandro Farinelli
OffRL
227
4
0
30 May 2024
Bilevel reinforcement learning via the development of hyper-gradient without lower-level convexity
Yan Yang
Bin Gao
Ya-xiang Yuan
500
10
0
30 May 2024
Reinforcement learning
Florentin Wörgötter
731
3,220
0
16 May 2024
Enhancing Q-Learning with Large Language Model Heuristics
Xiefeng Wu
LRM
397
2
0
06 May 2024
On the Sample Efficiency of Abstractions and Potential-Based Reward Shaping in Reinforcement Learning
Giuseppe Canonaco
Leo Ardon
Alberto Pozanco
Daniel Borrajo
OffRL
351
2
0
11 Apr 2024
Extremum-Seeking Action Selection for Accelerating Policy Optimization
IEEE International Conference on Robotics and Automation (ICRA), 2024
Ya-Chien Chang
Sicun Gao
318
0
0
02 Apr 2024
Decomposing Control Lyapunov Functions for Efficient Reinforcement Learning
American Control Conference (ACC), 2024
Antonio Lopez
David Fridovich-Keil
295
3
0
18 Mar 2024
Transformable Gaussian Reward Function for Socially-Aware Navigation with Deep Reinforcement Learning
Jinyeob Kim
Sumin Kang
Sungwoo Yang
Beomjoon Kim
Jargalbaatar Yura
Donghan Kim
900
2
0
22 Feb 2024
Auxiliary Reward Generation with Transition Distance Representation Learning
Siyuan Li
Shijie Han
Yingnan Zhao
B. Liang
Peng Liu
OffRL
252
0
0
12 Feb 2024
Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF
International Conference on Machine Learning (ICML), 2024
Han Shen
Zhuoran Yang
Tianyi Chen
OffRL
432
33
0
10 Feb 2024
Reinforcement Learning from Bagged Reward
Yuting Tang
Xin-Qiang Cai
Yao-Xiang Ding
Qiyu Wu
Guoqing Liu
Masashi Sugiyama
OffRL
397
0
0
06 Feb 2024
Principal-Agent Reward Shaping in MDPs
AAAI Conference on Artificial Intelligence (AAAI), 2023
Omer Ben-Porat
Yishay Mansour
Michal Moshkovitz
Boaz Taitler
256
20
0
30 Dec 2023
Human-AI Collaboration in Real-World Complex Environment with Reinforcement Learning
Md Saiful Islam
Srijita Das
S. Gottipati
William Duguay
Clodéric Mars
Jalal Arabneydi
Antoine Fagette
Matthew J. Guzdial
Matthew E. Taylor
238
5
0
23 Dec 2023
Toward Computationally Efficient Inverse Reinforcement Learning via Reward Shaping
Lauren H. Cooke
Harvey Klyne
Edwin Zhang
Cassidy Laidlaw
Milind Tambe
Finale Doshi-Velez
430
2
0
15 Dec 2023
FoMo Rewards: Can we cast foundation models as reward functions?
Ekdeep Singh Lubana
Johann Brehmer
P. D. Haan
Taco S. Cohen
OffRL
LRM
303
5
0
06 Dec 2023