Computational Benefits of Intermediate Rewards for Goal-Reaching Policy Learning

8 July 2021

Papers citing "Computational Benefits of Intermediate Rewards for Goal-Reaching Policy Learning"

12 / 12 papers shown

Title
Sample-Efficient Reinforcement Learning with Temporal Logic Objectives: Leveraging the Task Specification to Guide Exploration | Y. Kantaros, Jun Wang | - | 27 5 0 | 16 Oct 2024
SEAL: SEmantic-Augmented Imitation Learning via Language Model | Chengyang Gu, Yuxin Pan, Haotian Bai, Hui Xiong, Yize Chen | - | 27 0 0 | 03 Oct 2024
Reinforcement Learning in Dynamic Treatment Regimes Needs Critical Reexamination | Zhiyao Luo, Yangchen Pan, Peter Watkinson, Tingting Zhu | OffRL | 33 0 0 | 28 May 2024
Mission-driven Exploration for Accelerated Deep Reinforcement Learning with Temporal Logic Task Specifications | Jun Wang, Hosein Hasanbeig, Kaiyuan Tan, Zihe Sun, Y. Kantaros | - | 27 3 0 | 28 Nov 2023
SEGO: Sequential Subgoal Optimization for Mathematical Problem-Solving | Xueliang Zhao, Xinting Huang, Wei Bi, Lingpeng Kong | LRM | 46 0 0 | 19 Oct 2023
General In-Hand Object Rotation with Vision and Touch | Haozhi Qi, Brent Yi, Sudharshan Suresh, Mike Lambeta, Y. Ma, Roberto Calandra, Jitendra Malik | - | 58 81 0 | 18 Sep 2023
Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving | Xueliang Zhao, Wenda Li, Lingpeng Kong | - | 30 28 0 | 25 May 2023
Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning | Mitsuhiko Nakamoto, Yuexiang Zhai, Anika Singh, Max Sobol Mark, Yi-An Ma, Chelsea Finn, Aviral Kumar, Sergey Levine | OffRL, OnRL | 109 108 0 | 09 Mar 2023
Understanding the Complexity Gains of Single-Task RL with a Curriculum | Qiyang Li, Yuexiang Zhai, Yi-An Ma, Sergey Levine | - | 32 14 0 | 24 Dec 2022
Learning Representations that Enable Generalization in Assistive Tasks | Jerry Zhi-Yang He, Aditi Raghunathan, Daniel S. Brown, Zackory M. Erickson, Anca Dragan | OOD | 29 19 0 | 05 Dec 2022
Unpacking Reward Shaping: Understanding the Benefits of Reward Engineering on Sample Complexity | Abhishek Gupta, Aldo Pacchiano, Yuexiang Zhai, Sham Kakade, Sergey Levine | OffRL | 22 66 0 | 18 Oct 2022
Accelerated Reinforcement Learning for Temporal Logic Control Objectives | Y. Kantaros | - | 11 11 0 | 09 May 2022