1502.05477
Cited By
Trust Region Policy Optimization
19 February 2015
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
Papers citing
"Trust Region Policy Optimization"
50 / 3,103 papers shown
Title
The Impact of Quantization and Pruning on Deep Reinforcement Learning Models
Heng Lu
Mehdi Alemi
Reza Rawassizadeh
54
1
0
05 Jul 2024
Simplifying Deep Temporal Difference Learning
Matteo Gallici
Mattie Fellows
Benjamin Ellis
B. Pou
Ivan Masmitja
Jakob Foerster
Mario Martin
OffRL
67
19
0
05 Jul 2024
ROER: Regularized Optimal Experience Replay
Changling Li
Zhang-Wei Hong
Pulkit Agrawal
Divyansh Garg
Joni Pajarinen
OffRL
60
1
0
04 Jul 2024
RobocupGym: A challenging continuous control benchmark in Robocup
Michael Beukman
Branden Ingram
Geraud Nangue Tasse
Benjamin Rosman
Pravesh Ranchod
OffRL
55
1
0
03 Jul 2024
Warm-up Free Policy Optimization: Improved Regret in Linear Markov Decision Processes
Asaf B. Cassel
Aviv A. Rosenberg
60
1
0
03 Jul 2024
Safe CoR: A Dual-Expert Approach to Integrating Imitation Learning and Safe Reinforcement Learning Using Constraint Rewards
Hyeokjin Kwon
Gunmin Lee
Junseo Lee
Songhwai Oh
59
0
0
02 Jul 2024
Cooperative Advisory Residual Policies for Congestion Mitigation
Aamir Hasan
Neeloy Chakraborty
Haonan Chen
Jung-Hoon Cho
Cathy Wu
Katherine Driggs-Campbell
53
1
0
30 Jun 2024
Towards shutdownable agents via stochastic choice
Elliott Thornley
Alexander Roman
Christos Ziakas
Leyton Ho
Louis Thomson
60
0
0
30 Jun 2024
PUZZLES: A Benchmark for Neural Algorithmic Reasoning
Benjamin Estermann
Luca A. Lanzendörfer
Yannick Niedermayr
Roger Wattenhofer
75
4
0
29 Jun 2024
WARP: On the Benefits of Weight Averaged Rewarded Policies
Alexandre Ramé
Johan Ferret
Nino Vieillard
Robert Dadashi
Léonard Hussenot
Pierre-Louis Cedoz
Pier Giuseppe Sessa
Sertan Girgin
Arthur Douillard
Olivier Bachem
62
15
0
24 Jun 2024
Position: Benchmarking is Limited in Reinforcement Learning Research
Scott M. Jordan
Adam White
Bruno Castro da Silva
Martha White
Philip S. Thomas
OffRL
31
6
0
23 Jun 2024
Diffusion Spectral Representation for Reinforcement Learning
Dmitry Shribak
Chen-Xiao Gao
Yitong Li
Chenjun Xiao
Bo Dai
DiffM
68
3
0
23 Jun 2024
Multistep Criticality Search and Power Shaping in Microreactors with Reinforcement Learning
M. Radaideh
Leo Tunkle
D. Price
Kamal Abdulraheem
Linyu Lin
Moutaz Elias
29
0
0
22 Jun 2024
A General Control-Theoretic Approach for Reinforcement Learning: Theory and Algorithms
Weiqin Chen
M. Squillante
Chai Wah Wu
Santiago Paternain
AI4CE
53
0
0
20 Jun 2024
Constrained Meta Agnostic Reinforcement Learning
Karam Daaboul
Florian Kuhm
Tim Joseph
J. Marius Zoellner
61
0
0
20 Jun 2024
What Teaches Robots to Walk, Teaches Them to Trade too -- Regime Adaptive Execution using Informed Data and LLMs
Raeid Saqur
54
3
0
20 Jun 2024
Advantage Alignment Algorithms
Juan Agustin Duque
Milad Aghajohari
Tim Cooijmans
Tianyu Zhang
Rameswar Panda
Gauthier Gidel
Aaron Courville
37
0
0
20 Jun 2024
Design Optimization of NOMA Aided Multi-STAR-RIS for Indoor Environments: A Convex Approximation Imitated Reinforcement Learning Approach
Y. Park
Sheikh Salman Hassan
Y. Tun
Eui-nam Huh
Walid Saad
Choong Seon Hong
40
2
0
19 Jun 2024
The Limits of Pure Exploration in POMDPs: When the Observation Entropy is Enough
Riccardo Zamboni
Duilio Cirino
Marcello Restelli
Mirco Mutti
66
4
0
18 Jun 2024
Robust Model-Based Reinforcement Learning with an Adversarial Auxiliary Model
Siemen Herremans
Ali Anwar
Siegfried Mercelis
52
2
0
14 Jun 2024
e-COP : Episodic Constrained Optimization of Policies
Akhil Agnihotri
Rahul Jain
Deepak Ramachandran
Sahil Singla
OffRL
42
1
0
13 Jun 2024
Unifying Interpretability and Explainability for Alzheimer's Disease Progression Prediction
Raja Farrukh Ali
Stephanie Milani
John Woods
Emmanuel Adenij
Ayesha Farooq
Clayton Mansel
Jeffrey Burns
William Hsu
38
0
0
11 Jun 2024
CDSA: Conservative Denoising Score-based Algorithm for Offline Reinforcement Learning
Zeyuan Liu
Kai Yang
Xiu Li
OffRL
64
0
0
11 Jun 2024
Adaptive Opponent Policy Detection in Multi-Agent MDPs: Real-Time Strategy Switch Identification Using Running Error Estimation
Mohidul Haque Mridul
Mohammad Foysal Khan
Redwan Ahmed Rizvee
Md. Mosaddek Khan
AAML
26
0
0
10 Jun 2024
An Improved Empirical Fisher Approximation for Natural Gradient Descent
Xiaodong Wu
Wenyi Yu
Chao Zhang
Philip Woodland
47
4
0
10 Jun 2024
GenSafe: A Generalizable Safety Enhancer for Safe Reinforcement Learning Algorithms Based on Reduced Order Markov Decision Process Model
Zhehua Zhou
Xuan Xie
Jiayang Song
Zhan Shu
Lei Ma
62
1
0
06 Jun 2024
Transductive Off-policy Proximal Policy Optimization
Yaozhong Gan
Renye Yan
Xiaoyang Tan
Zhe Wu
Junliang Xing
OffRL
42
2
0
06 Jun 2024
Reflective Policy Optimization
Yaozhong Gan
Renye Yan
Zhe Wu
Junliang Xing
55
1
0
06 Jun 2024
Reinforcement learning-based architecture search for quantum machine learning
Frederic Rapp
D. Kreplin
Marco F. Huber
M. Roth
AI4CE
41
5
0
04 Jun 2024
How to Explore with Belief: State Entropy Maximization in POMDPs
Riccardo Zamboni
Duilio Cirino
Marcello Restelli
Mirco Mutti
59
3
0
04 Jun 2024
Test-Time Regret Minimization in Meta Reinforcement Learning
Mirco Mutti
Aviv Tamar
31
4
0
04 Jun 2024
Learning the Target Network in Function Space
Kavosh Asadi
Yao Liu
Shoham Sabach
Ming Yin
Rasool Fakoor
70
0
0
03 Jun 2024
When to Sense and Control? A Time-adaptive Approach for Continuous-Time RL
Lenart Treven
Bhavya Sukhija
Yarden As
Florian Dorfler
Andreas Krause
68
1
0
03 Jun 2024
Learning-based legged locomotion; state of the art and future perspectives
Sehoon Ha
Joonho Lee
M. van de Panne
Zhaoming Xie
Wenhao Yu
Majid Khadiv
53
17
0
03 Jun 2024
Deep Reinforcement Learning for Sim-to-Real Policy Transfer of VTOL-UAVs Offshore Docking Operations
A. M. Ali
Aryaman Gupta
Hashim A. Hashim
OffRL
49
7
0
02 Jun 2024
Shared-unique Features and Task-aware Prioritized Sampling on Multi-task Reinforcement Learning
Po-Shao Lin
Jia-Fong Yeh
Yi-Ting Chen
Winston H. Hsu
47
0
0
02 Jun 2024
Learning Multimodal Behaviors from Scratch with Diffusion Policy Gradient
Zechu Li
Rickmer Krohn
Tao Chen
Anurag Ajay
Pulkit Agrawal
Georgia Chalvatzaki
DiffM
67
12
0
02 Jun 2024
SUBER: An RL Environment with Simulated Human Behavior for Recommender Systems
Nathan Corecco
Giorgio Piatti
Luca A. Lanzendörfer
Flint Xiaofeng Fan
Roger Wattenhofer
OffRL
39
2
0
01 Jun 2024
Diffusion Actor-Critic: Formulating Constrained Policy Iteration as Diffusion Noise Regression for Offline Reinforcement Learning
Linjiajie Fang
Ruoxue Liu
Jing Zhang
Wenjia Wang
Bing-Yi Jing
OffRL
64
5
0
31 May 2024
Adaptive Advantage-Guided Policy Regularization for Offline Reinforcement Learning
Tenglong Liu
Yang Li
Yixing Lan
Hao Gao
Wei Pan
Xin Xu
OffRL
41
5
0
30 May 2024
Bilevel reinforcement learning via the development of hyper-gradient without lower-level convexity
Yan Yang
Bin Gao
Ya-xiang Yuan
74
2
0
30 May 2024
OMPO: A Unified Framework for RL under Policy and Dynamics Shifts
Yu-Juan Luo
Tianying Ji
Gang Hua
Jianwei Zhang
Huazhe Xu
Xianyuan Zhan
OffRL
71
3
0
29 May 2024
Statistical Context Detection for Deep Lifelong Reinforcement Learning
Jeffery Dick
Saptarshi Nath
Christos Peridis
Eseoghene Ben-Iwhiwhu
Soheil Kolouri
Andrea Soltoggio
OffRL
66
0
0
29 May 2024
Spectral-Risk Safe Reinforcement Learning with Convergence Guarantees
Dohyeong Kim
Taehyun Cho
Seung Han
Hojun Chung
Kyungjae Lee
Songhwai Oh
39
1
0
29 May 2024
Counterfactual Explanations for Multivariate Time-Series without Training Datasets
Xiangyu Sun
Raquel Aoki
Kevin H. Wilson
46
1
0
28 May 2024
Safe Multi-Agent Reinforcement Learning with Bilevel Optimization in Autonomous Driving
Zhi Zheng
Shangding Gu
48
3
0
28 May 2024
Mollification Effects of Policy Gradient Methods
Tao Wang
Sylvia Herbert
Sicun Gao
59
1
0
28 May 2024
Linear Function Approximation as a Computationally Efficient Method to Solve Classical Reinforcement Learning Challenges
Hari Srikanth
20
0
0
27 May 2024
Matrix Low-Rank Trust Region Policy Optimization
Sergio Rozada
Antonio G. Marques
58
0
0
27 May 2024
Symmetric Reinforcement Learning Loss for Robust Learning on Diverse Tasks and Model Scales
Ju-Seung Byun
Andrew Perrault
34
1
0
27 May 2024