1502.05477
Cited By
Trust Region Policy Optimization
19 February 2015
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
Papers citing
"Trust Region Policy Optimization"
50 / 3,103 papers shown
Title
Measures of Variability for Risk-averse Policy Gradient
Yudong Luo
Yangchen Pan
Jiaqi Tan
Pascal Poupart
54
0
0
15 Apr 2025
Reasoning without Regret
Tarun Chitra
OffRL
LRM
47
0
0
14 Apr 2025
Follow the STARs: Dynamic ω-Regular Shielding of Learned Policies
Ashwani Anand
Satya Prakash Nayak
Ritam Raha
Anne-Kathrin Schmuck
17
0
0
11 Apr 2025
State Estimation Using Particle Filtering in Adaptive Machine Learning Methods: Integrating Q-Learning and NEAT Algorithms with Noisy Radar Measurements
Wonjin Song
Feng Bao
41
0
0
10 Apr 2025
Deep Reinforcement Learning for Day-to-day Dynamic Tolling in Tradable Credit Schemes
Xiaoyi Wu
Ravi Seshadri
Filipe Rodrigues
Carlos Lima Azevedo
40
0
0
10 Apr 2025
An Information-Geometric Approach to Artificial Curiosity
Alexander Nedergaard
Pablo A. Morales
38
0
0
08 Apr 2025
A Reinforcement Learning Method for Environments with Stochastic Variables: Post-Decision Proximal Policy Optimization with Dual Critic Networks
L. Felizardo
Edoardo Fadda
Paolo Brandimarte
E. Del-Moral-Hernandez
Mariá Cristina Vasconcelos Nascimento
OffRL
42
0
0
07 Apr 2025
Optimizing UAV Aerial Base Station Flights Using DRL-based Proximal Policy Optimization
Mario Rico Ibanez
Azim Akhtarshenas
David López-Pérez
Giovanni Geraci
30
0
0
04 Apr 2025
Hierarchical Policy-Gradient Reinforcement Learning for Multi-Agent Shepherding Control of Non-Cohesive Targets
Stefano Covone
Italo Napolitano
F. D. Lellis
Mario di Bernardo
33
0
0
03 Apr 2025
How to Adapt Control Barrier Functions? A Learning-Based Approach with Applications to a VTOL Quadplane
Taekyung Kim
Randal W. Beard
Dimitra Panagou
44
0
0
03 Apr 2025
Ordering-based Conditions for Global Convergence of Policy Gradient Methods
Jincheng Mei
Bo Dai
Alekh Agarwal
Mohammad Ghavamzadeh
Csaba Szepesvári
Dale Schuurmans
73
4
0
02 Apr 2025
Probabilistic Curriculum Learning for Goal-Based Reinforcement Learning
Llewyn Salt
Marcus Gallagher
41
1
0
02 Apr 2025
Hawkeye: Efficient Reasoning with Model Collaboration
Jianshu She
Z. Li
Zhemin Huang
Qi Li
Peiran Xu
Haonan Li
Qirong Ho
LRM
69
3
0
01 Apr 2025
Sim-is-More: Randomizing HW-NAS with Synthetic Devices
F. Capuano
Gabriele Tiboni
Niccolò Cavagnero
Giuseppe Averta
50
0
0
01 Apr 2025
Nuclear Microreactor Control with Deep Reinforcement Learning
Leo Tunkle
Kamal Abdulraheem
Linyu Lin
M. Radaideh
55
0
0
31 Mar 2025
RL2Grid: Benchmarking Reinforcement Learning in Power Grid Operations
Enrico Marchesini
Benjamin Donnot
Constance Crozier
Ian Dytham
Christian Merz
Lars Schewe
Nico Westerbeck
Cathy Wu
Antoine Marot
P. Donti
OffRL
64
1
0
29 Mar 2025
DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness
Ruining Li
Chuanxia Zheng
Christian Rupprecht
Andrea Vedaldi
49
2
0
28 Mar 2025
On the Mistaken Assumption of Interchangeable Deep Reinforcement Learning Implementations
Rajdeep Singh Hundal
Yan Xiao
Xiaochun Cao
Jin Song Dong
Manuel Rigger
76
0
0
28 Mar 2025
Analysis of On-policy Policy Gradient Methods under the Distribution Mismatch
Weizhen Wang
Jianping He
Xiaoming Duan
46
0
0
28 Mar 2025
Look Before Leap: Look-Ahead Planning with Uncertainty in Reinforcement Learning
Yongshuai Liu
Xin Liu
103
1
0
26 Mar 2025
One Framework to Rule Them All: Unifying RL-Based and RL-Free Methods in RLHF
Xin Cai
54
1
0
25 Mar 2025
Adventurer: Exploration with BiGAN for Deep Reinforcement Learning
Yongshuai Liu
Xin Liu
GAN
134
2
0
24 Mar 2025
Option Discovery Using LLM-guided Semantic Hierarchical Reinforcement Learning
Chak Lam Shek
Pratap Tokekar
56
0
0
24 Mar 2025
Evolutionary Policy Optimization
Jianren Wang
Yifan Su
Abhinav Gupta
Deepak Pathak
55
0
0
24 Mar 2025
AED: Automatic Discovery of Effective and Diverse Vulnerabilities for Autonomous Driving Policy with Large Language Models
Le Qiu
Zelai Xu
Qixin Tan
Wenhao Tang
Chao Yu
Yu Wang
AAML
71
0
0
24 Mar 2025
Optimizing Navigation And Chemical Application in Precision Agriculture With Deep Reinforcement Learning And Conditional Action Tree
Mahsa Khosravi
Zhanhong Jiang
Joshua R. Waite
Sarah Jones
Hernan Torres
Arti Singh
Baskar Ganapathysubramanian
Asheesh Kumar Singh
Soumik Sarkar
51
0
0
23 Mar 2025
RL4Med-DDPO: Reinforcement Learning for Controlled Guidance Towards Diverse Medical Image Generation using Vision-Language Foundation Models
Parham Saremi
Amar Kumar
Mohammed Mohammed
Zahra Tehraninasab
Tal Arbel
LM&MA
MedIm
53
1
0
20 Mar 2025
Active management of battery degradation in wireless sensor network using deep reinforcement learning for group battery replacement
Jong-Hyun Jeong
Hongki Jo
Qiang Zhou
Tahsin Afroz Hoque Nishat
Lang Wu
45
1
0
20 Mar 2025
Reinforcement Learning-based Heuristics to Guide Domain-Independent Dynamic Programming
Minori Narita
Ryo Kuroiwa
J. Christopher Beck
60
0
0
20 Mar 2025
Predicting Multi-Agent Specialization via Task Parallelizability
Elizabeth Mieczkowski
Ruaridh Mon-Williams
Neil R. Bramley
Christopher G. Lucas
Natalia Vélez
Thomas Griffiths
56
1
0
19 Mar 2025
Synchronous vs Asynchronous Reinforcement Learning in a Real World Robot
Ali Parsaee
Fahim Shahriar
Chuxin He
Ruiqing Tan
OffRL
65
0
0
17 Mar 2025
ReMA: Learning to Meta-think for LLMs with Multi-Agent Reinforcement Learning
Bo Liu
Yunxiang Li
Yangqiu Song
Hanjing Wang
Linyi Yang
...
Jun Wang
Weinan Zhang
Shuyue Hu
Ying Wen
LLMAG
KELM
LRM
AI4CE
111
7
0
12 Mar 2025
HASARD: A Benchmark for Vision-Based Safe Reinforcement Learning in Embodied Agents
Tristan Tomilin
Meng Fang
Mykola Pechenizkiy
67
0
0
11 Mar 2025
Safe Explicable Policy Search
Akkamahadevi Hanni
Jonathan Montaño
Yu Zhang
69
0
0
10 Mar 2025
Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning
Yuxiao Qu
Matthew Y. R. Yang
Amrith Rajagopal Setlur
Lewis Tunstall
E. Beeching
Ruslan Salakhutdinov
Aviral Kumar
OffRL
85
24
0
10 Mar 2025
Probabilistic Shielding for Safe Reinforcement Learning
Edwin Hamel-De le Court
Francesco Belardinelli
Alex W. Goodall
49
0
0
09 Mar 2025
Adversarial Policy Optimization for Offline Preference-based Reinforcement Learning
Hyungkyu Kang
Min-hwan Oh
OffRL
55
0
0
07 Mar 2025
Is Bellman Equation Enough for Learning Control?
Haoxiang You
Lekan Molu
Ian Abraham
75
0
0
04 Mar 2025
Overcoming Non-stationary Dynamics with Evidential Proximal Policy Optimization
Abdullah Akgul
Gulcin Baykal
Manuel Haußmann
M. Kandemir
63
0
0
03 Mar 2025
Differentiable Information Enhanced Model-Based Reinforcement Learning
Xiaoyuan Zhang
Xinyan Cai
Bo Liu
Weidong Huang
Song-Chun Zhu
Siyuan Qi
Y. Yang
58
0
0
03 Mar 2025
A Simple and Effective Reinforcement Learning Method for Text-to-Image Diffusion Fine-tuning
Shashank Gupta
Chaitanya Ahuja
Tsung-Yu Lin
Sreya Dutta Roy
Harrie Oosterhuis
Maarten de Rijke
Satya Narayan Shukla
61
1
0
02 Mar 2025
BodyGen: Advancing Towards Efficient Embodiment Co-Design
Haofei Lu
Zhe Wu
Junliang Xing
Jianshu Li
Ruoyu Li
Zhe Li
Yuanchun Shi
51
0
0
01 Mar 2025
Highly Parallelized Reinforcement Learning Training with Relaxed Assignment Dependencies
Zhouyu He
Peng Qiao
Rongchun Li
Yong Dou
Yusong Tan
OffRL
93
0
0
27 Feb 2025
Unifying Model Predictive Path Integral Control, Reinforcement Learning, and Diffusion Models for Optimal Control and Planning
Yankai Li
Mo Chen
AI4CE
63
0
0
27 Feb 2025
Distilling Reinforcement Learning Algorithms for In-Context Model-Based Planning
Jaehyeon Son
Soochan Lee
Gunhee Kim
OffRL
88
1
0
26 Feb 2025
Safe Multi-Agent Navigation guided by Goal-Conditioned Safe Reinforcement Learning
Meng Feng
Viraj Parimi
B. Williams
82
1
0
25 Feb 2025
Enhancing PPO with Trajectory-Aware Hybrid Policies
Qisai Liu
Zhanhong Jiang
Hsin-Jung Yang
Mahsa Khosravi
Joshua R. Waite
Soumik Sarkar
62
0
0
21 Feb 2025
RobotIQ: Empowering Mobile Robots with Human-Level Planning for Real-World Execution
Emmanuel K. Raptis
Athanasios Ch. Kapoutsis
Elias B. Kosmatopoulos
LM&Ro
96
0
0
18 Feb 2025
Reward-Safety Balance in Offline Safe RL via Diffusion Regularization
Junyu Guo
Zhi Zheng
Donghao Ying
Ming Jin
Shangding Gu
C. Spanos
Javad Lavaei
OffRL
94
0
0
18 Feb 2025
Small steps no more: Global convergence of stochastic gradient bandits for arbitrary learning rates
Jincheng Mei
Bo Dai
Alekh Agarwal
Sharan Vaswani
Anant Raj
Csaba Szepesvári
Dale Schuurmans
93
0
0
11 Feb 2025