ResearchTrend.AI
Trust Region Policy Optimization

19 February 2015
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel

Papers citing "Trust Region Policy Optimization"

50 / 3,103 papers shown
Title
Human-Readable Programs as Actors of Reinforcement Learning Agents Using Critic-Moderated Evolution
Senne Deproost
Denis Steckelmacher
Ann Nowé
51
0
0
29 Oct 2024
Dual-Agent Deep Reinforcement Learning for Dynamic Pricing and Replenishment
Yi Zheng
Zehao Li
Peng Jiang
Yijie Peng
29
0
0
28 Oct 2024
Diff-Instruct*: Towards Human-Preferred One-step Text-to-image Generative Models
Weijian Luo
C. Zhang
Debing Zhang
Zhengyang Geng
35
4
0
28 Oct 2024
Adversarial Constrained Policy Optimization: Improving Constrained Reinforcement Learning by Adapting Budgets
Jianmina Ma
Jingtian Ji
Yue Gao
36
0
0
28 Oct 2024
Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models
Michael Noukhovitch
Shengyi Huang
Sophie Xhonneux
Arian Hosseini
Rishabh Agarwal
Rameswar Panda
OffRL
91
6
0
23 Oct 2024
Benchmarking Smoothness and Reducing High-Frequency Oscillations in Continuous Control Policies
Guilherme Christmann
Ying-Sheng Luo
Hanjaya Mandala
Wei-Chao Chen
40
0
0
22 Oct 2024
Online Reinforcement Learning with Passive Memory
Anay Pattanaik
Lav R. Varshney
CLL
OffRL
33
0
0
18 Oct 2024
Streaming Deep Reinforcement Learning Finally Works
Mohamed Elsayed
Gautham Vasan
A. R. Mahmood
OffRL
60
4
0
18 Oct 2024
Knowledge Transfer from Simple to Complex: A Safe and Efficient Reinforcement Learning Framework for Autonomous Driving Decision-Making
Rongliang Zhou
Jiakun Huang
Mingjun Li
Hepeng Li
Haotian Cao
Xiaolin Song
34
0
0
18 Oct 2024
Truncating Trajectories in Monte Carlo Policy Evaluation: an Adaptive Approach
Riccardo Poiani
Nicole Nobili
Alberto Maria Metelli
Marcello Restelli
34
1
0
17 Oct 2024
Mitigating Suboptimality of Deterministic Policy Gradients in Complex Q-functions
Ayush Jain
Norio Kosaka
Xinhu Li
Kyung-Min Kim
Erdem Bıyık
Joseph J. Lim
OffRL
26
0
0
15 Oct 2024
Learning Smooth Humanoid Locomotion through Lipschitz-Constrained Policies
Zixuan Chen
Xialin He
Yen-Jen Wang
Qiayuan Liao
Yanjie Ze
...
S. Sastry
Jiajun Wu
Koushil Sreenath
Saurabh Gupta
Xue Bin Peng
71
17
0
15 Oct 2024
Exploiting Risk-Aversion and Size-dependent fees in FX Trading with Fitted Natural Actor-Critic
Vito Alessandro Monaco
Antonio Riva
Luca Sabbioni
L. Bisi
Edoardo Vittori
Marco Pinciroli
Michele Trapletti
Marcello Restelli
19
0
0
15 Oct 2024
Learning Agents With Prioritization and Parameter Noise in Continuous State and Action Space
Rajesh Mangannavar
Gopalakrishnan Srinivasaraghavan
25
2
0
15 Oct 2024
Improving the Language Understanding Capabilities of Large Language Models Using Reinforcement Learning
Bokai Hu
Sai Ashish Somayajula
Xin Pan
Zihan Huang
Pengtao Xie
OffRL
26
1
0
14 Oct 2024
Reinforcement Learning For Quadrupedal Locomotion: Current Advancements And Future Perspectives
Maurya Gurram
Prakash Kumar Uttam
Shantipal S. Ohol
OffRL
60
0
0
14 Oct 2024
Meta-Reinforcement Learning with Universal Policy Adaptation: Provable Near-Optimality under All-task Optimum Comparator
Siyuan Xu
Minghui Zhu
OffRL
37
1
0
13 Oct 2024
TOP-ERL: Transformer-based Off-Policy Episodic Reinforcement Learning
Ge Li
Dong Tian
Hongyi Zhou
Xinkai Jiang
Rudolf Lioutikov
Gerhard Neumann
OffRL
318
3
0
12 Oct 2024
Multi-Agent Actor-Critics in Autonomous Cyber Defense
Mingjun Wang
Remington Dechene
36
0
0
11 Oct 2024
Can we hop in general? A discussion of benchmark selection and design using the Hopper environment
C. Voelcker
Marcel Hussing
Eric Eaton
OffRL
36
3
0
11 Oct 2024
Improved Sample Complexity for Global Convergence of Actor-Critic Algorithms
Navdeep Kumar
Priyank Agrawal
Giorgia Ramponi
Kfir Y. Levy
Shie Mannor
48
0
0
11 Oct 2024
E-Motion: Future Motion Simulation via Event Sequence Diffusion
Song Wu
Zhiyu Zhu
Junhui Hou
Guangming Shi
Jinjian Wu
DiffM
VGen
52
1
0
11 Oct 2024
Exploring Natural Language-Based Strategies for Efficient Number Learning in Children through Reinforcement Learning
Tirthankar Mittra
31
0
0
10 Oct 2024
Avoiding mode collapse in diffusion models fine-tuned with reinforcement learning
Roberto Barceló
Cristóbal Alcázar
Felipe Tobar
45
3
0
10 Oct 2024
Solving Multi-Goal Robotic Tasks with Decision Transformer
Paul Gajewski
Dominik Zurek
Marcin Pietroń
Kamil Faber
OffRL
37
1
0
08 Oct 2024
Learning in complex action spaces without policy gradients
Arash Tavakoli
Sina Ghiassian
Nemanja Rakićević
OffRL
39
0
0
08 Oct 2024
Reinforcement Learning From Imperfect Corrective Actions And Proxy Rewards
Zhaohui Jiang
Xuening Feng
Paul Weng
Yifei Zhu
Yan Song
Tianze Zhou
Yujing Hu
Tangjie Lv
Changjie Fan
69
1
0
08 Oct 2024
Mastering Chinese Chess AI (Xiangqi) Without Search
Yu Chen
Juntong Lin
Zhichao Shu
19
0
0
07 Oct 2024
Nonasymptotic Analysis of Stochastic Gradient Descent with the Richardson-Romberg Extrapolation
Marina Sheshukova
Denis Belomestny
Alain Durmus
Eric Moulines
Alexey Naumov
S. Samsonov
51
1
0
07 Oct 2024
Bisimulation metric for Model Predictive Control
Yutaka Shimizu
Masayoshi Tomizuka
50
0
0
06 Oct 2024
Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF
Zhaolin Gao
Wenhao Zhan
Jonathan D. Chang
Gokul Swamy
Kianté Brantley
Jason D. Lee
Wen Sun
OffRL
81
4
0
06 Oct 2024
GreenLight-Gym: Reinforcement learning benchmark environment for control of greenhouse production systems
Bart van Laatum
Eldert J. van Henten
Sjoerd Boersma
OffRL
79
0
0
06 Oct 2024
Abstract Reward Processes: Leveraging State Abstraction for Consistent Off-Policy Evaluation
Shreyas Chaudhari
Ameet Deshpande
Bruno Castro da Silva
Philip S. Thomas
OffRL
46
1
0
03 Oct 2024
C-MORL: Multi-Objective Reinforcement Learning through Efficient Discovery of Pareto Front
Ruohong Liu
Yuxin Pan
Linjie Xu
Lei Song
Jiang Bian
Pengcheng You
Yize Chen
48
1
0
03 Oct 2024
MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions
Yekun Chai
Haoran Sun
Huang Fang
Shuohuan Wang
Yu Sun
Hua Wu
294
1
0
03 Oct 2024
VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment
Amirhossein Kazemnejad
Milad Aghajohari
Eva Portelance
Alessandro Sordoni
Siva Reddy
Rameswar Panda
Nicolas Le Roux
OffRL
LRM
36
30
0
02 Oct 2024
Sampling from Energy-based Policies using Diffusion
V. Jain
Tara Akhound-Sadegh
Siamak Ravanbakhsh
DiffM
79
2
0
02 Oct 2024
Dual Approximation Policy Optimization
Zhihan Xiong
Maryam Fazel
Lin Xiao
49
1
0
02 Oct 2024
Absolute State-wise Constrained Policy Optimization: High-Probability State-wise Constraints Satisfaction
Weiye Zhao
Feihan Li
Yifan Sun
Yujie Wang
Rui Chen
Tianhao Wei
Changliu Liu
35
0
0
02 Oct 2024
HybridFlow: A Flexible and Efficient RLHF Framework
Guangming Sheng
Chi Zhang
Zilingfeng Ye
Xibin Wu
Wang Zhang
Ru Zhang
Size Zheng
Haibin Lin
Chuan Wu
AI4CE
62
127
0
28 Sep 2024
Double Actor-Critic with TD Error-Driven Regularization in Reinforcement Learning
Haohui Chen
Zhiyong Chen
Aoxiang Liu
Wentuo Fang
OffRL
44
0
0
28 Sep 2024
Autoregressive Policy Optimization for Constrained Allocation Tasks
David Winkel
Niklas Strauß
Maximilian Bernhard
Zongyue Li
Thomas Seidl
Matthias Schubert
41
0
0
27 Sep 2024
Revisiting inverse Hessian vector products for calculating influence functions
Yegor Klochkov
Yang Liu
TDI
LLMSV
53
1
0
25 Sep 2024
A Survey for Deep Reinforcement Learning Based Network Intrusion Detection
Wanrong Yang
Alberto Acuto
Yihang Zhou
Dominik Wojtczak
OffRL
56
3
0
25 Sep 2024
Stage-Wise Reward Shaping for Acrobatic Robots: A Constrained Multi-Objective Reinforcement Learning Approach
Dohyeong Kim
Hyeokjin Kwon
Junseok Kim
Gunmin Lee
Songhwai Oh
42
6
0
24 Sep 2024
R-AIF: Solving Sparse-Reward Robotic Tasks from Pixels with Active Inference and World Models
Viet Dung Nguyen
Zhizhuo Yang
Christopher L. Buckley
Alexander Ororbia
56
3
0
21 Sep 2024
On-policy Actor-Critic Reinforcement Learning for Multi-UAV Exploration
A. Farid
Jafar Roshanian
Malek Mouhoub
43
1
0
17 Sep 2024
Vision-driven UAV River Following: Benchmarking with Safe Reinforcement Learning
Zihan Wang
N. Mahmoudian
47
2
0
13 Sep 2024
Design Optimization of Nuclear Fusion Reactor through Deep Reinforcement Learning
Jinsu Kim
J. Seo
AI4CE
16
0
0
12 Sep 2024
Autonomous Vehicle Controllers From End-to-End Differentiable Simulation
Asen Nachkov
Danda Pani Paudel
Luc Van Gool
53
0
0
12 Sep 2024