Trust Region Policy Optimization


19 February 2015
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
arXiv:1502.05477

Papers citing "Trust Region Policy Optimization"

50 / 3,103 papers shown
Task Offloading in Vehicular Edge Computing using Deep Reinforcement Learning: A Survey
Ashab Uddin
Ahmed Hamdi Sakr
Ning Zhang
OffRL
67
0
0
10 Feb 2025
Mirror Descent Actor Critic via Bounded Advantage Learning
Ryo Iwaki
101
0
0
06 Feb 2025
Synthesis of Model Predictive Control and Reinforcement Learning: Survey and Classification
Rudolf Reiter
Jasper Hoffmann
D. Reinhardt
Florian Messerer
Katrin Baumgärtner
Shamburaj Sawant
Joschka Boedecker
Moritz Diehl
S. Gros
97
5
0
04 Feb 2025
Circular Microalgae-Based Carbon Control for Net Zero
Federico Zocco
Joan García
W. Haddad
134
0
0
04 Feb 2025
Score as Action: Fine-Tuning Diffusion Generative Models by Continuous-time Reinforcement Learning
Hanyang Zhao
Haoxian Chen
Ji Zhang
D. Yao
Wenpin Tang
77
0
0
03 Feb 2025
Fine-Tuning Discrete Diffusion Models with Policy Gradient Methods
Oussama Zekri
Nicolas Boullé
DiffM
85
3
0
03 Feb 2025
Langevin Soft Actor-Critic: Efficient Exploration through Uncertainty-Driven Critic Learning
Haque Ishfaq
Guangyuan Wang
Sami Nur Islam
Doina Precup
83
2
0
29 Jan 2025
Evidence on the Regularisation Properties of Maximum-Entropy Reinforcement Learning
Rémy Hosseinkhan Boucher
Onofrio Semeraro
L. Mathelin
97
0
0
28 Jan 2025
Benchmarking Model Predictive Control and Reinforcement Learning Based Control for Legged Robot Locomotion in MuJoCo Simulation
Shivayogi Akki
Tan Chen
53
0
0
28 Jan 2025
Towards General-Purpose Model-Free Reinforcement Learning
Scott Fujimoto
P. D'Oro
Amy Zhang
Yuandong Tian
Michael Rabbat
OffRL
49
3
0
28 Jan 2025
Low-altitude Friendly-Jamming for Satellite-Maritime Communications via Generative AI-enabled Deep Reinforcement Learning
Jiawei Huang
Aimin Wang
Geng Sun
Jiahui Li
Jiacheng Wang
Dusit Niyato
Victor C. M. Leung
67
0
0
28 Jan 2025
Divergence-Augmented Policy Optimization
Qing Wang
Yingru Li
Jiechao Xiong
Tong Zhang
OffRL
62
16
0
28 Jan 2025
ABPT: Amended Backpropagation through Time with Partially Differentiable Rewards
Fanxing Li
Fangyu Sun
Tianbao Zhang
Danping Zou
46
0
0
24 Jan 2025
State Combinatorial Generalization In Decision Making With Conditional Diffusion Models
Xintong Duan
Yutong He
Fahim Tajwar
Wen-Tse Chen
Ruslan Salakhutdinov
Jeff Schneider
OffRL
AI4CE
109
0
0
22 Jan 2025
An Offline Multi-Agent Reinforcement Learning Framework for Radio Resource Management
Eslam Eldeeb
Hirley Alves
OffRL
92
0
0
22 Jan 2025
HEPPO: Hardware-Efficient Proximal Policy Optimization -- A Universal Pipelined Architecture for Generalized Advantage Estimation
Hazem Taha
Ameer M. S. Abdelhadi
52
1
0
22 Jan 2025
Stability Enhancement in Reinforcement Learning via Adaptive Control Lyapunov Function
Donghe Chen
Han Wang
Lin Cheng
Shengping Gong
269
0
0
18 Jan 2025
Average-Reward Reinforcement Learning with Entropy Regularization
Jacob Adamczyk
Volodymyr Makarenko
Stas Tiomkin
R. Kulkarni
OOD
66
2
0
17 Jan 2025
Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment
Chaoqi Wang
Zhuokai Zhao
Yibo Jiang
Zhaorun Chen
Chen Zhu
...
Jiayi Liu
Lizhu Zhang
Xiangjun Fan
Hao Ma
Sinong Wang
98
4
0
16 Jan 2025
CuRLA: Curriculum Learning Based Deep Reinforcement Learning for Autonomous Driving
Bhargava Uppuluri
Anjel Patel
Neil Mehta
Sridhar Kamath
Pratyush Chakraborty
62
0
0
10 Jan 2025
CoMAL: Collaborative Multi-Agent Large Language Models for Mixed-Autonomy Traffic
Huaiyuan Yao
Longchao Da
Vishnu Nandam
Justin Turnau
Zhiwei Liu
Linsey Pang
Hua Wei
LLMAG
75
6
0
10 Jan 2025
Improving GenIR Systems Based on User Feedback
Qingyao Ai
Zhicheng Dou
Min Zhang
277
0
0
06 Jan 2025
Learn A Flexible Exploration Model for Parameterized Action Markov Decision Processes
Zijian Wang
Bin Wang
Mingwen Shao
Hongbo Dou
Boxiang Tao
62
0
0
06 Jan 2025
RFPPO: Motion Dynamic RRT based Fluid Field - PPO for Dynamic TF/TA Routing Planning
Rongkun Xue
Jing Yang
Yuyang Jiang
Yiming Feng
Zi Yang
44
0
0
31 Dec 2024
Marvel: Accelerating Safe Online Reinforcement Learning with Finetuned Offline Policy
Keru Chen
Honghao Wei
Zhigang Deng
Sen Lin
OffRL
OnRL
113
0
0
31 Dec 2024
Graph-attention-based Casual Discovery with Trust Region-navigated Clipping Policy Optimization
Shixuan Liu
Yanghe Feng
Keyu Wu
Guangquan Cheng
Jincai Huang
Zhong Liu
CML
72
7
0
27 Dec 2024
Attention-Enhanced Short-Time Wiener Solution for Acoustic Echo Cancellation
Fei Zhao
Xueliang Zhang
48
0
0
25 Dec 2024
SMAC-Hard: Enabling Mixed Opponent Strategy Script and Self-play on SMAC
Yue Deng
Yan Yu
Weiyu Ma
Zirui Wang
Wenhui Zhu
Jian Zhao
Yin Zhang
AAML
50
1
0
23 Dec 2024
Optimizing Low-Speed Autonomous Driving: A Reinforcement Learning Approach to Route Stability and Maximum Speed
Benny Bao-Sheng Li
Elena Wu
Hins Shao-Xuan Yang
Nicky Yao-Jin Liang
80
0
0
20 Dec 2024
Practicable Black-box Evasion Attacks on Link Prediction in Dynamic Graphs -- A Graph Sequential Embedding Method
Jiate Li
Meng Pang
Binghui Wang
AAML
84
1
0
17 Dec 2024
Achieving Collective Welfare in Multi-Agent Reinforcement Learning via Suggestion Sharing
Yue Jin
Shuangqing Wei
Giovanni Montana
99
0
0
16 Dec 2024
RL-LLM-DT: An Automatic Decision Tree Generation Method Based on RL Evaluation and LLM Enhancement
Junjie Lin
Jian Zhao
Lin Liu
Yue Deng
Youpeng Zhao
Lanxiao Huang
Xia Lin
Wengang Zhou
Haoyang Li
89
0
0
16 Dec 2024
Safe Reinforcement Learning using Finite-Horizon Gradient-based Estimation
Juntao Dai
Yaodong Yang
Qian Zheng
Gang Pan
OffRL
99
2
0
15 Dec 2024
Efficient Diversity-Preserving Diffusion Alignment via Gradient-Informed GFlowNets
Zhen Liu
Tim Z. Xiao
Weiyang Liu
Yoshua Bengio
Dinghuai Zhang
125
5
0
10 Dec 2024
Reinforcement Learning Enhanced LLMs: A Survey
Shuhe Wang
Shengyu Zhang
Jing Zhang
Runyi Hu
Xiaoya Li
Tianwei Zhang
Jiwei Li
Fei Wu
G. Wang
Eduard H. Hovy
OffRL
141
9
0
05 Dec 2024
Conformal Symplectic Optimization for Stable Reinforcement Learning
Yao Lyu
Xiangteng Zhang
Shengbo Eben Li
Jingliang Duan
Letian Tao
Qing Xu
Lei He
Keqiang Li
83
0
0
03 Dec 2024
Application of Soft Actor-Critic Algorithms in Optimizing Wastewater Treatment with Time Delays Integration
Esmaeel Mohammadi
D. O. Arroyo
A. A. Hansen
Mikkel Stokholm-Bjerregaard
S. Gros
Akhil S. Anand
Petar Durdevic
78
0
0
27 Nov 2024
Accelerating Proximal Policy Optimization Learning Using Task Prediction for Solving Environments with Delayed Rewards
A. Ahmad
Mehdi Kermanshah
Kevin J. Leahy
Zachary Serlin
H. Siu
Makai Mann
C. Vasile
Roberto Tron
C. Belta
OffRL
79
0
0
26 Nov 2024
Creating Hierarchical Dispositions of Needs in an Agent
Tofara Moyo
100
0
0
23 Nov 2024
Non-Adversarial Inverse Reinforcement Learning via Successor Feature Matching
A. Jain
Harley Wiltzer
Jesse Farebrother
Irina Rish
Glen Berseth
Sanjiban Choudhury
70
1
0
11 Nov 2024
Optimal Execution with Reinforcement Learning
Yadh Hafsi
Edoardo Vittori
33
0
0
10 Nov 2024
Acceleration for Deep Reinforcement Learning using Parallel and Distributed Computing: A Survey
Zhihong Liu
Xin Xu
Peng Qiao
Dongsheng Li
OffRL
43
3
0
08 Nov 2024
Structure Matters: Dynamic Policy Gradient
Sara Klein
Xiangyuan Zhang
Tamer Basar
Simon Weissmann
Leif Döring
44
0
0
07 Nov 2024
Sharp Analysis for KL-Regularized Contextual Bandits and RLHF
Heyang Zhao
Chenlu Ye
Quanquan Gu
Tong Zhang
OffRL
71
4
0
07 Nov 2024
A Comparative Study of Deep Reinforcement Learning for Crop Production Management
Joseph Balderas
Dong Chen
Yanbo Huang
Li Wang
Ren-Cang Li
OffRL
36
0
0
06 Nov 2024
From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning
Zhirui Deng
Zhicheng Dou
Yinlin Zhu
Ji-Rong Wen
Ruibin Xiong
Mang Wang
Xin Wu
54
7
0
06 Nov 2024
LogiCity: Advancing Neuro-Symbolic AI with Abstract Urban Simulation
Bowen Li
Zhaoyu Li
Qiwei Du
Jinqi Luo
Wenshan Wang
...
Katia Sycara
Pradeep Kumar Ravikumar
Alexander G. Gray
X. Si
Sebastian A. Scherer
AI4CE
LRM
91
3
0
01 Nov 2024
Rethinking Inverse Reinforcement Learning: from Data Alignment to Task Alignment
Weichao Zhou
Wenchao Li
53
0
0
31 Oct 2024
Stepping Out of the Shadows: Reinforcement Learning in Shadow Mode
Philipp Gassert
Matthias Althoff
47
0
0
30 Oct 2024
PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference
Kendong Liu
Zhiyu Zhu
Chuanhao Li
Hui Liu
H. Zeng
Junhui Hou
EGVM
51
2
0
29 Oct 2024