Proximal Policy Optimization Algorithms

20 July 2017
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
    OffRL

Papers citing "Proximal Policy Optimization Algorithms"

50 / 11,422 papers shown
Learning Robust, Agile, Natural Legged Locomotion Skills in the Wild
Yikai Wang
Zheyuan Jiang
Jianyu Chen
198
9
0
21 Apr 2023
Learning Semantic-Agnostic and Spatial-Aware Representation for Generalizable Visual-Audio Navigation
IEEE Robotics and Automation Letters (RA-L), 2023
Hongchen Wang
Yuxuan Wang
Fangwei Zhong
Min-Yu Wu
Jianwei Zhang
Yizhou Wang
Hao Dong
390
10
0
21 Apr 2023
DEIR: Efficient and Robust Exploration through Discriminative-Model-Based Episodic Intrinsic Rewards
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Shanchuan Wan
Yujin Tang
Yingtao Tian
Tomoyuki Kaneko
OffRL
137
7
0
21 Apr 2023
TempoRL: laser pulse temporal shape optimization with Deep Reinforcement Learning
F. Capuano
D. Peceli
Gabriele Tiboni
Raffaello Camoriano
Bedřich Rus
58
2
0
20 Apr 2023
Interpretability for Conditional Coordinated Behavior in Multi-Agent Reinforcement Learning
IEEE International Joint Conference on Neural Networks (IJCNN), 2023
Yoshinari Motokawa
T. Sugawara
AI4CE
110
3
0
20 Apr 2023
Inducing Stackelberg Equilibrium through Spatio-Temporal Sequential Decision-Making in Multi-Agent Reinforcement Learning
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Bin Zhang
Lijuan Li
Zhiwei Xu
Dapeng Li
Guoliang Fan
202
13
0
20 Apr 2023
Neurosymbolic Models for Computer Graphics
Daniel E. Ritchie
Paul Guerrero
R. K. Jones
Niloy J. Mitra
Adriana Schulz
Karl D. D. Willis
Jiajun Wu
3DV
215
37
0
20 Apr 2023
Aiding reinforcement learning for set point control
IFAC-PapersOnLine (IFAC-PapersOnLine), 2023
Ruoqing Zhang
Per Mattsson
T. Wigren
172
3
0
20 Apr 2023
Robust nonlinear set-point control with reinforcement learning
American Control Conference (ACC), 2023
Ruoqing Zhang
Per Mattsson
T. Wigren
OOD
132
2
0
20 Apr 2023
Observer-Feedback-Feedforward Controller Structures in Reinforcement Learning
IFAC-PapersOnLine (IFAC-PapersOnLine), 2023
Ruoqing Zhang
Per Mattsson
T. Wigren
155
1
0
20 Apr 2023
SocialLight: Distributed Cooperation Learning towards Network-Wide Traffic Signal Control
Adaptive Agents and Multi-Agent Systems (AAMAS), 2023
Harsh Goel
Yifeng Zhang
Mehul Damani
Guillaume Sartoretti
137
11
0
20 Apr 2023
Mastering Asymmetrical Multiplayer Game with Multi-Agent Asymmetric-Evolution Reinforcement Learning
Chenglu Sun
Yi-cui Zhang
Yu Zhang
Ziling Lu
Jingbin Liu
Si-Qi Xu
Weidong Zhang
119
0
0
20 Apr 2023
Topological Guided Actor-Critic Modular Learning of Continuous Systems with Temporal Objectives
Lening Li
Zhentian Qian
216
0
0
20 Apr 2023
Robust Route Planning with Distributional Reinforcement Learning in a Stochastic Road Network Environment
Xi Lin
Paul Szenher
John D. Martin
Brendan Englot
164
2
0
19 Apr 2023
Learning policies for resource allocation in business processes
Information Systems (Inf. Syst.), 2023
J. Middelhuis
R. Bianco
E. Scherzer
Z. A. Bukhsh
I. Adan
R. Dijkman
130
12
0
19 Apr 2023
Bridging RL Theory and Practice with the Effective Horizon
Neural Information Processing Systems (NeurIPS), 2023
Cassidy Laidlaw
Stuart J. Russell
Anca Dragan
OffRL
271
37
0
19 Apr 2023
Heterogeneous-Agent Reinforcement Learning
Yifan Zhong
J. Kuba
Xidong Feng
Siyi Hu
Jiaming Ji
Yaodong Yang
213
103
0
19 Apr 2023
H-TSP: Hierarchically Solving the Large-Scale Travelling Salesman Problem
AAAI Conference on Artificial Intelligence (AAAI), 2023
Xuanhao Pan
Yan Jin
Yuandong Ding
Ming Feng
Li Zhao
Lei Song
Jiang Bian
210
72
0
19 Apr 2023
Using Offline Data to Speed-up Reinforcement Learning in Procedurally Generated Environments
Neurocomputing (Neurocomputing), 2023
Alain Andres
Lukas Schafer
Esther Villar-Rodriguez
Stefano V. Albrecht
Javier Del Ser
OffRL
OnRL
179
7
0
18 Apr 2023
Cooperative Multi-Agent Reinforcement Learning for Inventory Management
Madhav Khirwar
Karthik S. Gurumoorthy
Ankit Jain
Shantala Manchenahally
154
5
0
18 Apr 2023
A study on a Q-Learning algorithm application to a manufacturing assembly problem
Journal of manufacturing systems (JMS), 2021
M. Neves
Miguel Vieira
Pedro Neto
OffRL
56
33
0
17 Apr 2023
Tool Learning with Foundation Models
ACM Computing Surveys (ACM Comput. Surv.), 2023
Yujia Qin
Shengding Hu
Yankai Lin
Weize Chen
Ning Ding
...
Cheng Yang
Tongshuang Wu
Heng Ji
Zhiyuan Liu
Maosong Sun
389
315
0
17 Apr 2023
Promises and Pitfalls of the Linearized Laplace in Bayesian Optimization
Agustinus Kristiadi
Alexander Immer
Runa Eschenhagen
Vincent Fortuin
BDL
UQCV
250
11
0
17 Apr 2023
Training Automated Defense Strategies Using Graph-based Cyber Attack Simulations
Jakob Nyberg
Pontus Johnson
AAML
141
4
0
17 Apr 2023
STAS: Spatial-Temporal Return Decomposition for Multi-agent Reinforcement Learning
Sirui Chen
Zhaowei Zhang
Yaodong Yang
Yali Du
247
6
0
15 Apr 2023
Learning To Optimize Quantum Neural Network Without Gradients
International Conference on Quantum Computing and Engineering (QCE), 2023
Ankit Kulshrestha
Xiaoyuan Liu
Hayato Ushijima-Mwesigwa
Ilya Safro
158
8
0
15 Apr 2023
Learning to Learn Group Alignment: A Self-Tuning Credo Framework with Multiagent Teams
David Radke
Kyle Tilbury
130
1
0
14 Apr 2023
OpenAssistant Conversations -- Democratizing Large Language Model Alignment
Neural Information Processing Systems (NeurIPS), 2023
Andreas Kopf
Yannic Kilcher
Dimitri von Rutte
Sotiris Anagnostidis
Zhi Rui Tam
...
Arnav Dantuluri
Andrew Maguire
Christoph Schuhmann
Huu Nguyen
A. Mattick
ALM
LM&MA
793
786
0
14 Apr 2023
Learning Perceptive Bipedal Locomotion over Irregular Terrain
B. V. Marum
M. Sabatelli
Hamidreza Kasaei
147
4
0
14 Apr 2023
RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment
Hanze Dong
Wei Xiong
Deepanshu Goyal
Yihan Zhang
Winnie Chow
Boyao Wang
Shizhe Diao
Jipeng Zhang
Kashun Shum
Tong Zhang
ALM
463
636
0
13 Apr 2023
Language Instructed Reinforcement Learning for Human-AI Coordination
International Conference on Machine Learning (ICML), 2023
Hengyuan Hu
Dorsa Sadigh
LM&Ro
270
83
0
13 Apr 2023
ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation
Neural Information Processing Systems (NeurIPS), 2023
Jiazheng Xu
Xiao Liu
Yuchen Wu
Yuxuan Tong
Qinkai Li
Ming Ding
Jie Tang
Yuxiao Dong
559
736
0
12 Apr 2023
Learning to Communicate and Collaborate in a Competitive Multi-Agent Setup to Clean the Ocean from Macroplastics
P. D. Siedler
AI4CE
145
0
0
12 Apr 2023
Facilitating Sim-to-real by Intrinsic Stochasticity of Real-Time Simulation in Reinforcement Learning for Robot Manipulation
IEEE Transactions on Artificial Intelligence (IEEE TAI), 2023
Ram Dershan
Amir M. Soufi Enayati
Zengjie Zhang
D. Richert
Homayoun Najjaran
222
4
0
12 Apr 2023
Multi-agent Policy Reciprocity with Theoretical Guarantee
Haozhi Wang
Yinchuan Li
Qing Wang
Yunfeng Shao
Jianye Hao
198
1
0
12 Apr 2023
Sample-Efficient Reinforcement Learning with Symmetry-Guided Demonstrations for Robotic Manipulation
Amir M. Soufi Enayati
Zengjie Zhang
Kashish Gupta
Homayoun Najjaran
OffRL
191
0
0
12 Apr 2023
Frontier Semantic Exploration for Visual Target Navigation
IEEE International Conference on Robotics and Automation (ICRA), 2023
Bangguo Yu
Hamidreza Kasaei
M. Cao
277
25
0
11 Apr 2023
L3MVN: Leveraging Large Language Models for Visual Target Navigation
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023
Bangguo Yu
Hamidreza Kasaei
M. Cao
LM&Ro
267
166
0
11 Apr 2023
RRHF: Rank Responses to Align Language Models with Human Feedback without tears
Neural Information Processing Systems (NeurIPS), 2023
Zheng Yuan
Hongyi Yuan
Chuanqi Tan
Wei Wang
Songfang Huang
Feiran Huang
ALM
427
481
0
11 Apr 2023
Feudal Graph Reinforcement Learning
Tommaso Marzi
Arshjot Khehra
Andrea Cini
Cesare Alippi
423
2
0
11 Apr 2023
Optimal Interpretability-Performance Trade-off of Classification Trees with Black-Box Reinforcement Learning
Hector Kohler
R. Akrour
Philippe Preux
OffRL
190
0
0
11 Apr 2023
Reinforcement Learning Tutor Better Supported Lower Performers in a Math Task
Machine-mediated learning (ML), 2023
S. Ruan
Allen Nie
William Steenbergen
Jiayu He
JQ Zhang
...
Kyle Dang Nguyen
Catherine Y Wang
Rui Ying
James A. Landay
Emma Brunskill
184
29
0
11 Apr 2023
Real-Time Model-Free Deep Reinforcement Learning for Force Control of a Series Elastic Actuator
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023
Ruturaj Sambhus
Aydin Gokce
Stephen Welch
Connor W. Herron
Alexander Leonessa
138
1
0
11 Apr 2023
Learning a Universal Human Prior for Dexterous Manipulation from Human Preference
Zihan Ding
Yuanpei Chen
Allen Z. Ren
S. Gu
Qianxu Wang
Hao Dong
Chi Jin
220
10
0
10 Apr 2023
OpenAGI: When LLM Meets Domain Experts
Neural Information Processing Systems (NeurIPS), 2023
Yingqiang Ge
Qingfeng Lan
Kai Mei
Jianchao Ji
Juntao Tan
Shuyuan Xu
Zelong Li
VLM
LRM
317
308
0
10 Apr 2023
Eagle: End-to-end Deep Reinforcement Learning based Autonomous Control of PTZ Cameras
International Conference on Internet-of-Things Design and Implementation (IoTDI), 2023
S. Sandha
Bharathan Balaji
L. Garcia
Mani B. Srivastava
119
8
0
10 Apr 2023
Evolving Reinforcement Learning Environment to Minimize Learner's Achievable Reward: An Application on Hardening Active Directory Systems
Annual Conference on Genetic and Evolutionary Computation (GECCO), 2023
Diksha Goel
Aneta Neumann
Frank Neumann
Hung Nguyen
Mingyu Guo
AAML
129
11
0
08 Apr 2023
Stochastic Nonlinear Control via Finite-dimensional Spectral Dynamic Embedding
IEEE Conference on Decision and Control (CDC), 2023
Zhaolin Ren
Tongzheng Ren
Haitong Ma
Na Li
Bo Dai
315
12
0
08 Apr 2023
Should ChatGPT be Biased? Challenges and Risks of Bias in Large Language Models
First Monday (FM), 2023
Emilio Ferrara
SILM
460
339
0
07 Apr 2023
A Policy for Early Sequence Classification
International Conference on Artificial Neural Networks (ICANN), 2023
Alexander Cao
J. Utke
Diego Klabjan
107
2
0
07 Apr 2023