Proximal Policy Optimization Algorithms

20 July 2017
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
    OffRL

Papers citing "Proximal Policy Optimization Algorithms"

50 / 11,401 papers shown
Title
Becoming self-instruct: introducing early stopping criteria for minimal instruct tuning
Waseem Alshikh
Manhal Daaboul
K. Goddard
Brock Imel
Kiran Kamble
Parikshit Kulkarni
M. Russak
ALM
35
15
0
05 Jul 2023
Dynamic Feature-based Deep Reinforcement Learning for Flow Control of Circular Cylinder with Sparse Surface Pressure Sensing
Journal of Fluid Mechanics (JFM), 2023
Qiulei Wang
Lei Yan
Gang Hu
Wenli Chen
Jean Rabault
B. R. Noack
AI4CE
212
43
0
05 Jul 2023
Hierarchical Planning and Policy Shaping Shared Autonomy for Articulated Robots
E. Yousefi
Mo Chen
I. Sharf
102
2
0
04 Jul 2023
Physics-based Motion Retargeting from Sparse Inputs
Proceedings of the ACM on Computer Graphics and Interactive Techniques (PACMCGIT), 2023
Daniele Reda
Jungdam Won
Yuting Ye
M. van de Panne
Alexander Winkler
VGen
157
18
0
04 Jul 2023
Emergent Resource Exchange and Tolerated Theft Behavior using Multi-Agent Reinforcement Learning
Artificial Life (AL), 2023
Jack Garbus
Jordan Pollack
104
0
0
04 Jul 2023
RaidEnv: Exploring New Challenges in Automated Content Balancing for Boss Raid Games
IEEE Transactions on Games (IEEE Trans. Games), 2023
Hyeonchang Jeon
In-Chang Baek
Cheong-mok Bae
Taehwa Park
Wonsang You
Taegwan Ha
Hoyun Jung
Jinha Noh
Seungwon Oh
Kyung-Joong Kim
238
15
0
04 Jul 2023
Theory of Mind as Intrinsic Motivation for Multi-Agent Reinforcement Learning
Ini Oguntola
Joseph Campbell
Simon Stepputtis
Katia Sycara
209
16
0
03 Jul 2023
BatGPT: A Bidirectional Autoregessive Talker from Generative Pre-trained Transformer
Z. Li
Shitou Zhang
Hai Zhao
Yifei Yang
Dongjie Yang
LM&MA
245
26
0
01 Jul 2023
Thompson sampling for improved exploration in GFlowNets
Jarrid Rector-Brooks
Kanika Madan
Moksh Jain
Maksym Korablyov
Cheng-Hao Liu
Sarath Chandar
Nikolay Malkin
Yoshua Bengio
179
32
0
30 Jun 2023
Design of Induction Machines using Reinforcement Learning
Yasmin SarcheshmehPour
Tommi Ryyppo
Victor Mukherjee
A. Jung
AI4CE
28
0
0
30 Jun 2023
Navigation of micro-robot swarms for targeted delivery using reinforcement learning
Akshatha Jagadish
M. Varma
169
0
0
30 Jun 2023
Preference Ranking Optimization for Human Alignment
AAAI Conference on Artificial Intelligence (AAAI), 2023
Feifan Song
Yu Bowen
Minghao Li
Haiyang Yu
Fei Huang
Yongbin Li
Houfeng Wang
ALM
224
330
0
30 Jun 2023
Landmark Guided Active Exploration with State-specific Balance Coefficient
Fei Cui
Jiaojiao Fang
Mengke Yang
Guizhong Liu
147
0
0
30 Jun 2023
Human-like Decision-making at Unsignalized Intersection using Social Value Orientation
Yan Tong
Licheng Wen
Pinlong Cai
Daocheng Fu
Song Mao
Yikang Li
206
2
0
30 Jun 2023
Decentralized Motor Skill Learning for Complex Robotic Systems
IEEE Robotics and Automation Letters (RA-L), 2023
Yanjiang Guo
Zheyuan Jiang
Yen-Jen Wang
Jingyue Gao
Jianyu Chen
113
9
0
30 Jun 2023
RObotic MAnipulation Network (ROMAN) -- Hybrid Hierarchical Learning for Solving Complex Sequential Tasks
Eleftherios Triantafyllidis
Fernando Acero
Zhaocheng Liu
Zhibin Li
359
0
0
30 Jun 2023
Probabilistic Constraint for Safety-Critical Reinforcement Learning
IEEE Transactions on Automatic Control (TAC), 2023
Weiqin Chen
D. Subramanian
Santiago Paternain
247
21
0
29 Jun 2023
Learning Environment Models with Continuous Stochastic Dynamics
Martin Tappler
Edi Muškardin
B. Aichernig
Bettina Könighofer
AI4CE
151
1
0
29 Jun 2023
Traceable Group-Wise Self-Optimizing Feature Transformation Learning: A Dual Optimization Perspective
ACM Transactions on Knowledge Discovery from Data (TKDD), 2023
Meng Xiao
Dongjie Wang
Min-Ying Wu
Kunpeng Liu
Hui Xiong
Yuanchun Zhou
Yanjie Fu
148
30
0
29 Jun 2023
Policy Space Diversity for Non-Transitive Games
Neural Information Processing Systems (NeurIPS), 2023
Jian Yao
Weiming Liu
Haobo Fu
Yaodong Yang
Alexander Shmakov
Qiang Fu
Wei Yang
285
19
0
29 Jun 2023
Principles and Guidelines for Evaluating Social Robot Navigation Algorithms
Anthony G. Francis
Claudia Pérez-D'Arpino
Chengshu Li
Fei Xia
Alexandre Alahi
...
Xuesu Xiao
Peng Xu
Naoki Yokoyama
Alexander Toshev
Roberto Martin-Martin
333
132
0
29 Jun 2023
SRL: Scaling Distributed Reinforcement Learning to Over Ten Thousand Cores
International Conference on Learning Representations (ICLR), 2023
Zhiyu Mei
Wei Fu
Jiaxuan Gao
Guang Wang
Huanchen Zhang
Yi Wu
OffRL, LRM
368
8
0
29 Jun 2023
RL4CO: an Extensive Reinforcement Learning for Combinatorial Optimization Benchmark
Federico Berto
Chuanbo Hua
Minjun Kim
Laurin Luttmann
Yining Ma
...
Guojie Song
Changhyun Kwon
Kevin Tierney
Lin Xie
Jinkyoo Park
OffRL
469
63
0
29 Jun 2023
SARC: Soft Actor Retrospective Critic
Sukriti Verma
Ayush Chopra
J. Subramanian
Mausoom Sarkar
Nikaash Puri
Piyush B. Gupta
Balaji Krishnamurthy
142
0
0
28 Jun 2023
Learning Continuous Control with Geometric Regularity from Robot Intrinsic Symmetry
IEEE International Conference on Robotics and Automation (ICRA), 2023
Shengchao Yan
Baohe Zhang
Yuan Zhang
Joschka Boedecker
Wolfram Burgard
288
5
0
28 Jun 2023
Towards a Better Understanding of Learning with Multiagent Teams
International Joint Conference on Artificial Intelligence (IJCAI), 2023
David Radke
Kate Larson
Timothy B. Brecht
Kyle Tilbury
LLMAG
197
3
0
28 Jun 2023
Query Understanding in the Age of Large Language Models
Avishek Anand
Venktesh V
Abhijit Anand
Vinay Setty
LRM
255
10
0
28 Jun 2023
Action and Trajectory Planning for Urban Autonomous Driving with Hierarchical Reinforcement Learning
Xinyang Lu
Flint Xiaofeng Fan
Tianying Wang
147
11
0
28 Jun 2023
RL$^3$: Boosting Meta Reinforcement Learning via RL inside RL$^2$
Abhinav Bhatia
Samer B. Nashed
S. Zilberstein
OffRL
351
0
0
28 Jun 2023
Diversity is Strength: Mastering Football Full Game with Interactive Reinforcement Learning of Multiple AIs
Chenglu Sun
Shuo Shen
Sijia Xu
Weidong Zhang
123
1
0
28 Jun 2023
A Population-Level Analysis of Neural Dynamics in Robust Legged Robots
Eugene R. Rush
Christoffer Heckman
Kaushik Jayaram
J. Humbert
164
0
0
27 Jun 2023
Rethinking Closed-loop Training for Autonomous Driving
European Conference on Computer Vision (ECCV), 2023
Chris Zhang
R. Guo
Wenyuan Zeng
Yuwen Xiong
Binbin Dai
Rui Hu
Mengye Ren
R. Urtasun
OffRL
266
36
0
27 Jun 2023
IIFL: Implicit Interactive Fleet Learning from Heterogeneous Human Supervisors
Conference on Robot Learning (CoRL), 2023
Gaurav Datta
Ryan Hoque
Anrui Gu
Eugen Solowjow
Ken Goldberg
206
5
0
27 Jun 2023
Learning non-Markovian Decision-Making from State-only Sequences
Neural Information Processing Systems (NeurIPS), 2023
Aoyang Qin
Feng Gao
Qing Li
Song-Chun Zhu
Sirui Xie
249
12
0
27 Jun 2023
RVT: Robotic View Transformer for 3D Object Manipulation
Conference on Robot Learning (CoRL), 2023
Ankit Goyal
Jie Xu
Yijie Guo
Valts Blukis
Yu-Wei Chao
Dieter Fox
LM&Ro
335
221
0
26 Jun 2023
Supervised Pretraining Can Learn In-Context Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
Jonathan Lee
Annie Xie
Aldo Pacchiano
Yash Chandak
Chelsea Finn
Ofir Nachum
Emma Brunskill
OffRL
308
118
0
26 Jun 2023
ANYmal Parkour: Learning Agile Navigation for Quadrupedal Robots
Science Robotics (Sci. Robot.), 2023
David Hoeller
Nikita Rudin
Dhionis V. Sako
Marco Hutter
253
271
0
26 Jun 2023
ChiPFormer: Transferable Chip Placement via Offline Decision Transformer
International Conference on Machine Learning (ICML), 2023
Yao Lai
Jinxin Liu
Zhentao Tang
Sijin Yu
Jianye Hao
Ping Luo
OffRL
184
60
0
26 Jun 2023
Augmenting Control over Exploration Space in Molecular Dynamics Simulators to Streamline De Novo Analysis through Generative Control Policies
Paloma Gonzalez-Rojas
Andrew Emmel
L. Martínez
Neil Malur
G. Rutledge
AI4CE
159
0
0
26 Jun 2023
Estimating player completion rate in mobile puzzle games using reinforcement learning
J. Kristensen
Arturo Valdivia
Paolo Burelli
109
15
0
26 Jun 2023
A Framework for dynamically meeting performance objectives on a service mesh
IEEE Transactions on Network and Service Management (TNSM), 2023
Forough Shahab Samani
Rolf Stadler
142
3
0
25 Jun 2023
Provably Convergent Policy Optimization via Metric-aware Trust Region Methods
Jun Song
Niao He
Lijun Ding
Chaoyue Zhao
211
4
0
25 Jun 2023
Safety-Critical Scenario Generation Via Reinforcement Learning Based Editing
IEEE International Conference on Robotics and Automation (ICRA), 2023
Haolan Liu
Liangjun Zhang
S. Hari
Jishen Zhao
300
15
0
25 Jun 2023
Towards Optimal Pricing of Demand Response -- A Nonparametric Constrained Policy Optimization Approach
IEEE Power & Energy Society General Meeting (PESGM), 2023
Jun Song
Chaoyue Zhao
OffRL
64
1
0
24 Jun 2023
Action Q-Transformer: Visual Explanation in Deep Reinforcement Learning with Encoder-Decoder Model using Action Query
Hidenori Itaya
Tsubasa Hirakawa
Takayoshi Yamashita
H. Fujiyoshi
K. Sugiura
OffRL
129
1
0
24 Jun 2023
Minigrid & Miniworld: Modular & Customizable Reinforcement Learning Environments for Goal-Oriented Tasks
Neural Information Processing Systems (NeurIPS), 2023
Maxime Chevalier-Boisvert
Bolun Dai
Mark Towers
Rodrigo de Lazcano
Lucas Willems
Salem Lahlou
Suman Pal
Pablo Samuel Castro
Jordan Terry
VGen
342
304
0
24 Jun 2023
Maintaining Plasticity in Deep Continual Learning
Shibhansh Dohare
J. F. Hernandez-Garcia
Parash Rahman
A. Rupam Mahmood
Richard S. Sutton
KELM, CLL
377
36
0
23 Jun 2023
Task-Driven Graph Attention for Hierarchical Relational Object Navigation
IEEE International Conference on Robotics and Automation (ICRA), 2023
Michael Lingelbach
Chengshu Li
Minjune Hwang
Andrey Kurenkov
Alan Lou
Roberto Martín-Martín
Ruohan Zhang
Li Fei-Fei
Jiajun Wu
224
10
0
23 Jun 2023
Creating Valid Adversarial Examples of Malware
Journal of Computer Virology and Hacking Techniques (JCVHT), 2023
M. Kozák
M. Jureček
Mark Stamp
Fabio Di Troia
AAML
147
16
0
23 Jun 2023
Correcting discount-factor mismatch in on-policy policy gradient methods
International Conference on Machine Learning (ICML), 2023
Fengdi Che
Gautham Vasan
A. R. Mahmood
OffRL
118
9
0
23 Jun 2023