ResearchTrend.AI
Proximal Policy Optimization Algorithms

20 July 2017
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
    OffRL

Papers citing "Proximal Policy Optimization Algorithms"

50 / 11,422 papers shown
Multi-objective Deep Reinforcement Learning for Mobile Edge Computing
International Symposium on Modeling and Optimization in Mobile, Ad-Hoc and Wireless Networks (WiOpt), 2023
Ning Yang
Junrui Wen
Mengya Zhang
Mingqi Tang
179
14
0
05 Jul 2023
Citation: A Key to Building Responsible and Accountable Large Language Models
Jie Huang
Kevin Chen-Chuan Chang
HILM
326
28
0
05 Jul 2023
Generative Job Recommendations with Large Language Model
Zhi Zheng
Zhaopeng Qiu
Xiao Hu
Likang Wu
Hengshu Zhu
Hui Xiong
131
32
0
05 Jul 2023
Becoming self-instruct: introducing early stopping criteria for minimal instruct tuning
Waseem Alshikh
Manhal Daaboul
K. Goddard
Brock Imel
Kiran Kamble
Parikshit Kulkarni
M. Russak
ALM
36
15
0
05 Jul 2023
Dynamic Feature-based Deep Reinforcement Learning for Flow Control of Circular Cylinder with Sparse Surface Pressure Sensing
Journal of Fluid Mechanics (JFM), 2023
Qiulei Wang
Lei Yan
Gang Hu
Wenli Chen
Jean Rabault
B. R. Noack
AI4CE
240
45
0
05 Jul 2023
Hierarchical Planning and Policy Shaping Shared Autonomy for Articulated Robots
E. Yousefi
Mo Chen
I. Sharf
110
2
0
04 Jul 2023
Physics-based Motion Retargeting from Sparse Inputs
Proceedings of the ACM on Computer Graphics and Interactive Techniques (PACMCGIT), 2023
Daniele Reda
Jungdam Won
Yuting Ye
M. van de Panne
Alexander Winkler
VGen
162
18
0
04 Jul 2023
Emergent Resource Exchange and Tolerated Theft Behavior using Multi-Agent Reinforcement Learning
Artificial Life (AL), 2023
Jack Garbus
Jordan Pollack
130
0
0
04 Jul 2023
RaidEnv: Exploring New Challenges in Automated Content Balancing for Boss Raid Games
IEEE Transactions on Games (IEEE Trans. Games), 2023
Hyeonchang Jeon
In-Chang Baek
Cheong-mok Bae
Taehwa Park
Wonsang You
Taegwan Ha
Hoyun Jung
Jinha Noh
Seungwon Oh
Kyung-Joong Kim
247
15
0
04 Jul 2023
Theory of Mind as Intrinsic Motivation for Multi-Agent Reinforcement Learning
Ini Oguntola
Joseph Campbell
Simon Stepputtis
Katia Sycara
219
16
0
03 Jul 2023
BatGPT: A Bidirectional Autoregessive Talker from Generative Pre-trained Transformer
Z. Li
Shitou Zhang
Hai Zhao
Yifei Yang
Dongjie Yang
LM&MA
271
27
0
01 Jul 2023
Thompson sampling for improved exploration in GFlowNets
Jarrid Rector-Brooks
Kanika Madan
Moksh Jain
Maksym Korablyov
Cheng-Hao Liu
Sarath Chandar
Nikolay Malkin
Yoshua Bengio
214
32
0
30 Jun 2023
Design of Induction Machines using Reinforcement Learning
Yasmin SarcheshmehPour
Tommi Ryyppo
Victor Mukherjee
A. Jung
AI4CE
43
0
0
30 Jun 2023
Navigation of micro-robot swarms for targeted delivery using reinforcement learning
Akshatha Jagadish
M. Varma
175
0
0
30 Jun 2023
Preference Ranking Optimization for Human Alignment
AAAI Conference on Artificial Intelligence (AAAI), 2023
Feifan Song
Yu Bowen
Minghao Li
Haiyang Yu
Fei Huang
Yongbin Li
Houfeng Wang
ALM
254
334
0
30 Jun 2023
Landmark Guided Active Exploration with State-specific Balance Coefficient
Fei Cui
Jiaojiao Fang
Mengke Yang
Guizhong Liu
169
0
0
30 Jun 2023
Human-like Decision-making at Unsignalized Intersection using Social Value Orientation
Yan Tong
Licheng Wen
Pinlong Cai
Daocheng Fu
Song Mao
Yikang Li
219
2
0
30 Jun 2023
Decentralized Motor Skill Learning for Complex Robotic Systems
IEEE Robotics and Automation Letters (RA-L), 2023
Yanjiang Guo
Zheyuan Jiang
Yen-Jen Wang
Jingyue Gao
Jianyu Chen
122
9
0
30 Jun 2023
RObotic MAnipulation Network (ROMAN) -- Hybrid Hierarchical Learning for Solving Complex Sequential Tasks
Eleftherios Triantafyllidis
Fernando Acero
Zhaocheng Liu
Zhibin Li
414
0
0
30 Jun 2023
Probabilistic Constraint for Safety-Critical Reinforcement Learning
IEEE Transactions on Automatic Control (TAC), 2023
Weiqin Chen
D. Subramanian
Santiago Paternain
267
24
0
29 Jun 2023
Learning Environment Models with Continuous Stochastic Dynamics
Martin Tappler
Edi Muškardin
B. Aichernig
Bettina Könighofer
AI4CE
159
1
0
29 Jun 2023
Traceable Group-Wise Self-Optimizing Feature Transformation Learning: A Dual Optimization Perspective
ACM Transactions on Knowledge Discovery from Data (TKDD), 2023
Meng Xiao
Dongjie Wang
Min-Ying Wu
Kunpeng Liu
Hui Xiong
Yuanchun Zhou
Yanjie Fu
170
30
0
29 Jun 2023
Policy Space Diversity for Non-Transitive Games
Neural Information Processing Systems (NeurIPS), 2023
Jian Yao
Weiming Liu
Haobo Fu
Yaodong Yang
Alexander Shmakov
Qiang Fu
Wei Yang
294
20
0
29 Jun 2023
Principles and Guidelines for Evaluating Social Robot Navigation Algorithms
Anthony G. Francis
Claudia Pérez-D'Arpino
Chengshu Li
Fei Xia
Alexandre Alahi
...
Xuesu Xiao
Peng Xu
Naoki Yokoyama
Alexander Toshev
Roberto Martin-Martin
351
137
0
29 Jun 2023
SRL: Scaling Distributed Reinforcement Learning to Over Ten Thousand Cores
International Conference on Learning Representations (ICLR), 2023
Zhiyu Mei
Wei Fu
Jiaxuan Gao
Guang Wang
Huanchen Zhang
Yi Wu
OffRL
LRM
406
8
0
29 Jun 2023
RL4CO: an Extensive Reinforcement Learning for Combinatorial Optimization Benchmark
Federico Berto
Chuanbo Hua
Minjun Kim
Laurin Luttmann
Yining Ma
...
Guojie Song
Changhyun Kwon
Kevin Tierney
Lin Xie
Jinkyoo Park
OffRL
517
29
0
29 Jun 2023
SARC: Soft Actor Retrospective Critic
Sukriti Verma
Ayush Chopra
J. Subramanian
Mausoom Sarkar
Nikaash Puri
Piyush B. Gupta
Balaji Krishnamurthy
154
0
0
28 Jun 2023
Learning Continuous Control with Geometric Regularity from Robot Intrinsic Symmetry
IEEE International Conference on Robotics and Automation (ICRA), 2023
Shengchao Yan
Baohe Zhang
Yuan Zhang
Joschka Boedecker
Wolfram Burgard
297
6
0
28 Jun 2023
Towards a Better Understanding of Learning with Multiagent Teams
International Joint Conference on Artificial Intelligence (IJCAI), 2023
David Radke
Kate Larson
Timothy B. Brecht
Kyle Tilbury
LLMAG
221
3
0
28 Jun 2023
Query Understanding in the Age of Large Language Models
Avishek Anand
Venktesh V
Abhijit Anand
Vinay Setty
LRM
259
9
0
28 Jun 2023
Action and Trajectory Planning for Urban Autonomous Driving with Hierarchical Reinforcement Learning
Xinyang Lu
Flint Xiaofeng Fan
Tianying Wang
169
11
0
28 Jun 2023
RL$^3$: Boosting Meta Reinforcement Learning via RL inside RL$^2$
Abhinav Bhatia
Samer B. Nashed
S. Zilberstein
OffRL
370
0
0
28 Jun 2023
Diversity is Strength: Mastering Football Full Game with Interactive Reinforcement Learning of Multiple AIs
Chenglu Sun
Shuo Shen
Sijia Xu
Weidong Zhang
151
1
0
28 Jun 2023
A Population-Level Analysis of Neural Dynamics in Robust Legged Robots
Eugene R. Rush
Christoffer Heckman
Kaushik Jayaram
J. Humbert
174
0
0
27 Jun 2023
Rethinking Closed-loop Training for Autonomous Driving
European Conference on Computer Vision (ECCV), 2023
Chris Zhang
R. Guo
Wenyuan Zeng
Yuwen Xiong
Binbin Dai
Rui Hu
Mengye Ren
R. Urtasun
OffRL
280
36
0
27 Jun 2023
IIFL: Implicit Interactive Fleet Learning from Heterogeneous Human Supervisors
Conference on Robot Learning (CoRL), 2023
Gaurav Datta
Ryan Hoque
Anrui Gu
Eugen Solowjow
Ken Goldberg
234
5
0
27 Jun 2023
Learning non-Markovian Decision-Making from State-only Sequences
Neural Information Processing Systems (NeurIPS), 2023
Aoyang Qin
Feng Gao
Qing Li
Song-Chun Zhu
Sirui Xie
304
12
0
27 Jun 2023
RVT: Robotic View Transformer for 3D Object Manipulation
Conference on Robot Learning (CoRL), 2023
Ankit Goyal
Jie Xu
Yijie Guo
Valts Blukis
Yu-Wei Chao
Dieter Fox
LM&Ro
339
223
0
26 Jun 2023
Supervised Pretraining Can Learn In-Context Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
Jonathan Lee
Annie Xie
Aldo Pacchiano
Yash Chandak
Chelsea Finn
Ofir Nachum
Emma Brunskill
OffRL
319
120
0
26 Jun 2023
ANYmal Parkour: Learning Agile Navigation for Quadrupedal Robots
Science Robotics (Sci. Robot.), 2023
David Hoeller
Nikita Rudin
Dhionis V. Sako
Marco Hutter
281
285
0
26 Jun 2023
ChiPFormer: Transferable Chip Placement via Offline Decision Transformer
International Conference on Machine Learning (ICML), 2023
Yao Lai
Jinxin Liu
Zhentao Tang
Sijin Yu
Jianye Hao
Ping Luo
OffRL
192
60
0
26 Jun 2023
Augmenting Control over Exploration Space in Molecular Dynamics Simulators to Streamline De Novo Analysis through Generative Control Policies
Paloma Gonzalez-Rojas
Andrew Emmel
L. Martínez
Neil Malur
G. Rutledge
AI4CE
172
0
0
26 Jun 2023
Estimating player completion rate in mobile puzzle games using reinforcement learning
J. Kristensen
Arturo Valdivia
Paolo Burelli
110
15
0
26 Jun 2023
A Framework for dynamically meeting performance objectives on a service mesh
IEEE Transactions on Network and Service Management (TNSM), 2023
Forough Shahab Samani
Rolf Stadler
147
3
0
25 Jun 2023
Provably Convergent Policy Optimization via Metric-aware Trust Region Methods
Jun Song
Niao He
Lijun Ding
Chaoyue Zhao
220
4
0
25 Jun 2023
Safety-Critical Scenario Generation Via Reinforcement Learning Based Editing
IEEE International Conference on Robotics and Automation (ICRA), 2023
Haolan Liu
Liangjun Zhang
S. Hari
Jishen Zhao
317
16
0
25 Jun 2023
Towards Optimal Pricing of Demand Response -- A Nonparametric Constrained Policy Optimization Approach
IEEE Power & Energy Society General Meeting (PESGM), 2023
Jun Song
Chaoyue Zhao
OffRL
65
1
0
24 Jun 2023
Action Q-Transformer: Visual Explanation in Deep Reinforcement Learning with Encoder-Decoder Model using Action Query
Hidenori Itaya
Tsubasa Hirakawa
Takayoshi Yamashita
H. Fujiyoshi
K. Sugiura
OffRL
150
1
0
24 Jun 2023
Minigrid & Miniworld: Modular & Customizable Reinforcement Learning Environments for Goal-Oriented Tasks
Neural Information Processing Systems (NeurIPS), 2023
Maxime Chevalier-Boisvert
Bolun Dai
Mark Towers
Rodrigo de Lazcano
Lucas Willems
Salem Lahlou
Suman Pal
Pablo Samuel Castro
Jordan Terry
VGen
356
308
0
24 Jun 2023
Maintaining Plasticity in Deep Continual Learning
Shibhansh Dohare
J. F. Hernandez-Garcia
Parash Rahman
A. Rupam Mahmood
Richard S. Sutton
KELM
CLL
421
36
0
23 Jun 2023