arXiv:1707.06347
Proximal Policy Optimization Algorithms

20 July 2017
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
    OffRL

Papers citing "Proximal Policy Optimization Algorithms"

50 / 11,421 papers shown
Low-Switching Policy Gradient with Exploration via Online Sensitivity Sampling
International Conference on Machine Learning (ICML), 2023
Yunfan Li
Yiran Wang
Y. Cheng
Lin F. Yang
OffRL
210
6
0
15 Jun 2023
Hierarchical Planning and Control for Box Loco-Manipulation
Proceedings of the ACM on Computer Graphics and Interactive Techniques (PACMCGIT), 2023
Zhaoming Xie
Jo-Han Tseng
Sebastian Starke
M. van de Panne
Chenxi Liu
274
41
0
15 Jun 2023
Recurrent Action Transformer with Memory
A. Staroverov
A. Bessonov
Dmitry A. Yudin
A. Kovalev
Aleksandr I. Panov
OffRL
393
13
0
15 Jun 2023
Inroads into Autonomous Network Defence using Explained Reinforcement Learning
Myles Foley
Miaowei Wang
M. Zoe
Chris Hicks
V. Mavroudis
AAML
257
22
0
15 Jun 2023
Semantic HELM: A Human-Readable Memory for Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
Fabian Paischer
Thomas Adler
M. Hofmarcher
Sepp Hochreiter
300
18
0
15 Jun 2023
Matching Pairs: Attributing Fine-Tuned Models to their Pre-Trained Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Myles Foley
Ambrish Rawat
Taesung Lee
Yufang Hou
Gabriele Picco
Giulio Zizzo
DeLMO
358
6
0
15 Jun 2023
Datasets and Benchmarks for Offline Safe Reinforcement Learning
Zuxin Liu
Zijian Guo
Haohong Lin
Yi-Fan Yao
Jiacheng Zhu
...
Hanjiang Hu
Wenhao Yu
Tingnan Zhang
Jie Tan
Ding Zhao
OffRL
303
54
0
15 Jun 2023
Generalizable Resource Scaling of 5G Slices using Constrained Reinforcement Learning
IEEE/IFIP Network Operations and Management Symposium (NOMS), 2023
Muhammad Sulaiman
Mahdieh Ahmadi
M. A. Salahuddin
R. Boutaba
A. Saleh
156
10
0
15 Jun 2023
Optimal Exploration for Model-Based RL in Nonlinear Systems
Neural Information Processing Systems (NeurIPS), 2023
Andrew Wagenmaker
Guanya Shi
Kevin Jamieson
257
22
0
15 Jun 2023
Predictive Maneuver Planning with Deep Reinforcement Learning (PMP-DRL) for comfortable and safe autonomous driving
Jayabrata Chowdhury
Vishruth Veerendranath
Suresh Sundaram
N. Sundararajan
117
0
0
15 Jun 2023
Towards Benchmarking and Improving the Temporal Reasoning Capability of Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Qingyu Tan
Hwee Tou Ng
Lidong Bing
LRM
346
45
0
15 Jun 2023
ArchGym: An Open-Source Gymnasium for Machine Learning Assisted Architecture Design
International Symposium on Computer Architecture (ISCA), 2023
Srivatsan Krishnan
Amir Yazdanbaksh
Yasmine Omri
Jason J. Jabbour
Ikechukwu Uchendu
...
Behzad Boroujerdian
Daniel Richins
Devashree Tripathy
Aleksandra Faust
Vijay Janapa Reddi
242
17
0
15 Jun 2023
Deep Generative Models for Decision-Making and Control
Michael Janner
292
3
0
15 Jun 2023
DiAReL: Reinforcement Learning with Disturbance Awareness for Robust Sim2Real Policy Transfer in Robot Control
M. Malmir
Josip Josifovski
Noah Klarmann
Alois C. Knoll
349
4
0
15 Jun 2023
Integrating machine learning paradigms and mixed-integer model predictive control for irrigation scheduling
B. T. Agyeman
Mohamed Naouri
W. Appels
Jinfeng Liu
Sirish L. Shah
98
10
0
14 Jun 2023
OCAtari: Object-Centric Atari 2600 Reinforcement Learning Environments
Quentin Delfosse
Johannes Czech
Bjarne Gregori
Sebastian Sztwiertnia
Kristian Kersting
498
25
0
14 Jun 2023
Towards AGI in Computer Vision: Lessons Learned from GPT and Large Language Models
Lingxi Xie
Longhui Wei
Xiaopeng Zhang
Kaifeng Bi
Xiaotao Gu
Jianlong Chang
Qi Tian
254
9
0
14 Jun 2023
Hierarchical Task Network Planning for Facilitating Cooperative Multi-Agent Reinforcement Learning
Xuechen Mu
H. Zhuo
Chong Chen
Kai Zhang
Chao Yu
Jianye Hao
226
2
0
14 Jun 2023
A reinforcement learning strategy for p-adaptation in high order solvers
Results in Engineering (RE), 2023
D. Huergo
G. Rubio
E. Ferrer
AI4CE
120
7
0
14 Jun 2023
MiniLLM: Knowledge Distillation of Large Language Models
International Conference on Learning Representations (ICLR), 2023
Yuxian Gu
Li Dong
Furu Wei
Shiyu Huang
ALM
640
94
0
14 Jun 2023
Multi-market Energy Optimization with Renewables via Reinforcement Learning
Lucien Werner
Peeyush Kumar
79
7
0
13 Jun 2023
AutoML in the Age of Large Language Models: Current Challenges, Future Opportunities and Risks
Alexander Tornede
Difan Deng
Theresa Eimer
Joseph Giovanelli
Aditya Mohan
...
Sarah Segel
Daphne Theodorakopoulos
Tanja Tornede
Henning Wachsmuth
Marius Lindauer
325
36
0
13 Jun 2023
Can ChatGPT Enable ITS? The Case of Mixed Traffic Control via Reinforcement Learning
Michael Villarreal
Bibek Poudel
Weizi Li
212
34
0
13 Jun 2023
Stepsize Learning for Policy Gradient Methods in Contextual Markov Decision Processes
Luca Sabbioni
Francesco Corda
Marcello Restelli
183
0
0
13 Jun 2023
Multi-Robot Motion Planning: A Learning-Based Artificial Potential Field Solution
Dengyu Zhang
Guo-Niu Zhu
Qingrui Zhang
203
2
0
13 Jun 2023
SayTap: Language to Quadrupedal Locomotion
Conference on Robot Learning (CoRL), 2023
Yujin Tang
Wenhao Yu
Jie Tan
Heiga Zen
Aleksandra Faust
Tatsuya Harada
304
59
0
13 Jun 2023
DenseLight: Efficient Control for Large-scale Traffic Signals with Dense Feedback
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Junfan Lin
Yuying Zhu
Lingbo Liu
Yang Liu
Guanbin Li
144
14
0
13 Jun 2023
Galactic: Scaling End-to-End Reinforcement Learning for Rearrangement at 100k Steps-Per-Second
Computer Vision and Pattern Recognition (CVPR), 2023
Vincent-Pierre Berges
Andrew Szot
Devendra Singh Chaplot
Aaron Gokaslan
Roozbeh Mottaghi
Dhruv Batra
Eric Undersander
LRM
LM&Ro
246
7
0
13 Jun 2023
Unified Off-Policy Learning to Rank: a Reinforcement Learning Perspective
Neural Information Processing Systems (NeurIPS), 2023
Zeyu Zhang
Yi-Hsun Su
Hui Yuan
Yiran Wu
R. Balasubramanian
Qingyun Wu
Huazheng Wang
Mengdi Wang
OffRL
CML
382
7
0
13 Jun 2023
Robust Reinforcement Learning through Efficient Adversarial Herding
Juncheng Dong
Hao-Lun Hsu
Qitong Gao
Vahid Tarokh
Miroslav Pajic
165
4
0
12 Jun 2023
Online Prototype Alignment for Few-shot Policy Transfer
International Conference on Machine Learning (ICML), 2023
Qi Yi
Rui Zhang
Shaohui Peng
Jiaming Guo
Yunkai Gao
...
Xingui Hu
Zidong Du
Xishan Zhang
Qi Guo
Yunji Chen
OffRL
233
5
0
12 Jun 2023
Multi-Agent Reinforcement Learning Guided by Signal Temporal Logic Specifications
Jiangwei Wang
Shuo Yang
Ziyan An
Songyang Han
Zhili Zhang
Rahul Mangharam
Meiyi Ma
Fei Miao
262
11
0
11 Jun 2023
Zero-Shot Wireless Indoor Navigation through Physics-Informed Reinforcement Learning
IEEE Open Journal of the Communications Society (JOCS), 2023
Mingsheng Yin
Tao Li
Haozhe Lei
Yaqi Hu
S. Rangan
Quanyan Zhu
249
4
0
11 Jun 2023
CoTran: An LLM-based Code Translator using Reinforcement Learning with Feedback from Compiler and Symbolic Execution
European Conference on Artificial Intelligence (ECAI), 2023
Prithwish Jana
Piyush Jha
Haoyang Ju
Gautham Kishore
Aryan Mahajan
Vijay Ganesh
363
32
0
11 Jun 2023
Reinforcement Learning with Parameterized Manipulation Primitives for Robotic Assembly
N. Vuong
Quang Pham
158
2
0
11 Jun 2023
Contact Reduction with Bounded Stiffness for Robust Sim-to-Real Transfer of Robot Assembly
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023
N. Vuong
Quang Pham
148
4
0
11 Jun 2023
A Single-Loop Deep Actor-Critic Algorithm for Constrained Reinforcement Learning with Provable Convergence
Kexuan Wang
An Liu
Baishuo Liu
166
1
0
10 Jun 2023
Long-term Microscopic Traffic Simulation with History-Masked Multi-agent Imitation Learning
Ke Guo
Wei Jing
Lingping Gao
Weiwei Liu
Weizi Li
Jia Pan
AI4CE
143
5
0
10 Jun 2023
How to Learn and Generalize From Three Minutes of Data: Physics-Constrained and Uncertainty-Aware Neural Stochastic Differential Equations
Conference on Robot Learning (CoRL), 2023
Franck Djeumou
Cyrus Neary
Ufuk Topcu
DiffM
275
14
0
10 Jun 2023
iPLAN: Intent-Aware Planning in Heterogeneous Traffic via Distributed Multi-Agent Reinforcement Learning
Xiyang Wu
Rohan Chandra
Tianrui Guan
Amrit Singh Bedi
Tianyi Zhou
364
5
0
09 Jun 2023
Combining a Meta-Policy and Monte-Carlo Planning for Scalable Type-Based Reasoning in Partially Observable Environments
Jonathon Schwartz
H. Kurniawati
Marcus Hutter
OffRL
LRM
171
1
0
09 Jun 2023
Approximate information state based convergence analysis of recurrent Q-learning
Erfan Seyedsalehi
N. Akbarzadeh
Amit Sinha
Aditya Mahajan
188
6
0
09 Jun 2023
Bring Your Own (Non-Robust) Algorithm to Solve Robust MDPs by Estimating The Worst Kernel
International Conference on Machine Learning (ICML), 2023
Kaixin Wang
Uri Gadot
Navdeep Kumar
Kfir Y. Levy
Shie Mannor
297
8
0
09 Jun 2023
An End-to-End Reinforcement Learning Approach for Job-Shop Scheduling Problems Based on Constraint Programming
International Conference on Automated Planning and Scheduling (ICAPS), 2023
Pierre Tassel
Martin Gebser
Konstantin Schekotihin
103
26
0
09 Jun 2023
Finite-Time Analysis of Minimax Q-Learning for Two-Player Zero-Sum Markov Games: Switching System Approach
Dong-hwan Lee
248
3
0
09 Jun 2023
QuestEnvSim: Environment-Aware Simulated Motion Tracking from Sparse Sensors
International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH), 2023
Sunmin Lee
Sebastian Starke
Yuting Ye
Jungdam Won
Alexander Winkler
209
44
0
09 Jun 2023
Robustness Testing for Multi-Agent Reinforcement Learning: State Perturbations on Critical Agents
European Conference on Artificial Intelligence (ECAI), 2023
Ziyuan Zhou
Guanjun Liu
AAML
160
15
0
09 Jun 2023
A newborn embodied Turing test for view-invariant object recognition
Annual Meeting of the Cognitive Science Society (CogSci), 2023
Denizhan Pak
Donsuk Lee
Samantha M. W. Wood
Justin N. Wood
LM&Ro
142
8
0
08 Jun 2023
ChatGPT is fun, but it is not funny! Humor is still challenging Large Language Models
Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (WASSA), 2023
Sophie F. Jentzsch
Kristian Kersting
LRM
154
47
0
07 Jun 2023
Long-form analogies generated by chatGPT lack human-like psycholinguistic properties
Annual Meeting of the Cognitive Science Society (CogSci), 2023
S. M. Seals
V. Shalin
169
18
0
07 Jun 2023