Proximal Policy Optimization Algorithms
20 July 2017
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
    OffRL

Papers citing "Proximal Policy Optimization Algorithms"

50 / 11,421 papers shown
Efficient Learning of Urban Driving Policies Using Bird's-Eye-View State Representations
Raphael Trumpp
M. Büchner
Abhinav Valada
Marco Caccamo
181
6
0
31 May 2023
Dynamic Neighborhood Construction for Structured Large Discrete Action Spaces
International Conference on Learning Representations (ICLR), 2023
F. Akkerman
Julius Luy
W. V. Heeswijk
Maximilian Schiffer
332
3
0
31 May 2023
Adaptive and Explainable Deployment of Navigation Skills via Hierarchical Deep Reinforcement Learning
IEEE International Conference on Robotics and Automation (ICRA), 2023
Kyowoon Lee
Seongun Kim
Jaesik Choi
224
19
0
31 May 2023
Symmetry-Aware Robot Design with Structured Subgroups
International Conference on Machine Learning (ICML), 2023
Heng Dong
Junyu Zhang
Tonghan Wang
Chongjie Zhang
183
18
0
31 May 2023
On the Linear Convergence of Policy Gradient under Hadamard Parameterization
Information and Inference: A Journal of the IMA (JIII), 2023
Jiacai Liu
Jinchi Chen
Ke Wei
213
4
0
31 May 2023
Representation-Driven Reinforcement Learning
International Conference on Machine Learning (ICML), 2023
Ofir Nabati
Guy Tennenholtz
Shie Mannor
291
2
0
31 May 2023
NetHack is Hard to Hack
Neural Information Processing Systems (NeurIPS), 2023
Ulyana Piterbarg
Lerrel Pinto
Rob Fergus
259
9
0
30 May 2023
Subequivariant Graph Reinforcement Learning in 3D Environments
International Conference on Machine Learning (ICML), 2023
Runfa Chen
Jiaqi Han
Gang Hua
Wen-bing Huang
OffRL
156
13
0
30 May 2023
Improving the performance of Learned Controllers in Behavior Trees using Value Function Estimates at Switching Boundaries
IEEE Robotics and Automation Letters (RA-L), 2023
Mart Kartasev
Petter Ögren
81
3
0
30 May 2023
Policy Optimization for Continuous Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
Hanyang Zhao
Wenpin Tang
D. Yao
OffRL
351
30
0
30 May 2023
Perimeter Control Using Deep Reinforcement Learning: A Model-free Approach towards Homogeneous Flow Rate Optimization
Xiaocan Li
Ray Coden Mercurius
Ayal Taitler
Xiaoyu Wang
Mohammad Noaeen
Scott Sanner
Baher Abdulhai
26
0
0
29 May 2023
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Neural Information Processing Systems (NeurIPS), 2023
Rafael Rafailov
Archit Sharma
E. Mitchell
Stefano Ermon
Christopher D. Manning
Chelsea Finn
ALM
883
6,769
0
29 May 2023
RLAD: Reinforcement Learning from Pixels for Autonomous Driving in Urban Environments
IEEE Transactions on Automation Science and Engineering (IEEE TASE), 2023
Daniel Coelho
Miguel Oliveira
Vítor M. F. Santos
122
14
0
29 May 2023
DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm
International Conference on Machine Learning (ICML), 2023
Yunhao Tang
Tadashi Kozuno
Mark Rowland
Anna Harutyunyan
Rémi Munos
Bernardo Avila-Pires
Michal Valko
149
0
0
29 May 2023
Chatbots to ChatGPT in a Cybersecurity Space: Evolution, Vulnerabilities, Attacks, Challenges, and Future Recommendations
Attia Qammar
Hongmei Wang
Jianguo Ding
Abdenacer Naouri
M. Daneshmand
Huansheng Ning
SILM
167
25
0
29 May 2023
A Hybrid Framework of Reinforcement Learning and Convex Optimization for UAV-Based Autonomous Metaverse Data Collection
IEEE Network (IEEE Netw.), 2023
Peiyuan Si
Liangxin Qian
Jun Zhao
Kwok-Yan Lam
47
7
0
29 May 2023
Test-Time Adaptation with CLIP Reward for Zero-Shot Generalization in Vision-Language Models
International Conference on Learning Representations (ICLR), 2023
Shuai Zhao
Xiaohan Wang
Linchao Zhu
Yezhou Yang
VLM
314
39
0
29 May 2023
Continual Task Allocation in Meta-Policy Network via Sparse Prompting
International Conference on Machine Learning (ICML), 2023
Yijun Yang
Tianyi Zhou
Jing Jiang
Guodong Long
Yuhui Shi
CLL OffRL
306
13
0
29 May 2023
Toward Fine Contact Interactions: Learning to Control Normal Contact Force with Limited Information
IEEE International Conference on Robotics and Automation (ICRA), 2023
Jinda Cui
Jiawei Xu
David Saldaña
J. Trinkle
126
2
0
29 May 2023
RL + Model-based Control: Using On-demand Optimal Control to Learn Versatile Legged Locomotion
IEEE Robotics and Automation Letters (RA-L), 2023
Dong-oh Kang
Jin Cheng
Miguel Zamora
Fatemeh Zargarbashi
Stelian Coros
OffRL
268
51
0
29 May 2023
Interpretable Reward Redistribution in Reinforcement Learning: A Causal Approach
Neural Information Processing Systems (NeurIPS), 2023
Yudi Zhang
Yali Du
Erdun Gao
Ziyan Wang
Jun Wang
Meng Fang
Mykola Pechenizkiy
CML
263
27
0
28 May 2023
Evolving Connectivity for Recurrent Spiking Neural Networks
Neural Information Processing Systems (NeurIPS), 2023
Guan-Bo Wang
Yuhao Sun
Sijie Cheng
Sen Song
136
9
0
28 May 2023
On the Value of Myopic Behavior in Policy Reuse
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Kang Xu
Chenjia Bai
Delin Qu
Haoran He
Bin Zhao
Zhen Wang
Wei Li
Xuelong Li
182
2
0
28 May 2023
Online Nonstochastic Model-Free Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
Udaya Ghai
Arushi Gupta
Wenhan Xia
Karan Singh
Elad Hazan
OffRL
286
7
0
27 May 2023
Is Centralized Training with Decentralized Execution Framework Centralized Enough for MARL?
Yihe Zhou
Shunyu Liu
Yunpeng Qing
Kaixuan Chen
Tongya Zheng
Jie Song
Mingli Song
342
35
0
27 May 2023
Self-Supervised Reinforcement Learning that Transfers using Random Features
Neural Information Processing Systems (NeurIPS), 2023
Boyuan Chen
Chuning Zhu
Pulkit Agrawal
Jianchao Tan
Abhishek Gupta
OffRL SSL
247
12
0
26 May 2023
NASimEmu: Network Attack Simulator & Emulator for Training Agents Generalizing to Novel Scenarios
Jaromír Janisch
Tomáš Pevný
Viliam Lisý
192
21
0
26 May 2023
A Model-Based Solution to the Offline Multi-Agent Reinforcement Learning Coordination Problem
Adaptive Agents and Multi-Agent Systems (AAMAS), 2023
Paul Barde
Jakob N. Foerster
Derek Nowrouzezahrai
Amy Zhang
OffRL
231
17
0
26 May 2023
Inferring the Future by Imagining the Past
Neural Information Processing Systems (NeurIPS), 2023
Kartik Chandra
Tony Chen
Tzu-Mao Li
Jonathan Ragan-Kelley
J. Tenenbaum
223
4
0
26 May 2023
IndustReal: Transferring Contact-Rich Assembly Tasks from Simulation to Reality
Bingjie Tang
Michael A. Lin
Iretiayo Akinola
Ankur Handa
Gaurav Sukhatme
Fabio Ramos
Dieter Fox
Yashraj S. Narang
OffRL
234
80
0
26 May 2023
A Hierarchical Approach to Population Training for Human-AI Collaboration
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Yi Loo
Chen Gong
Malika Meghjani
211
9
0
26 May 2023
A Reminder of its Brittleness: Language Reward Shaping May Hinder Learning for Instruction Following Agents
Sukai Huang
Nir Lipovetzky
Trevor Cohn
266
2
0
26 May 2023
Physics-Regulated Deep Reinforcement Learning: Invariant Embeddings
International Conference on Learning Representations (ICLR), 2023
H. Cao
Y. Mao
L. Sha
Marco Caccamo
PINN AI4CE
233
9
0
26 May 2023
Emergent Agentic Transformer from Chain of Hindsight Experience
International Conference on Machine Learning (ICML), 2023
Hao Liu
Pieter Abbeel
OffRL
247
33
0
26 May 2023
Counterfactual Explainer Framework for Deep Reinforcement Learning Models Using Policy Distillation
Amir Samadi
K. Koufos
Kurt Debattista
M. Dianati
OffRL
251
3
0
25 May 2023
Coarse-Tuning Models of Code with Reinforcement Learning Feedback
Abhinav C. P. Jain
Chima Adiole
Swarat Chaudhuri
Thomas W. Reps
Chris Jermaine (Rice University)
ALM
150
3
0
25 May 2023
Learning When to Ask for Help: Efficient Interactive Navigation via Implicit Uncertainty Estimation
IEEE International Conference on Robotics and Automation (ICRA), 2023
Ifueko Igbinedion
S. Karaman
287
2
0
25 May 2023
Voyager: An Open-Ended Embodied Agent with Large Language Models
Guanzhi Wang
Yuqi Xie
Yunfan Jiang
Ajay Mandlekar
Chaowei Xiao
Yuke Zhu
Linxi Fan
Anima Anandkumar
LM&Ro SyDa
475
1,192
0
25 May 2023
DPOK: Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models
Neural Information Processing Systems (NeurIPS), 2023
Ying Fan
Olivia Watkins
Yuqing Du
Hao Liu
Moonkyung Ryu
Craig Boutilier
Pieter Abbeel
Mohammad Ghavamzadeh
Kangwook Lee
Kimin Lee
414
285
0
25 May 2023
Generating Synergistic Formulaic Alpha Collections via Reinforcement Learning
Knowledge Discovery and Data Mining (KDD), 2023
Shuo Yu
Hongyan Xue
Xiang Ao
Feiyang Pan
Jia He
Dandan Tu
Qing He
AIFin
224
27
0
25 May 2023
End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes
Neural Information Processing Systems (NeurIPS), 2023
A. Maraval
Matthieu Zimmer
Antoine Grosnit
H. Ammar
BDL
444
26
0
25 May 2023
All Points Matter: Entropy-Regularized Distribution Alignment for Weakly-supervised 3D Segmentation
Neural Information Processing Systems (NeurIPS), 2023
Liyao Tang
Zhe Chen
Shanshan Zhao
Chaoyue Wang
Dacheng Tao
264
19
0
25 May 2023
Lucy-SKG: Learning to Play Rocket League Efficiently Using Deep Reinforcement Learning
V. Moschopoulos
Pantelis Kyriakidis
A. Lazaridis
I. Vlahavas
77
1
0
25 May 2023
PROTO: Iterative Policy Regularized Offline-to-Online Reinforcement Learning
Jianxiong Li
Xiao Hu
Haoran Xu
Jingjing Liu
Xianyuan Zhan
Ya Zhang
OffRL OnRL
276
27
0
25 May 2023
Control invariant set enhanced safe reinforcement learning: improved sampling efficiency, guaranteed stability and robustness
Computers and Chemical Engineering (Comput. Chem. Eng.), 2023
Song Bo
B. T. Agyeman
Xunyuan Yin
Jinfeng Liu
OffRL
151
7
0
24 May 2023
Harnessing the Power of Large Language Models for Natural Language to First-Order Logic Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Yuan Yang
Siheng Xiong
Ali Payani
Ehsan Shareghi
Faramarz Fekri
LRM
152
76
0
24 May 2023
The Larger They Are, the Harder They Fail: Language Models do not Recognize Identifier Swaps in Python
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Antonio Valerio Miceli Barone
Fazl Barez
Ioannis Konstas
Shay B. Cohen
149
39
0
24 May 2023
Inverse Preference Learning: Preference-based RL without a Reward Function
Neural Information Processing Systems (NeurIPS), 2023
Joey Hejna
Dorsa Sadigh
OffRL
300
72
0
24 May 2023
Decision-Aware Actor-Critic with Function Approximation and Theoretical Guarantees
Neural Information Processing Systems (NeurIPS), 2023
Sharan Vaswani
A. Kazemi
Reza Babanezhad
Nicolas Le Roux
OffRL
322
5
0
24 May 2023
Neural Lyapunov and Optimal Control
Daniel Layeghi
Steve Tonneau
M. Mistry
218
0
0
24 May 2023