Proximal Policy Optimization Algorithms

20 July 2017
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
    OffRL

Papers citing "Proximal Policy Optimization Algorithms"

50 / 11,422 papers shown
PMP: Learning to Physically Interact with Environments using Part-wise Motion Priors
International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH), 2023
Jinseok Bae
Jungdam Won
Donggeun Lim
Cheol-Hui Min
Y. Kim
173
46
0
05 May 2023
Causal Policy Gradient for Whole-Body Mobile Manipulation
Jiaheng Hu
Peter Stone
Roberto Martín-Martín
422
33
0
04 May 2023
Single Node Injection Label Specificity Attack on Graph Neural Networks via Reinforcement Learning
IEEE Transactions on Computational Social Systems (IEEE TCSS), 2023
Dayuan Chen
Jian Zhang
Yuqian Lv
Jinhuan Wang
Hongjie Ni
Shanqing Yu
Zhen Wang
Qi Xuan
AAML
206
6
0
04 May 2023
Simple Noisy Environment Augmentation for Reinforcement Learning
Raad Khraishi
Ramin Okhrati
OffRL
155
1
0
04 May 2023
Maximum Causal Entropy Inverse Constrained Reinforcement Learning
Machine-mediated learning (ML), 2023
Mattijs Baert
Pietro Mazzaglia
Sam Leroux
Pieter Simoens
CML
257
10
0
04 May 2023
Explainable Reinforcement Learning via a Causal World Model
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Zhongwei Yu
Jingqing Ruan
Dengpeng Xing
CML
436
25
0
04 May 2023
An Asynchronous Updating Reinforcement Learning Framework for Task-oriented Dialog System
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Bengisu Cagiltay
Bilge Mutlu
Xiaojie Wang
Caixia Yuan
OffRL
93
0
0
04 May 2023
Toward Evaluating Robustness of Reinforcement Learning with Adversarial Policy
Dependable Systems and Networks (DSN), 2023
Jiawei Zhao
Jiabo He
Florian Schäfer
Xinyu Wang
Anima Anandkumar
Cong Wang
AAML
314
5
0
04 May 2023
Learning Generalizable Pivoting Skills
IEEE International Conference on Robotics and Automation (ICRA), 2023
Xiang Zhang
Siddarth Jain
Baichuan Huang
Masayoshi Tomizuka
Diego Romeres
263
18
0
04 May 2023
Sim2Rec: A Simulator-based Decision-making Approach to Optimize Real-World Long-term User Engagement in Sequential Recommender Systems
IEEE International Conference on Data Engineering (ICDE), 2023
Xiong-Hui Chen
Bowei He
Yangze Yu
Qingyang Li
Zhiwei Qin
Wenjie Shang
Jieping Ye
Chen Ma
OffRL
202
14
0
03 May 2023
Gym-preCICE: Reinforcement Learning Environments for Active Flow Control
SoftwareX (SoftwareX), 2023
M. Shams
A. Elsheikh
AI4CE
155
10
0
03 May 2023
Enhancing Efficiency of Quadrupedal Locomotion over Challenging Terrains with Extensible Feet
IEEE International Conference on Systems, Man and Cybernetics (SMC), 2023
L. Kumar
Sarvesh Sortee
Titas Bera
Ranjan Dasgupta
104
0
0
03 May 2023
Mitigating Approximate Memorization in Language Models via Dissimilarity Learned Policy
Aly M. Kassem
109
2
0
02 May 2023
Get Back Here: Robust Imitation by Return-to-Distribution Planning
Geoffrey Cideron
B. Tabanpour
Sebastian Curi
Sertan Girgin
Léonard Hussenot
Gabriel Dulac-Arnold
Matthieu Geist
Olivier Pietquin
Robert Dadashi
OOD
261
3
0
02 May 2023
An Improved Yaw Control Algorithm for Wind Turbines via Reinforcement Learning
Alban Puech
Jesse Read
77
6
0
02 May 2023
CALM: Conditional Adversarial Latent Models for Directable Virtual Characters
International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH), 2023
Chen Tessler
Yoni Kasten
Yunrong Guo
Shie Mannor
Gal Chechik
Xue Bin Peng
VGen, LM&Ro
206
105
0
02 May 2023
Multi-Task Multi-Behavior MAP-Elites
Timothée Anne
Jean-Baptiste Mouret
MoE
152
8
0
02 May 2023
Early Classifying Multimodal Sequences
International Conference on Multimodal Interaction (ICMI), 2023
Alexander Cao
J. Utke
Diego Klabjan
144
0
0
02 May 2023
ArK: Augmented Reality with Knowledge Interactive Emergent Ability
Qiuyuan Huang
Jinho Park
Abhinav Gupta
Paul N. Bennett
Ran Gong
...
Baolin Peng
O. Mohammed
C. Pal
Yejin Choi
Jianfeng Gao
194
8
0
01 May 2023
Bridging the Gap: A Survey on Integrating (Human) Feedback for Natural Language Generation
Patrick Fernandes
Aman Madaan
Emmy Liu
António Farinhas
Pedro Henrique Martins
...
José G. C. de Souza
Shuyan Zhou
Tongshuang Wu
Graham Neubig
Marcely Zanon Boito
ALM
304
69
0
01 May 2023
Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition
International Conference on Machine Learning (ICML), 2023
Yash Chandak
S. Thakoor
Z. Guo
Yunhao Tang
Rémi Munos
Will Dabney
Diana Borsa
287
6
0
01 May 2023
BCEdge: SLO-Aware DNN Inference Services with Adaptive Batching on Edge Platforms
Ziyang Zhang
Huan Li
Yang Zhao
Changyao Lin
Jie Liu
162
5
0
01 May 2023
Learning Achievement Structure for Structured Exploration in Domains with Sparse Reward
International Conference on Learning Representations (ICLR), 2023
Zihan Zhou
Animesh Garg
OffRL
254
4
0
30 Apr 2023
Modality-invariant Visual Odometry for Embodied Vision
Computer Vision and Pattern Recognition (CVPR), 2023
Marius Memmel
Roman Bachmann
Amir Zamir
321
13
0
29 Apr 2023
A Coupled Flow Approach to Imitation Learning
International Conference on Machine Learning (ICML), 2023
G. Freund
Elad Sarafian
Sarit Kraus
OOD
187
15
0
29 Apr 2023
Semi-Infinitely Constrained Markov Decision Processes and Efficient Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
Liangyu Zhang
Yang Peng
Wenhao Yang
Zhihua Zhang
159
1
0
29 Apr 2023
X-RLflow: Graph Reinforcement Learning for Neural Network Subgraphs Transformation
Conference on Machine Learning and Systems (MLSys), 2023
Guoliang He
Sean Parker
Eiko Yoneki
169
6
0
28 Apr 2023
Learning adaptive manipulation of objects with revolute joint: A case study on varied cabinet doors opening
Cybersecurity and Cyberforensics Conference (CC), 2023
Hongxiang Yu
Dashun Guo
Zhongxiang Zhou
Yue Wang
R. Xiong
280
1
0
28 Apr 2023
Adversarial Policy Optimization in Deep Reinforcement Learning
Md Masudur Rahman
Yexiang Xue
AAML
112
0
0
27 Apr 2023
Learning Environment for the Air Domain (LEAD)
Online World Conference on Soft Computing in Industrial Applications (WSCIA), 2023
Andreas Strand
Patrick Ribu Gorton
M. Asprusten
K. Brathen
136
2
0
27 Apr 2023
Convergence of Adam Under Relaxed Assumptions
Neural Information Processing Systems (NeurIPS), 2023
Haochuan Li
Alexander Rakhlin
Ali Jadbabaie
403
92
0
27 Apr 2023
CROP: Towards Distributional-Shift Robust Reinforcement Learning using Compact Reshaped Observation Processing
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Philipp Altmann
Fabian Ritz
Leonard Feuchtinger
Jonas Nusslein
Claudia Linnhoff-Popien
Thomy Phan
OOD, OffRL
256
5
0
26 Apr 2023
Optimizing Energy Efficiency in Metro Systems Under Uncertainty Disturbances Using Reinforcement Learning
Haiqin Xie
Cheng Wang
Shicheng Li
Yue J. Zhang
Shanshan Wang
130
0
0
26 Apr 2023
Can Agents Run Relay Race with Strangers? Generalization of RL to Out-of-Distribution Trajectories
International Conference on Learning Representations (ICLR), 2023
Li-Cheng Lan
Huan Zhang
Cho-Jui Hsieh
OODD
219
11
0
26 Apr 2023
Multi-criteria Hardware Trojan Detection: A Reinforcement Learning Approach
Midwest Symposium on Circuits and Systems (MWSCAS), 2023
Amin Sarihi
Peter Jamieson
Ahmad Patooghy
Abdel-Hameed A. Badawy
73
7
0
26 Apr 2023
SCM: Enhancing Large Language Model with Self-Controlled Memory Framework
Bin Wang
Xinnian Liang
Jian Yang
Huijia Huang
Shuangzhi Wu
Peihao Wu
Lu Lu
Zejun Ma
Zhoujun Li
LLMAG, KELM, RALM
380
60
0
26 Apr 2023
Dynamic Datasets and Market Environments for Financial Reinforcement Learning
Machine-mediated learning (ML), 2023
Xiao-Yang Liu
Ziyi Xia
Hongyang Yang
Jiechao Gao
Daochen Zha
Ming Zhu
Chris Wang
Zhaoran Wang
Jian Guo
OffRL
221
35
0
25 Apr 2023
Roll-Drop: accounting for observation noise with a single parameter
Conference on Learning for Dynamics & Control (L4DC), 2023
Luigi Campanaro
D. Martini
Siddhant Gangapurwala
W. Merkt
Ioannis Havoutis
SyDa
215
5
0
25 Apr 2023
The Update-Equivalence Framework for Decision-Time Planning
International Conference on Learning Representations (ICLR), 2023
Samuel Sokota
Gabriele Farina
David J. Wu
Hengyuan Hu
Kevin A. Wang
J. Zico Kolter
Noam Brown
293
5
0
25 Apr 2023
Proximal Curriculum for Reinforcement Learning Agents
Georgios Tzannetos
Bárbara Gomes Ribeiro
Parameswaran Kamalaruban
Adish Singla
230
14
0
25 Apr 2023
Zero-shot Transfer Learning of Driving Policy via Socially Adversarial Traffic Flow
Dongkun Zhang
Jintao Xue
Yuxiang Cui
Yunkai Wang
Eryun Liu
Wei Jing
Junbo Chen
R. Xiong
Yue Wang
233
1
0
25 Apr 2023
Synthesizing Stable Reduced-Order Visuomotor Policies for Nonlinear Systems via Sums-of-Squares Optimization
IEEE Conference on Decision and Control (CDC), 2023
Glen Chou
Russ Tedrake
329
3
0
24 Apr 2023
Stubborn: An Environment for Evaluating Stubbornness between Agents with Aligned Incentives
Ram Rachum
Yonatan Nakar
Reuth Mirsky
58
1
0
24 Apr 2023
Parallel bootstrap-based on-policy deep reinforcement learning for continuous flow control applications
Fluids (Fluids), 2023
J. Viquerat
E. Hachem
174
3
0
24 Apr 2023
Towards Effective and Interpretable Human-Agent Collaboration in MOBA Games: A Communication Perspective
International Conference on Learning Representations (ICLR), 2023
Yiming Gao
Feiyu Liu
Liang Wang
Zhenjie Lian
Weixuan Wang
...
Jiawei Wang
Qiang Fu
Wei Yang
Lanxiao Huang
Wei Liu
176
10
0
23 Apr 2023
Differentiate ChatGPT-generated and Human-written Medical Texts
JMIR Medical Education (JMIR Med Educ), 2023
Wenxiong Liao
Zheng Liu
Haixing Dai
Shaochen Xu
Zihao Wu
...
Xiaoke Huang
Dajiang Zhu
Hongmin Cai
Tianming Liu
Xiang Li
LM&MA, DeLMO, MedIm, AI4MH
165
78
0
23 Apr 2023
LayerNAS: Neural Architecture Search in Polynomial Complexity
Yicheng Fan
Dana Alon
Jingyue Shen
Daiyi Peng
Keshav Kumar
Yun Long
Xin Wang
Fotis Iliopoulos
Da-Cheng Juan
Erik Vee
157
5
0
23 Apr 2023
AutoVRL: A High Fidelity Autonomous Ground Vehicle Simulator for Sim-to-Real Deep Reinforcement Learning
IFAC-PapersOnLine (IFAC-PapersOnLine), 2023
Shathushan Sivashangaran
Apoorva Khairnar
A. Eskandarian
186
8
0
22 Apr 2023
AutoNeRF: Training Implicit Scene Representations with Autonomous Agents
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023
Pierre Marza
L. Matignon
Olivier Simonin
Dhruv Batra
Christian Wolf
Devendra Singh Chaplot
OffRL
219
13
0
21 Apr 2023
A Cubic-regularized Policy Newton Algorithm for Reinforcement Learning
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Mizhaan Prajit Maniyar
Akash Mondal
Prashanth L.A.
S. Bhatnagar
191
4
0
21 Apr 2023