Proximal Policy Optimization Algorithms

20 July 2017
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
    OffRL

Papers citing "Proximal Policy Optimization Algorithms"

50 / 11,424 papers shown
PyTAG: Challenges and Opportunities for Reinforcement Learning in Tabletop Games
Martin Balla
G. E. Long
Dominik Jeurissen
J. Goodman
Raluca D. Gaina
Diego Perez-Liebana
LMTD OffRL OnRL
203
3
0
19 Jul 2023
Amortised Design Optimization for Item Response Theory
International Conference on Artificial Intelligence in Education (AIED), 2023
Antti Keurulainen
Isak Westerlund
Oskar Keurulainen
Andrew Howes
147
0
0
19 Jul 2023
Amortised Experimental Design and Parameter Estimation for User Models of Pointing
International Conference on Human Factors in Computing Systems (CHI), 2023
Antti Keurulainen
Isak Westerlund
Oskar Keurulainen
Andrew Howes
203
7
0
19 Jul 2023
Reinforcement Learning for Credit Index Option Hedging
Francesco Mandelli
Marco Pinciroli
Michele Trapletti
Edoardo Vittori
96
3
0
19 Jul 2023
Scaling Laws for Imitation Learning in Single-Agent Games
Jens Tuyls
Dhruv Madeka
Kari Torkkola
Dean Phillips Foster
Karthik Narasimhan
Sham Kakade
277
8
0
18 Jul 2023
Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron
Louis Martin
Kevin R. Stone
Peter Albert
Amjad Almahairi
...
Sharan Narang
Aurelien Rodriguez
Robert Stojnic
Sergey Edunov
Thomas Scialom
AI4MH ALM
8.4K
15,388
0
18 Jul 2023
Task Space Control of Hydraulic Construction Machines using Reinforcement Learning
International Workshop on Human Friendly Robotics (HFR), 2023
Hyung-Joo Lee
S. Brell-Çokcan
111
0
0
18 Jul 2023
Learning Dynamic Attribute-factored World Models for Efficient Multi-object Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
Fan Feng
Sara Magliacane
OffRL OCL
253
15
0
18 Jul 2023
REX: Rapid Exploration and eXploitation for AI Agents
Rithesh Murthy
Shelby Heinecke
Juan Carlos Niebles
Zhiwei Liu
Le Xue
...
Ran Xu
P. Mùi
Haiquan Wang
Caiming Xiong
Silvio Savarese
OffRL
238
12
0
18 Jul 2023
IxDRL: A Novel Explainable Deep Reinforcement Learning Toolkit based on Analyses of Interestingness
Pedro Sequeira
Melinda Gervasio
186
2
0
18 Jul 2023
Natural Actor-Critic for Robust Reinforcement Learning with Function Approximation
Neural Information Processing Systems (NeurIPS), 2023
Ruida Zhou
Tao-Wen Liu
Min Cheng
D. Kalathil
P. R. Kumar
Chao Tian
379
39
0
17 Jul 2023
An Alternative to Variance: Gini Deviation for Risk-averse Policy Gradient
Neural Information Processing Systems (NeurIPS), 2023
Yudong Luo
Guiliang Liu
Pascal Poupart
Yangchen Pan
352
12
0
17 Jul 2023
Accelerating Cutting-Plane Algorithms via Reinforcement Learning Surrogates
AAAI Conference on Artificial Intelligence (AAAI), 2023
Kyle Mana
Fernando Acero
Stephen Mak
Parisa Zehtabi
Michael Cashmore
Daniele Magazzeni
Manuela Veloso
191
0
0
17 Jul 2023
Quarl: A Learning-Based Quantum Circuit Optimizer
Zikun Li
Jin-Ye Peng
Yixuan Mei
Sina Lin
Yi Wu
Oded Padon
Zhi-Long Jia
110
35
0
17 Jul 2023
Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations
International Conference on Machine Learning (ICML), 2023
Yanda Chen
Ruiqi Zhong
Narutatsu Ri
Chen Zhao
He He
Jacob Steinhardt
Zhou Yu
Kathleen McKeown
LRM
234
76
0
17 Jul 2023
CoAD: Automatic Diagnosis through Symptom and Disease Collaborative Generation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Huimin Wang
Wai-Chung Kwan
Kam-Fai Wong
Yefeng Zheng
MedIm
187
12
0
17 Jul 2023
Towards Self-Assembling Artificial Neural Networks through Neural Developmental Programs
Elias Najarro
Shyam Sudhakaran
S. Risi
174
22
0
17 Jul 2023
Enabling Efficient, Reliable Real-World Reinforcement Learning with Approximate Physics-Based Models
Conference on Robot Learning (CoRL), 2023
T. Westenbroek
Jacob Levy
David Fridovich-Keil
235
0
0
16 Jul 2023
POMDP inference and robust solution via deep reinforcement learning: An application to railway optimal maintenance
Machine-mediated learning (ML), 2023
Giacomo Arcieri
C. Hoelzl
Oliver Schwery
D. Štraub
K. Papakonstantinou
Eleni Chatzi
188
28
0
16 Jul 2023
The SocialAI School: Insights from Developmental Psychology Towards Artificial Socio-Cultural Agents
Grgur Kovač
Rémy Portelas
Peter Ford Dominey
Pierre-Yves Oudeyer
177
28
0
15 Jul 2023
SafeDreamer: Safe Reinforcement Learning with World Models
International Conference on Learning Representations (ICLR), 2023
Weidong Huang
Jiaming Ji
Borong Zhang
Chunhe Xia
Yao-Chun Yang
OffRL
206
35
0
14 Jul 2023
Robotic Manipulation Datasets for Offline Compositional Reinforcement Learning
Marcel Hussing
Jorge Armando Mendez Mendez
Anisha Singrodia
Cassandra Kent
Eric Eaton
OffRL
339
9
0
13 Jul 2023
Learning Multiple Coordinated Agents under Directed Acyclic Graph Constraints
Jaeyeon Jang
Diego Klabjan
Han Liu
Nital S. Patel
Xiuqi Li
Balakrishnan Ananthanarayanan
Husam Dauod
Tzung-Han Juang
109
1
0
13 Jul 2023
Why Guided Dialog Policy Learning performs well? Understanding the role of adversarial learning and its alternative
Sho Shimoyama
Tetsuro Morimura
Kenshi Abe
Toda Takamichi
Yuta Tomomatsu
Masakazu Sugiyama
Asahi Hentona
Yuuki Azuma
Hirotaka Ninomiya
OffRL
140
1
0
13 Jul 2023
Aeolus Ocean -- A simulation environment for the autonomous COLREG-compliant navigation of Unmanned Surface Vehicles using Deep Reinforcement Learning and Maritime Object Detection
A. Vekinis
S. Perantonis
213
1
0
13 Jul 2023
Prescriptive Process Monitoring Under Resource Constraints: A Reinforcement Learning Approach
Mahmoud Shoush
Marlon Dumas
260
7
0
13 Jul 2023
Bi-Touch: Bimanual Tactile Manipulation with Sim-to-Real Deep Reinforcement Learning
IEEE Robotics and Automation Letters (RA-L), 2023
Yijiong Lin
Alex Church
Max Yang
Haoran Li
John Lloyd
Dandan Zhang
Nathan Lepora
237
45
0
12 Jul 2023
Learning Decentralized Partially Observable Mean Field Control for Artificial Collective Behavior
International Conference on Learning Representations (ICLR), 2023
Kai Cui
Sascha H. Hauck
Christian Fabian
Heinz Koeppl
316
10
0
12 Jul 2023
Maneuver Decision-Making Through Automatic Curriculum Reinforcement Learning Without Handcrafted Reward functions
Applied Sciences (Appl. Sci.), 2023
Hong-Peng Zhang
136
5
0
12 Jul 2023
Learning Hierarchical Interactive Multi-Object Search for Mobile Manipulation
IEEE Robotics and Automation Letters (RA-L), 2023
F. Schmalstieg
Daniel Honerkamp
Tim Welschehold
Abhinav Valada
412
25
0
12 Jul 2023
Transformers in Reinforcement Learning: A Survey
Pranav Agarwal
A. Rahman
P. St-Charles
Simon J. D. Prince
Samira Ebrahimi Kahou
OffRL
252
26
0
12 Jul 2023
Automatically Reconciling the Trade-off between Prediction Accuracy and Earliness in Prescriptive Business Process Monitoring
Information Systems (Inf. Syst.), 2023
Andreas Metzger
Tristan Kley
Aristide Rothweiler
Klaus Pohl
168
7
0
12 Jul 2023
Prompt Generate Train (PGT): Few-shot Domain Adaption of Retrieval Augmented Generation Models for Open Book Question-Answering
C. Krishna
RALM
145
1
0
12 Jul 2023
PID-Inspired Inductive Biases for Deep Reinforcement Learning in Partially Observable Control Tasks
Neural Information Processing Systems (NeurIPS), 2023
I. Char
J. Schneider
265
7
0
12 Jul 2023
Grid Cell-Inspired Fragmentation and Recall for Efficient Map Building
Jaedong Hwang
Zhang-Wei Hong
Eric Chen
Akhilan Boopathy
Pulkit Agrawal
Ila Fiete
258
3
0
11 Jul 2023
A Survey From Distributed Machine Learning to Distributed Deep Learning
Mohammad Dehghani
Zahra Yazdanparast
314
0
0
11 Jul 2023
Secrets of RLHF in Large Language Models Part I: PPO
Rui Zheng
Jiajun Sun
Songyang Gao
Yuan Hua
Wei Shen
...
Hang Yan
Tao Gui
Xipeng Qiu
Xuanjing Huang
ALM OffRL
328
236
0
11 Jul 2023
BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset
Neural Information Processing Systems (NeurIPS), 2023
Jiaming Ji
Mickel Liu
Juntao Dai
Xuehai Pan
Chi Zhang
Ce Bian
Chi Zhang
Ruiyang Sun
Yizhou Wang
Yaodong Yang
ALM
406
724
0
10 Jul 2023
Assessing the efficacy of large language models in generating accurate teacher responses
Workshop on Innovative Use of NLP for Building Educational Applications (UNBEA), 2023
Yann Hicke
Abhishek Masand
Wentao Guo
Tushaar Gangavarapu
ELM AI4Ed
165
13
0
09 Jul 2023
ScriptWorld: Text Based Environment For Learning Procedural Knowledge
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Abhinav Joshi
A. Ahmad
Umang Pandey
Ashutosh Modi
175
8
0
08 Jul 2023
MARBLER: An Open Platform for Standardized Evaluation of Multi-Robot Reinforcement Learning Algorithms
International Symposium on Multi-Robot and Multi-Agent Systems (MRS), 2023
R. Torbati
Shubbham Lohiya
Shivika Singh
Meher Shashwat Nigam
Harish Ravichandar
355
3
0
08 Jul 2023
RADAR: Robust AI-Text Detection via Adversarial Learning
Neural Information Processing Systems (NeurIPS), 2023
Xiaomeng Hu
Pin-Yu Chen
Tsung-Yi Ho
DeLMO
339
197
0
07 Jul 2023
SpawnNet: Learning Generalizable Visuomotor Skills from Pre-trained Networks
IEEE International Conference on Robotics and Automation (ICRA), 2023
Xingyu Lin
John So
Sashwat Mahalingam
Fangchen Liu
Pieter Abbeel
SSL
334
34
0
07 Jul 2023
Discovering Hierarchical Achievements in Reinforcement Learning via Contrastive Learning
Neural Information Processing Systems (NeurIPS), 2023
Seungyong Moon
Junyoung Yeom
Bumsoo Park
Hyun Oh Song
OffRL
398
8
0
07 Jul 2023
Push Past Green: Learning to Look Behind Plant Foliage by Moving It
Conference on Robot Learning (CoRL), 2023
Xiaoyun Zhang
Saurabh Gupta
334
6
0
06 Jul 2023
Learning Multi-Agent Intention-Aware Communication for Optimal Multi-Order Execution in Finance
Knowledge Discovery and Data Mining (KDD), 2023
Yuchen Fang
Zhen-Yu Tang
Kan Ren
Yuante Li
Li Zhao
Jiang Bian
Dongsheng Li
Weinan Zhang
Yong Yu
Tie-Yan Liu
171
13
0
06 Jul 2023
Sequential Neural Barriers for Scalable Dynamic Obstacle Avoidance
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023
Hong-Den Yu
Chiaki Hirayama
Chenning Yu
Sylvia Herbert
Sicun Gao
225
22
0
06 Jul 2023
ContainerGym: A Real-World Reinforcement Learning Benchmark for Resource Allocation
International Conference on Machine Learning, Optimization, and Data Science (MOD), 2023
Abhijeet Pendyala
Justin Dettmer
Tobias Glasmachers
Asma Atamna
OffRL
111
8
0
06 Jul 2023
A Neuromorphic Architecture for Reinforcement Learning from Real-Valued Observations
IEEE International Joint Conference on Neural Networks (IJCNN), 2023
Sergio Chevtchenko
Y. Bethi
Teresa B Ludermir
Saeed Afshar
OffRL
283
1
0
06 Jul 2023
Safe & Accurate at Speed with Tendons: A Robot Arm for Exploring Dynamic Motion
Simon Guist
Jan Schneider
Hao Ma
Tianyu Cui
V. Berenz
...
Felix Gruninger
M. Muhlebach
J. Fiene
Bernhard Schölkopf
Le Chen
319
10
0
05 Jul 2023