Proximal Policy Optimization Algorithms

20 July 2017

Papers citing "Proximal Policy Optimization Algorithms"

50 / 11,421 papers shown

Low-Switching Policy Gradient with Exploration via Online Sensitivity Sampling

International Conference on Machine Learning (ICML), 2023

210

15 Jun 2023

Hierarchical Planning and Control for Box Loco-Manipulation

Proceedings of the ACM on Computer Graphics and Interactive Techniques (PACMCGIT), 2023

274

15 Jun 2023

Recurrent Action Transformer with Memory

393

15 Jun 2023

Inroads into Autonomous Network Defence using Explained Reinforcement Learning

257

15 Jun 2023

Semantic HELM: A Human-Readable Memory for Reinforcement Learning

Neural Information Processing Systems (NeurIPS), 2023

300

15 Jun 2023

Matching Pairs: Attributing Fine-Tuned Models to their Pre-Trained Large Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

358

15 Jun 2023

Datasets and Benchmarks for Offline Safe Reinforcement Learning

...

Wenhao Yu

Tingnan Zhang

Jie Tan

Ding Zhao

OffRL

303

15 Jun 2023

Generalizable Resource Scaling of 5G Slices using Constrained Reinforcement Learning

IEEE/IFIP Network Operations and Management Symposium (NOMS), 2023

156

15 Jun 2023

Optimal Exploration for Model-Based RL in Nonlinear Systems

Neural Information Processing Systems (NeurIPS), 2023

Andrew Wagenmaker

Guanya Shi

Kevin Jamieson

257

15 Jun 2023

Predictive Maneuver Planning with Deep Reinforcement Learning (PMP-DRL) for comfortable and safe autonomous driving

Jayabrata Chowdhury

Vishruth Veerendranath

Suresh Sundaram

N. Sundararajan

117

15 Jun 2023

Towards Benchmarking and Improving the Temporal Reasoning Capability of Large Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Qingyu Tan

Hwee Tou Ng

Lidong Bing

LRM

346

15 Jun 2023

ArchGym: An Open-Source Gymnasium for Machine Learning Assisted Architecture Design

International Symposium on Computer Architecture (ISCA), 2023

...

242

15 Jun 2023

Deep Generative Models for Decision-Making and Control

Michael Janner

292

15 Jun 2023

DiAReL: Reinforcement Learning with Disturbance Awareness for Robust Sim2Real Policy Transfer in Robot Control

349

15 Jun 2023

Integrating machine learning paradigms and mixed-integer model predictive control for irrigation scheduling

14 Jun 2023

OCAtari: Object-Centric Atari 2600 Reinforcement Learning Environments

Quentin Delfosse

Johannes Czech

Bjarne Gregori

Sebastian Sztwiertnia

Kristian Kersting

498

14 Jun 2023

Towards AGI in Computer Vision: Lessons Learned from GPT and Large Language Models

Xiaotao Gu

254

14 Jun 2023

Hierarchical Task Network Planning for Facilitating Cooperative Multi-Agent Reinforcement Learning

Xuechen Mu

H. Zhuo

Chong Chen

Kai Zhang

Chao Yu

Jianye Hao

226

14 Jun 2023

A reinforcement learning strategy for p-adaptation in high order solvers

Results in Engineering (RE), 2023

120

14 Jun 2023

MiniLLM: Knowledge Distillation of Large Language Models

International Conference on Learning Representations (ICLR), 2023

640

14 Jun 2023

Multi-market Energy Optimization with Renewables via Reinforcement Learning

Lucien Werner

Peeyush Kumar

13 Jun 2023

AutoML in the Age of Large Language Models: Current Challenges, Future Opportunities and Risks

...

Daphne Theodorakopoulos

Tanja Tornede

Henning Wachsmuth

Marius Lindauer

325

13 Jun 2023

Can ChatGPT Enable ITS? The Case of Mixed Traffic Control via Reinforcement Learning

Michael Villarreal

Bibek Poudel

Weizi Li

212

13 Jun 2023

Stepsize Learning for Policy Gradient Methods in Contextual Markov Decision Processes

Luca Sabbioni

Francesco Corda

Marcello Restelli

183

13 Jun 2023

Multi-Robot Motion Planning: A Learning-Based Artificial Potential Field Solution

Dengyu Zhang

Guo-Niu Zhu

Qingrui Zhang

203

13 Jun 2023

SayTap: Language to Quadrupedal Locomotion

Conference on Robot Learning (CoRL), 2023

Yujin Tang

Wenhao Yu

Jie Tan

Heiga Zen

Aleksandra Faust

Tatsuya Harada

304

13 Jun 2023

DenseLight: Efficient Control for Large-scale Traffic Signals with Dense Feedback

International Joint Conference on Artificial Intelligence (IJCAI), 2023

Yang Liu

144

13 Jun 2023

Galactic: Scaling End-to-End Reinforcement Learning for Rearrangement at 100k Steps-Per-Second

Computer Vision and Pattern Recognition (CVPR), 2023

Vincent-Pierre Berges

Andrew Szot

Devendra Singh Chaplot

246

13 Jun 2023

Unified Off-Policy Learning to Rank: a Reinforcement Learning Perspective

Neural Information Processing Systems (NeurIPS), 2023

Mengdi Wang

382

13 Jun 2023

Robust Reinforcement Learning through Efficient Adversarial Herding

165

12 Jun 2023

Online Prototype Alignment for Few-shot Policy Transfer

International Conference on Machine Learning (ICML), 2023

...

233

12 Jun 2023

Multi-Agent Reinforcement Learning Guided by Signal Temporal Logic Specifications

262

11 Jun 2023

Zero-Shot Wireless Indoor Navigation through Physics-Informed Reinforcement Learning

IEEE Open Journal of the Communications Society (JOCS), 2023

249

11 Jun 2023

CoTran: An LLM-based Code Translator using Reinforcement Learning with Feedback from Compiler and Symbolic Execution

European Conference on Artificial Intelligence (ECAI), 2023

363

11 Jun 2023

Reinforcement Learning with Parameterized Manipulation Primitives for Robotic Assembly

N. Vuong

Quang Pham

158

11 Jun 2023

Contact Reduction with Bounded Stiffness for Robust Sim-to-Real Transfer of Robot Assembly

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023

N. Vuong

Quang Pham

148

11 Jun 2023

A Single-Loop Deep Actor-Critic Algorithm for Constrained Reinforcement Learning with Provable Convergence

Kexuan Wang

An Liu

Baishuo Liu

166

10 Jun 2023

Long-term Microscopic Traffic Simulation with History-Masked Multi-agent Imitation Learning

143

10 Jun 2023

How to Learn and Generalize From Three Minutes of Data: Physics-Constrained and Uncertainty-Aware Neural Stochastic Differential Equations

Conference on Robot Learning (CoRL), 2023

Franck Djeumou

Cyrus Neary

Ufuk Topcu

DiffM

275

10 Jun 2023

iPLAN: Intent-Aware Planning in Heterogeneous Traffic via Distributed Multi-Agent Reinforcement Learning

364

09 Jun 2023

Combining a Meta-Policy and Monte-Carlo Planning for Scalable Type-Based Reasoning in Partially Observable Environments

Jonathon Schwartz

H. Kurniawati

Marcus Hutter

OffRL LRM

171

09 Jun 2023

Approximate information state based convergence analysis of recurrent Q-learning

188

09 Jun 2023

Bring Your Own (Non-Robust) Algorithm to Solve Robust MDPs by Estimating The Worst Kernel

International Conference on Machine Learning (ICML), 2023

297

09 Jun 2023

An End-to-End Reinforcement Learning Approach for Job-Shop Scheduling Problems Based on Constraint Programming

International Conference on Automated Planning and Scheduling (ICAPS), 2023

Pierre Tassel

Martin Gebser

Konstantin Schekotihin

103

09 Jun 2023

Finite-Time Analysis of Minimax Q-Learning for Two-Player Zero-Sum Markov Games: Switching System Approach

Dong-hwan Lee

248

09 Jun 2023

QuestEnvSim: Environment-Aware Simulated Motion Tracking from Sparse Sensors

International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH), 2023

209

09 Jun 2023

Robustness Testing for Multi-Agent Reinforcement Learning: State Perturbations on Critical Agents

European Conference on Artificial Intelligence (ECAI), 2023

Ziyuan Zhou

Guanjun Liu

AAML

160

09 Jun 2023

A newborn embodied Turing test for view-invariant object recognition

Annual Meeting of the Cognitive Science Society (CogSci), 2023

142

08 Jun 2023

ChatGPT is fun, but it is not funny! Humor is still challenging Large Language Models

Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (WASSA), 2023

Sophie F. Jentzsch

Kristian Kersting

LRM

154

07 Jun 2023

Long-form analogies generated by chatGPT lack human-like psycholinguistic properties

Annual Meeting of the Cognitive Science Society (CogSci), 2023

S. M. Seals

V. Shalin

169

07 Jun 2023