Proximal Policy Optimization Algorithms

20 July 2017

Papers citing "Proximal Policy Optimization Algorithms"

50 / 11,421 papers shown

Efficient Learning of Urban Driving Policies Using Bird's-Eye-View State Representations

Raphael Trumpp

M. Büchner

Abhinav Valada

Marco Caccamo

181

31 May 2023

Dynamic Neighborhood Construction for Structured Large Discrete Action Spaces

International Conference on Learning Representations (ICLR), 2023

332

31 May 2023

Adaptive and Explainable Deployment of Navigation Skills via Hierarchical Deep Reinforcement Learning

IEEE International Conference on Robotics and Automation (ICRA), 2023

Kyowoon Lee

Seongun Kim

Jaesik Choi

224

31 May 2023

Symmetry-Aware Robot Design with Structured Subgroups

International Conference on Machine Learning (ICML), 2023

Heng Dong

Junyu Zhang

Tonghan Wang

Chongjie Zhang

183

31 May 2023

On the Linear Convergence of Policy Gradient under Hadamard Parameterization

Information and Inference: A Journal of the IMA (JIII), 2023

Jiacai Liu

Jinchi Chen

Ke Wei

213

31 May 2023

Representation-Driven Reinforcement Learning

International Conference on Machine Learning (ICML), 2023

Ofir Nabati

Guy Tennenholtz

Shie Mannor

291

31 May 2023

NetHack is Hard to Hack

Neural Information Processing Systems (NeurIPS), 2023

Ulyana Piterbarg

Lerrel Pinto

Rob Fergus

259

30 May 2023

Subequivariant Graph Reinforcement Learning in 3D Environments

International Conference on Machine Learning (ICML), 2023

156

30 May 2023

Improving the performance of Learned Controllers in Behavior Trees using Value Function Estimates at Switching Boundaries

IEEE Robotics and Automation Letters (RA-L), 2023

Mart Kartasev

Petter Ögren

30 May 2023

Policy Optimization for Continuous Reinforcement Learning

Neural Information Processing Systems (NeurIPS), 2023

351

30 May 2023

Perimeter Control Using Deep Reinforcement Learning: A Model-free Approach towards Homogeneous Flow Rate Optimization

Xiaocan Li

Baher Abdulhai

29 May 2023

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Neural Information Processing Systems (NeurIPS), 2023

Christopher D. Manning

Chelsea Finn

ALM

883

6,769

29 May 2023

RLAD: Reinforcement Learning from Pixels for Autonomous Driving in Urban Environments

IEEE Transactions on Automation Science and Engineering (IEEE TASE), 2023

Daniel Coelho

Miguel Oliveira

Vítor M. F. Santos

122

29 May 2023

DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm

International Conference on Machine Learning (ICML), 2023

149

29 May 2023

Chatbots to ChatGPT in a Cybersecurity Space: Evolution, Vulnerabilities, Attacks, Challenges, and Future Recommendations

167

29 May 2023

A Hybrid Framework of Reinforcement Learning and Convex Optimization for UAV-Based Autonomous Metaverse Data Collection

IEEE Network (IEEE Netw.), 2023

Peiyuan Si

Liangxin Qian

Jun Zhao

Kwok-Yan Lam

29 May 2023

Test-Time Adaptation with CLIP Reward for Zero-Shot Generalization in Vision-Language Models

International Conference on Learning Representations (ICLR), 2023

314

29 May 2023

Continual Task Allocation in Meta-Policy Network via Sparse Prompting

International Conference on Machine Learning (ICML), 2023

306

29 May 2023

Toward Fine Contact Interactions: Learning to Control Normal Contact Force with Limited Information

IEEE International Conference on Robotics and Automation (ICRA), 2023

126

29 May 2023

RL + Model-based Control: Using On-demand Optimal Control to Learn Versatile Legged Locomotion

IEEE Robotics and Automation Letters (RA-L), 2023

268

29 May 2023

Interpretable Reward Redistribution in Reinforcement Learning: A Causal Approach

Neural Information Processing Systems (NeurIPS), 2023

Jun Wang

263

28 May 2023

Evolving Connectivity for Recurrent Spiking Neural Networks

Neural Information Processing Systems (NeurIPS), 2023

136

28 May 2023

On the Value of Myopic Behavior in Policy Reuse

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023

Zhen Wang

Xuelong Li

182

28 May 2023

Online Nonstochastic Model-Free Reinforcement Learning

Neural Information Processing Systems (NeurIPS), 2023

286

27 May 2023

Is Centralized Training with Decentralized Execution Framework Centralized Enough for MARL?

342

27 May 2023

Self-Supervised Reinforcement Learning that Transfers using Random Features

Neural Information Processing Systems (NeurIPS), 2023

Abhishek Gupta

247

26 May 2023

NASimEmu: Network Attack Simulator & Emulator for Training Agents Generalizing to Novel Scenarios

Jaromír Janisch

Tomáš Pevný

Viliam Lisý

192

26 May 2023

A Model-Based Solution to the Offline Multi-Agent Reinforcement Learning Coordination Problem

Adaptive Agents and Multi-Agent Systems (AAMAS), 2023

231

26 May 2023

Inferring the Future by Imagining the Past

Neural Information Processing Systems (NeurIPS), 2023

Kartik Chandra

Tony Chen

Tzu-Mao Li

Jonathan Ragan-Kelley

J. Tenenbaum

223

26 May 2023

IndustReal: Transferring Contact-Rich Assembly Tasks from Simulation to Reality

234

26 May 2023

A Hierarchical Approach to Population Training for Human-AI Collaboration

International Joint Conference on Artificial Intelligence (IJCAI), 2023

Yi Loo

Chen Gong

Malika Meghjani

211

26 May 2023

A Reminder of its Brittleness: Language Reward Shaping May Hinder Learning for Instruction Following Agents

Sukai Huang

Nir Lipovetzky

Trevor Cohn

266

26 May 2023

Physics-Regulated Deep Reinforcement Learning: Invariant Embeddings

International Conference on Learning Representations (ICLR), 2023

233

26 May 2023

Emergent Agentic Transformer from Chain of Hindsight Experience

International Conference on Machine Learning (ICML), 2023

Hao Liu

Pieter Abbeel

OffRL

247

26 May 2023

Counterfactual Explainer Framework for Deep Reinforcement Learning Models Using Policy Distillation

251

25 May 2023

Coarse-Tuning Models of Code with Reinforcement Learning Feedback

Abhinav C. P. Jain

Chima Adiole

Swarat Chaudhuri

Thomas W. Reps

Chris Jermaine

ALM

150

25 May 2023

Learning When to Ask for Help: Efficient Interactive Navigation via Implicit Uncertainty Estimation

IEEE International Conference on Robotics and Automation (ICRA), 2023

Ifueko Igbinedion

S. Karaman

287

25 May 2023

Voyager: An Open-Ended Embodied Agent with Large Language Models

Linxi Fan

475

1,192

25 May 2023

DPOK: Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models

Neural Information Processing Systems (NeurIPS), 2023

Pieter Abbeel

414

285

25 May 2023

Generating Synergistic Formulaic Alpha Collections via Reinforcement Learning

Knowledge Discovery and Data Mining (KDD), 2023

224

25 May 2023

End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes

Neural Information Processing Systems (NeurIPS), 2023

444

25 May 2023

All Points Matter: Entropy-Regularized Distribution Alignment for Weakly-supervised 3D Segmentation

Neural Information Processing Systems (NeurIPS), 2023

Liyao Tang

264

25 May 2023

Lucy-SKG: Learning to Play Rocket League Efficiently Using Deep Reinforcement Learning

25 May 2023

PROTO: Iterative Policy Regularized Offline-to-Online Reinforcement Learning

276

25 May 2023

Control invariant set enhanced safe reinforcement learning: improved sampling efficiency, guaranteed stability and robustness

Computers and Chemical Engineering (Comput. Chem. Eng.), 2023

151

24 May 2023

Harnessing the Power of Large Language Models for Natural Language to First-Order Logic Translation

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Yuan Yang

Siheng Xiong

Ali Payani

Ehsan Shareghi

Faramarz Fekri

LRM

152

24 May 2023

The Larger They Are, the Harder They Fail: Language Models do not Recognize Identifier Swaps in Python

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Antonio Valerio Miceli Barone

Fazl Barez

Ioannis Konstas

Shay B. Cohen

149

24 May 2023

Inverse Preference Learning: Preference-based RL without a Reward Function

Neural Information Processing Systems (NeurIPS), 2023

Joey Hejna

Dorsa Sadigh

OffRL

300

24 May 2023

Decision-Aware Actor-Critic with Function Approximation and Theoretical Guarantees

Neural Information Processing Systems (NeurIPS), 2023

Nicolas Le Roux

322

24 May 2023

Neural Lyapunov and Optimal Control

Daniel Layeghi

Steve Tonneau

M. Mistry

218

24 May 2023