1707.06347
Proximal Policy Optimization Algorithms
20 July 2017
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
Papers citing
"Proximal Policy Optimization Algorithms"
50 / 11,421 papers shown
Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards
Neural Information Processing Systems (NeurIPS), 2023
Alexandre Ramé
Guillaume Couairon
Mustafa Shukor
Corentin Dancette
Jean-Baptiste Gaya
Laure Soulier
Matthieu Cord
MoMe
360
202
0
07 Jun 2023
Dual policy as self-model for planning
Journal of Korean institute of intelligent systems (JKIIS), 2023
J. Yoo
Fernanda De La Torre
G. R. Yang
167
1
0
07 Jun 2023
Balancing of competitive two-player Game Levels with Reinforcement Learning
Florian Rupp
Manuel Eberhardinger
Kai Eckert
156
9
0
07 Jun 2023
Fairness-Sensitive Policy-Gradient Reinforcement Learning for Reducing Bias in Robotic Assistance
Jie Zhu
Mengsha Hu
Xueyao Liang
Amy Zhang
Ruoming Jin
Rui Liu
166
1
0
07 Jun 2023
Adaptive Frequency Green Light Optimal Speed Advisory based on Hybrid Actor-Critic Reinforcement Learning
Mingle Xu
Dongyu Zuo
66
2
0
07 Jun 2023
Learning with a Mole: Transferable latent spatial representations for navigation without reconstruction
International Conference on Learning Representations (ICLR), 2023
G. Bono
L. Antsfeld
Assem Sadek
G. Monaci
Christian Wolf
SSL
315
8
0
06 Jun 2023
Enabling Intelligent Interactions between an Agent and an LLM: A Reinforcement Learning Approach
Bin-Bin Hu
Chenyang Zhao
Pushi Zhang
Zihao Zhou
Yuanhang Yang
Zenglin Xu
Yinan Han
LM&Ro
LLMAG
605
32
0
06 Jun 2023
State Regularized Policy Optimization on Data with Dynamics Shift
Neural Information Processing Systems (NeurIPS), 2023
Zhenghai Xue
Qingpeng Cai
Shuchang Liu
Dong Zheng
Peng Jiang
Kun Gai
Bo An
OffRL
369
25
0
06 Jun 2023
RLtools: A Fast, Portable Deep Reinforcement Learning Library for Continuous Control
Jonas Eschmann
Dario Albani
Giuseppe Loianno
OffRL
364
7
0
06 Jun 2023
A Grasp Pose is All You Need: Learning Multi-fingered Grasping with Deep Reinforcement Learning from Vision and Touch
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023
Federico Ceola
Elisa Maiettini
Lorenzo Rosasco
Lorenzo Natale
219
6
0
06 Jun 2023
Learning Embeddings for Sequential Tasks Using Population of Agents
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Mridul Mahajan
Georgios Tzannetos
Goran Radanović
Adish Singla
FedML
262
1
0
05 Jun 2023
Explore to Generalize in Zero-Shot RL
Neural Information Processing Systems (NeurIPS), 2023
E. Zisselman
Itai Lavie
Daniel Soudry
Aviv Tamar
323
20
0
05 Jun 2023
Tackling Cooperative Incompatibility for Zero-Shot Human-AI Coordination
Journal of Artificial Intelligence Research (JAIR), 2023
Yang Li
Shao Zhang
Jichen Sun
Wenhao Zhang
Yali Du
Ying Wen
Xinbing Wang
Wei Pan
295
24
0
05 Jun 2023
Action-Evolution Petri Nets: a Framework for Modeling and Solving Dynamic Task Assignment Problems
International Conference on Business Process Management (BPM), 2023
R. Bianco
R. Dijkman
Wim P. M. Nuijten
W. Jaarsveld
127
7
0
05 Jun 2023
Seizing Serendipity: Exploiting the Value of Past Success in Off-Policy Actor-Critic
International Conference on Machine Learning (ICML), 2023
Tianying Ji
Yuping Luo
Gang Hua
Xianyuan Zhan
Jianwei Zhang
Huazhe Xu
OffRL
OnRL
408
21
0
05 Jun 2023
Tackling Non-Stationarity in Reinforcement Learning via Causal-Origin Representation
International Conference on Machine Learning (ICML), 2023
Wanpeng Zhang
Yilin Li
Boyu Yang
Zongqing Lu
CML
281
3
0
05 Jun 2023
For SALE: State-Action Representation Learning for Deep Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
Scott Fujimoto
Wei-Di Chang
Edward James Smith
S. Gu
Doina Precup
David Meger
OffRL
357
85
0
04 Jun 2023
Bad Habits: Policy Confounding and Out-of-Trajectory Generalization in RL
Miguel Suau
M. Spaan
F. Oliehoek
CML
315
9
0
04 Jun 2023
ContraBAR: Contrastive Bayes-Adaptive Deep RL
International Conference on Machine Learning (ICML), 2023
Era Choshen
Aviv Tamar
BDL
OffRL
183
10
0
04 Jun 2023
Fine-Tuning Language Models with Advantage-Induced Policy Alignment
Banghua Zhu
Hiteshi Sharma
Felipe Vieira Frujeri
Shi Dong
Chenguang Zhu
Michael I. Jordan
Jiantao Jiao
OSLM
280
48
0
04 Jun 2023
Cycle Consistency Driven Object Discovery
International Conference on Learning Representations (ICLR), 2023
Aniket Didolkar
Anirudh Goyal
Yoshua Bengio
OCL
343
10
0
03 Jun 2023
MA2CL: Masked Attentive Contrastive Learning for Multi-Agent Reinforcement Learning
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Haolin Song
Ming Feng
Wen-gang Zhou
Houqiang Li
OffRL
162
11
0
03 Jun 2023
Synaptic motor adaptation: A three-factor learning rule for adaptive robotic control in spiking neural networks
International Conference on Systems (ICONS), 2023
Samuel Schmidgall
Joe Hays
245
6
0
02 Jun 2023
Learning to Stabilize Online Reinforcement Learning in Unbounded State Spaces
International Conference on Machine Learning (ICML), 2023
Brahma S. Pavse
M. Zurek
Yudong Chen
Qiaomin Xie
Josiah P. Hanna
OffRL
360
2
0
02 Jun 2023
Reinforcement Learning with General Utilities: Simpler Variance Reduction and Large State-Action Space
International Conference on Machine Learning (ICML), 2023
Anas Barakat
Ilyas Fatkhullin
Niao He
219
14
0
02 Jun 2023
PAGAR: Taming Reward Misalignment in Inverse Reinforcement Learning-Based Imitation Learning with Protagonist Antagonist Guided Adversarial Reward
Weichao Zhou
Wenchao Li
275
0
0
02 Jun 2023
OMNI: Open-endedness via Models of human Notions of Interestingness
Jenny Zhang
Joel Lehman
Kenneth O. Stanley
Jeff Clune
LRM
439
52
0
02 Jun 2023
Fine-Grained Human Feedback Gives Better Rewards for Language Model Training
Neural Information Processing Systems (NeurIPS), 2023
Zeqiu Wu
Yushi Hu
Weijia Shi
Nouha Dziri
Alane Suhr
Prithviraj Ammanabrolu
Noah A. Smith
Mari Ostendorf
Hannaneh Hajishirzi
ALM
465
417
0
02 Jun 2023
EmoUS: Simulating User Emotions in Task-Oriented Dialogues
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2023
Hsien-chin Lin
Shutong Feng
Christian Geishauser
Nurul Lubis
Carel van Niekerk
Michael Heck
Benjamin Ruppik
Renato Vukovic
Milica Gašić
122
15
0
02 Jun 2023
ReLU to the Rescue: Improve Your On-Policy Actor-Critic with Positive Advantages
International Conference on Machine Learning (ICML), 2023
Andrew Jesson
Chris Xiaoxuan Lu
Gunshi Gupta
Angelos Filos
Jakob N. Foerster
Y. Gal
OffRL
361
9
0
02 Jun 2023
Deep Q-Learning versus Proximal Policy Optimization: Performance Comparison in a Material Sorting Task
International Symposium on Industrial Electronics (ISIE), 2023
Reuf Kozlica
S. Wegenkittl
Simon Hirlaender
OffRL
119
13
0
02 Jun 2023
Interpretable and Explainable Logical Policies via Neurally Guided Symbolic Abstraction
Neural Information Processing Systems (NeurIPS), 2023
Quentin Delfosse
Hikaru Shindo
Devendra Singh Dhami
Kristian Kersting
330
53
0
02 Jun 2023
ChatGPT for Zero-shot Dialogue State Tracking: A Solution or an Opportunity?
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Michael Heck
Nurul Lubis
Benjamin Ruppik
Renato Vukovic
Shutong Feng
Christian Geishauser
Hsien-chin Lin
Carel van Niekerk
Milica Gašić
215
54
0
02 Jun 2023
Hyperparameters in Reinforcement Learning and How To Tune Them
International Conference on Machine Learning (ICML), 2023
Theresa Eimer
Marius Lindauer
Roberta Raileanu
OffRL
425
71
0
02 Jun 2023
Symmetric Replay Training: Enhancing Sample Efficiency in Deep Reinforcement Learning for Combinatorial Optimization
International Conference on Machine Learning (ICML), 2023
Hyeon-Seob Kim
Minsu Kim
SungSoo Ahn
Jinkyoo Park
OffRL
443
9
0
02 Jun 2023
Heterogeneous Knowledge for Augmented Modular Reinforcement Learning
Lorenz Wolf
Mirco Musolesi
OffRL
233
0
0
01 Jun 2023
Investigating Navigation Strategies in the Morris Water Maze through Deep Reinforcement Learning
Neural Networks (Neural Netw.), 2023
A. Liu
Alla Borisyuk
277
12
0
01 Jun 2023
Extracting Reward Functions from Diffusion Models
Neural Information Processing Systems (NeurIPS), 2023
Felipe Nuti
Tim Franzmeyer
João F. Henriques
198
19
0
01 Jun 2023
Chaos persists in large-scale multi-agent learning despite adaptive learning rates
Emmanouil-Vasileios Vlatakis-Gkaragkounis
Lampros Flokas
Georgios Piliouras
245
1
0
01 Jun 2023
Normalization Enhances Generalization in Visual Reinforcement Learning
Adaptive Agents and Multi-Agent Systems (AAMAS), 2023
Lu Li
Jiafei Lyu
Guozheng Ma
Zilin Wang
Zhen Yang
Xiu Li
Zhiheng Li
OOD
202
12
0
01 Jun 2023
TorchRL: A data-driven decision-making library for PyTorch
International Conference on Learning Representations (ICLR), 2023
Albert Bou
Matteo Bettini
Sebastian Dittert
Vikash Kumar
Shagun Sodhani
Xiaomeng Yang
Gianni De Fabritiis
Vincent Moens
OffRL
AI4CE
309
65
0
01 Jun 2023
Interactive Character Control with Auto-Regressive Motion Diffusion Models
ACM Transactions on Graphics (TOG), 2023
Yi Shi
Jingbo Wang
Xuekun Jiang
Bingkun Lin
Bo Dai
Xue Bin Peng
DiffM
AI4CE
314
41
0
01 Jun 2023
CapText: Large Language Model-based Caption Generation From Image Context and Description
Shinjini Ghosh
Sagnik Anupam
VLM
321
4
0
01 Jun 2023
From Pixels to UI Actions: Learning to Follow Instructions via Graphical User Interfaces
Neural Information Processing Systems (NeurIPS), 2023
Peter Shaw
Mandar Joshi
James Cohan
Jonathan Berant
Panupong Pasupat
Hexiang Hu
Urvashi Khandelwal
Kenton Lee
Kristina Toutanova
LLMAG
LM&Ro
263
75
0
31 May 2023
Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Paul Roit
Johan Ferret
Lior Shani
Roee Aharoni
Geoffrey Cideron
...
Olivier Bachem
G. Elidan
Avinatan Hassidim
Olivier Pietquin
Idan Szpektor
HILM
289
100
0
31 May 2023
Adaptive Coordination in Social Embodied Rearrangement
International Conference on Machine Learning (ICML), 2023
Andrew Szot
Unnat Jain
Dhruv Batra
Z. Kira
Ruta Desai
Akshara Rai
221
18
0
31 May 2023
Efficient Diffusion Policies for Offline Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
Bingyi Kang
Xiao Ma
Chao Du
Tianyu Pang
Shuicheng Yan
OffRL
354
117
0
31 May 2023
Latent Exploration for Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
A. Chiappa
Alessandro Marin Vargas
Ann Zixiang Huang
Alexander Mathis
321
27
0
31 May 2023
Scalable Learning of Latent Language Structure With Logical Offline Cycle Consistency
Mayank Agarwal
Ramón Fernández Astudillo
Tahira Naseem
Subhajit Chaudhury
Pavan Kapanipathi
Salim Roukos
Alexander G. Gray
OffRL
184
0
0
31 May 2023
Lottery Tickets in Evolutionary Optimization: On Sparse Backpropagation-Free Trainability
R. T. Lange
Henning Sprekeler
178
2
0
31 May 2023