1611.01626
Cited By
Combining policy gradient and Q-learning
5 November 2016
Brendan O'Donoghue
Rémi Munos
Koray Kavukcuoglu
Volodymyr Mnih
OffRL
OnRL
Papers citing
"Combining policy gradient and Q-learning"
50 / 90 papers shown
Title
Maximum Entropy Reinforcement Learning via Energy-Based Normalizing Flow
Chen-Hao Chao
Chien Feng
Wei-Fang Sun
Cheng-Kuang Lee
Simon See
Chun-Yi Lee
41
1
0
22 May 2024
On-Policy Policy Gradient Reinforcement Learning Without On-Policy Sampling
Nicholas Corrado
Josiah P. Hanna
OffRL
20
1
0
14 Nov 2023
Inverse Decision Modeling: Learning Interpretable Representations of Behavior
Daniel Jarrett
Alihan Huyuk
M. Schaar
AI4CE
22
27
0
28 Oct 2023
Soft Decomposed Policy-Critic: Bridging the Gap for Effective Continuous Control with Discrete RL
Ye Zhang
Jian Sun
G. Wang
Zhuoxian Li
Wei Chen
OffRL
21
0
0
20 Aug 2023
Matryoshka Policy Gradient for Entropy-Regularized RL: Convergence and Global Optimality
François Ged
M. H. Veiga
33
0
0
22 Mar 2023
Efficient Exploration via Epistemic-Risk-Seeking Policy Optimization
Brendan O'Donoghue
OffRL
35
6
0
18 Feb 2023
Distillation Policy Optimization
Jianfei Ma
OffRL
26
1
0
01 Feb 2023
Extending Open Bandit Pipeline to Simulate Industry Challenges
Bram van den Akker
N. Weber
Felipe Moraes
Dmitri Goldenberg
OffRL
18
1
0
09 Sep 2022
A Parametric Class of Approximate Gradient Updates for Policy Optimization
Ramki Gummadi
Saurabh Kumar
Junfeng Wen
Dale Schuurmans
26
0
0
17 Jun 2022
Reinforcement Learning for Navigation of Mobile Robot with LiDAR
Inhwan Kim
S. Nengroo
Dongsoo Har
27
13
0
06 Dec 2021
Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning
Nicolai Dorka
Tim Welschehold
Joschka Boedecker
Wolfram Burgard
OffRL
30
9
0
24 Nov 2021
Generalized Proximal Policy Optimization with Sample Reuse
James Queeney
I. Paschalidis
Christos G. Cassandras
OffRL
34
47
0
29 Oct 2021
Variational Bayesian Optimistic Sampling
Brendan O'Donoghue
Tor Lattimore
9
6
0
29 Oct 2021
Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences
Alan Chan
Hugo Silva
Sungsu Lim
Tadashi Kozuno
A. R. Mahmood
Martha White
25
29
0
17 Jul 2021
A Max-Min Entropy Framework for Reinforcement Learning
Seungyul Han
Y. Sung
30
20
0
19 Jun 2021
Probabilistic Mixture-of-Experts for Efficient Deep Reinforcement Learning
Jie Ren
Yewen Li
Zihan Ding
Wei Pan
Hao Dong
BDL
MoE
21
25
0
19 Apr 2021
A Bayesian Approach to Reinforcement Learning of Vision-Based Vehicular Control
Zahra Gharaee
Karl Holmquist
Linbo He
M. Felsberg
BDL
17
4
0
08 Apr 2021
A Survey on Deep Reinforcement Learning for Audio-Based Applications
S. Latif
Heriberto Cuayáhuitl
Farrukh Pervez
Fahad Shamshad
Hafiz Shehbaz Ali
Min Zhang
OffRL
47
73
0
01 Jan 2021
Reinforcement Learning for Robust Missile Autopilot Design
Bernardo Cortez
11
2
0
26 Nov 2020
Weighted Entropy Modification for Soft Actor-Critic
Yizhou Zhao
Song-Chun Zhu
18
0
0
18 Nov 2020
Average-reward model-free reinforcement learning: a systematic review and literature mapping
Vektor Dewanto
George Dunn
A. Eshragh
M. Gallagher
Fred Roosta
14
28
0
18 Oct 2020
Energy-based Surprise Minimization for Multi-Agent Value Factorization
Karush Suri
Xiaolong Shi
Konstantinos Plataniotis
Y. Lawryshyn
18
1
0
16 Sep 2020
Visualizing the Loss Landscape of Actor Critic Methods with Applications in Inventory Optimization
Recep Yusuf Bekci
M. Gümüş
12
4
0
04 Sep 2020
Monte-Carlo Tree Search as Regularized Policy Optimization
Jean-Bastien Grill
Florent Altché
Yunhao Tang
Thomas Hubert
Michal Valko
Ioannis Antonoglou
Rémi Munos
27
73
0
24 Jul 2020
Matrix games with bandit feedback
Brendan O'Donoghue
Tor Lattimore
Ian Osband
8
8
0
09 Jun 2020
Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory
Yufeng Zhang
Qi Cai
Zhuoran Yang
Yongxin Chen
Zhaoran Wang
OOD
MLT
129
11
0
08 Jun 2020
Diversity Actor-Critic: Sample-Aware Entropy Regularization for Sample-Efficient Exploration
Seungyul Han
Y. Sung
8
24
0
02 Jun 2020
Deep Reinforcement Learning for Intelligent Transportation Systems: A Survey
Ammar Haydari
Y. Yilmaz
AI4TS
22
453
0
02 May 2020
First return, then explore
Adrien Ecoffet
Joost Huizinga
Joel Lehman
Kenneth O. Stanley
Jeff Clune
47
350
0
27 Apr 2020
Comprehensive Review of Deep Reinforcement Learning Methods and Applications in Economics
Amir H. Mosavi
Pedram Ghamisi
Yaser Faghan
Puhong Duan
OffRL
19
152
0
21 Mar 2020
Review, Analysis and Design of a Comprehensive Deep Reinforcement Learning Framework
Ngoc Duy Nguyen
Thanh Thi Nguyen
Hai V. Nguyen
Doug Creighton
S. Nahavandi
38
3
0
27 Feb 2020
BRPO: Batch Residual Policy Optimization
Kentaro Kanamori
Yinlam Chow
Takuya Takagi
Hiroki Arimura
Honglak Lee
Ken Kobayashi
Craig Boutilier
OffRL
141
46
0
08 Feb 2020
Distributional Soft Actor-Critic: Off-Policy Reinforcement Learning for Addressing Value Estimation Errors
Jingliang Duan
Yang Guan
Shengbo Eben Li
Yangang Ren
B. Cheng
OffRL
22
174
0
09 Jan 2020
Making Sense of Reinforcement Learning and Probabilistic Inference
Brendan O'Donoghue
Ian Osband
Catalin Ionescu
OffRL
27
48
0
03 Jan 2020
A Survey of Deep Reinforcement Learning in Video Games
Kun Shao
Zhentao Tang
Yuanheng Zhu
Nannan Li
Dongbin Zhao
OffRL
AI4TS
43
188
0
23 Dec 2019
Direct and indirect reinforcement learning
Yang Guan
Shengbo Eben Li
Jingliang Duan
Jie Li
Yangang Ren
Qi Sun
B. Cheng
OffRL
38
34
0
23 Dec 2019
Merging Deterministic Policy Gradient Estimations with Varied Bias-Variance Tradeoff for Effective Deep Reinforcement Learning
Gang Chen
25
4
0
24 Nov 2019
Provably Convergent Two-Timescale Off-Policy Actor-Critic with Function Approximation
Shangtong Zhang
Bo Liu
Hengshuai Yao
Shimon Whiteson
OffRL
23
8
0
11 Nov 2019
Soft Policy Gradient Method for Maximum Entropy Deep Reinforcement Learning
Wenjie Shi
Shiji Song
Cheng Wu
17
36
0
07 Sep 2019
A Convergence Result for Regularized Actor-Critic Methods
Wesley A Suttle
Zhuoran Yang
Kaipeng Zhang
Ji Liu
11
0
0
13 Jul 2019
Ranking Policy Gradient
Kaixiang Lin
Jiayu Zhou
OffRL
11
7
0
24 Jun 2019
Epistemic Risk-Sensitive Reinforcement Learning
Hannes Eriksson
Christos Dimitrakakis
21
29
0
14 Jun 2019
Policy Optimization Provably Converges to Nash Equilibria in Zero-Sum Linear Quadratic Games
Kaipeng Zhang
Zhuoran Yang
Tamer Basar
27
125
0
31 May 2019
P3O: Policy-on Policy-off Policy Optimization
Rasool Fakoor
Pratik Chaudhari
Alex Smola
OffRL
26
51
0
05 May 2019
Similarities between policy gradient methods (PGM) in Reinforcement learning (RL) and supervised learning (SL)
Eric Benhamou
OffRL
16
1
0
12 Apr 2019
Generalized Off-Policy Actor-Critic
Shangtong Zhang
Wendelin Bohmer
Shimon Whiteson
OffRL
CML
16
43
0
27 Mar 2019
Reinforcement Learning Based Text Style Transfer without Parallel Training Corpus
Hongyu Gong
S. Bhat
Lingfei Wu
Jinjun Xiong
Wen-mei W. Hwu
OffRL
34
93
0
26 Mar 2019
Sample-Efficient Model-Free Reinforcement Learning with Off-Policy Critics
Denis Steckelmacher
Hélène Plisnier
D. Roijers
A. Nowé
OffRL
23
17
0
11 Mar 2019
Off-Policy Actor-Critic in an Ensemble: Achieving Maximum General Entropy and Effective Environment Exploration in Deep Reinforcement Learning
Gang Chen
Yiming Peng
14
8
0
14 Feb 2019
A Bandit Framework for Optimal Selection of Reinforcement Learning Agents
A. Merentitis
Kashif Rasul
Roland Vollgraf
Abdul-Saboor Sheikh
Urs M. Bergmann
16
2
0
10 Feb 2019