Deep Reinforcement Learning that Matters

19 September 2017
Peter Henderson
Riashat Islam
Philip Bachman
Joelle Pineau
Doina Precup
D. Meger
    OffRL

Papers citing "Deep Reinforcement Learning that Matters"

50 / 316 papers shown
Generalization, Mayhems and Limits in Recurrent Proximal Policy Optimization
Marco Pleines
Matthias Pallasch
F. Zimmer
Mike Preuss
26
13
0
23 May 2022
Asking for Knowledge: Training RL Agents to Query External Knowledge Using Language
Iou-Jen Liu
Xingdi Yuan
Marc-Alexandre Côté
Pierre-Yves Oudeyer
A. Schwing
RALM
19
12
0
12 May 2022
Collaborative Target Search with a Visual Drone Swarm: An Adaptive Curriculum Embedded Multistage Reinforcement Learning Approach
Jiaping Xiao
Phumrapee Pisutsin
Mir Feroskhan
27
15
0
26 Apr 2022
Understanding and Preventing Capacity Loss in Reinforcement Learning
Clare Lyle
Mark Rowland
Will Dabney
CLL
36
109
0
20 Apr 2022
deep-significance - Easy and Meaningful Statistical Significance Testing in the Age of Neural Networks
Dennis Ulmer
Christian Hardmeier
J. Frellsen
48
42
0
14 Apr 2022
MetaMorph: Learning Universal Controllers with Transformers
Agrim Gupta
Linxi Fan
Surya Ganguli
Li Fei-Fei
LM&Ro
11
86
0
22 Mar 2022
RB2: Robotic Manipulation Benchmarking with a Twist
Sudeep Dasari
Jianren Wang
Joyce Hong
Shikhar Bahl
Yixin Lin
...
David Held
Lerrel Pinto
Deepak Pathak
Vikash Kumar
Abhi Gupta
26
27
0
15 Mar 2022
Auto-FedRL: Federated Hyperparameter Optimization for Multi-institutional Medical Image Segmentation
Pengfei Guo
Dong Yang
Ali Hatamizadeh
An Xu
Ziyue Xu
...
F. Patella
Elvira Stellato
G. Carrafiello
Vishal M. Patel
H. Roth
OOD
FedML
25
32
0
12 Mar 2022
Near-optimal Deep Reinforcement Learning Policies from Data for Zone Temperature Control
L. D. Natale
B. Svetozarevic
Philipp Heer
Colin N. Jones
OffRL
AI4CE
24
6
0
10 Mar 2022
Fast and Data Efficient Reinforcement Learning from Pixels via Non-Parametric Value Approximation
Alex Long
Alan Blair
H. V. Hoof
23
3
0
07 Mar 2022
Addressing Randomness in Evaluation Protocols for Out-of-Distribution Detection
Konstantin Kirchheim
Tim Gonschorek
F. Ortmeier
OODD
28
2
0
01 Mar 2022
Machine Learning Empowered Intelligent Data Center Networking: A Survey
Bo-wen Li
Ting Wang
Peng Yang
Mingsong Chen
Shui Yu
Mounir Hamdi
AI4CE
16
4
0
28 Feb 2022
Reinforcement Learning in Practice: Opportunities and Challenges
Yuxi Li
OffRL
36
9
0
23 Feb 2022
Myriad: a real-world testbed to bridge trajectory optimization and deep learning
Nikolaus H. R. Howe
Simon Dufort-Labbé
Nitarshan Rajkumar
Pierre-Luc Bacon
32
5
0
22 Feb 2022
Sequential Bayesian experimental designs via reinforcement learning
Hikaru Asano
OffRL
18
0
0
14 Feb 2022
Autonomous Drone Swarm Navigation and Multi-target Tracking in 3D Environments with Dynamic Obstacles
Suleman Qamar
Dr. Saddam Hussain Khan
Muhammad Arif Arshad
Maryam Qamar
Asifullah Khan
23
16
0
13 Feb 2022
Uncovering Instabilities in Variational-Quantum Deep Q-Networks
Maja Franz
Lucas Wolf
Maniraman Periyasamy
Christian Ufrecht
Daniel D. Scherer
Axel Plinge
Christopher Mutschler
Wolfgang Mauerer
26
29
0
10 Feb 2022
A Ranking Game for Imitation Learning
Harshit S. Sikchi
Akanksha Saran
Wonjoon Goo
S. Niekum
OffRL
22
22
0
07 Feb 2022
Theory-inspired Parameter Control Benchmarks for Dynamic Algorithm Configuration
André Biedenkapp
Nguyen Dang
Martin S. Krejca
Frank Hutter
Carola Doerr
26
8
0
07 Feb 2022
Towards Training Reproducible Deep Learning Models
Boyuan Chen
Mingzhi Wen
Yong Shi
Dayi Lin
Gopi Krishnan Rajbahadur
Zhen Ming (Jack) Jiang
SyDa
15
37
0
04 Feb 2022
Fast Convex Optimization for Two-Layer ReLU Networks: Equivalent Model Classes and Cone Decompositions
Aaron Mishkin
Arda Sahiner
Mert Pilanci
OffRL
77
30
0
02 Feb 2022
Why Should I Trust You, Bellman? The Bellman Error is a Poor Replacement for Value Error
Scott Fujimoto
D. Meger
Doina Precup
Ofir Nachum
S. Gu
30
32
0
28 Jan 2022
Hyperparameter Tuning for Deep Reinforcement Learning Applications
M. Kiran
Melis Ozyildirim
34
22
0
26 Jan 2022
Reproducibility in Learning
R. Impagliazzo
Rex Lei
T. Pitassi
Jessica Sorrell
24
43
0
20 Jan 2022
SmartDet: Context-Aware Dynamic Control of Edge Task Offloading for Mobile Object Detection
Davide Callegaro
Francesco Restuccia
Marco Levorato
21
3
0
11 Jan 2022
Automated Reinforcement Learning (AutoRL): A Survey and Open Problems
Jack Parker-Holder
Raghunandan Rajan
Xingyou Song
André Biedenkapp
Yingjie Miao
...
Vu-Linh Nguyen
Roberto Calandra
Aleksandra Faust
Frank Hutter
Marius Lindauer
AI4CE
33
100
0
11 Jan 2022
Mirror Learning: A Unifying Framework of Policy Optimisation
J. Kuba
Christian Schroeder de Witt
Jakob N. Foerster
23
24
0
07 Jan 2022
Sample Efficient Deep Reinforcement Learning via Uncertainty Estimation
Vincent Mai
Kaustubh Mani
Liam Paull
36
34
0
05 Jan 2022
Multi-Agent Reinforcement Learning via Adaptive Kalman Temporal Difference and Successor Representation
Mohammad Salimibeni
Arash Mohammadi
Parvin Malekzadeh
Konstantinos N. Plataniotis
18
5
0
30 Dec 2021
Parallelized and Randomized Adversarial Imitation Learning for Safety-Critical Self-Driving Vehicles
Won Joon Yun
Myungjae Shin
Soyi Jung
S. Kwon
Joongheon Kim
22
5
0
26 Dec 2021
Benchmarking Safe Deep Reinforcement Learning in Aquatic Navigation
Enrico Marchesini
Davide Corsi
Alessandro Farinelli
16
18
0
16 Dec 2021
CoMPS: Continual Meta Policy Search
Glen Berseth
Zhiwei Zhang
Grace Zhang
Chelsea Finn
Sergey Levine
CLL
OffRL
28
16
0
08 Dec 2021
Pessimistic Model Selection for Offline Deep Reinforcement Learning
Chao-Han Huck Yang
Zhengling Qi
Yifan Cui
Pin-Yu Chen
OffRL
24
4
0
29 Nov 2021
Aggressive Q-Learning with Ensembles: Achieving Both High Sample Efficiency and High Asymptotic Performance
Yanqiu Wu
Xinyue Chen
Che Wang
Yiming Zhang
Keith Ross
OffRL
9
9
0
17 Nov 2021
RLOps: Development Life-cycle of Reinforcement Learning Aided Open RAN
Peizheng Li
Jonathan D. Thomas
Xiaoyang Wang
Ahmed Khalil
A. Ahmad
...
S. Kapoor
Arjun Parekh
A. Doufexi
Arman Shojaeifard
Robert Piechocki
AI4TS
14
37
0
12 Nov 2021
d3rlpy: An Offline Deep Reinforcement Learning Library
Takuma Seno
M. Imai
OffRL
GP
60
100
0
06 Nov 2021
B-Pref: Benchmarking Preference-Based Reinforcement Learning
Kimin Lee
Laura M. Smith
Anca Dragan
Pieter Abbeel
OffRL
40
93
0
04 Nov 2021
Validate on Sim, Detect on Real -- Model Selection for Domain Randomization
Gal Leibovich
Guy Jacob
Shadi Endrawis
Gal Novik
Aviv Tamar
24
7
0
01 Nov 2021
Generalized Proximal Policy Optimization with Sample Reuse
James Queeney
I. Paschalidis
Christos G. Cassandras
OffRL
24
46
0
29 Oct 2021
GrowSpace: Learning How to Shape Plants
Yasmeen Hitti
Ionelia Buzatu
Manuel Del Verme
M. Lefsrud
Florian Golemo
A. Durand
19
2
0
15 Oct 2021
CT-SGAN: Computed Tomography Synthesis GAN
Ahmad Pesaranghader
Yiping Wang
Mohammad Havaei
GAN
MedIm
27
11
0
14 Oct 2021
A Closer Look at Advantage-Filtered Behavioral Cloning in High-Noise Datasets
J. E. Grigsby
Yanjun Qi
OffRL
21
5
0
10 Oct 2021
CLEVA-Compass: A Continual Learning EValuation Assessment Compass to Promote Research Transparency and Comparability
Martin Mundt
Steven Braun
Quentin Delfosse
Kristian Kersting
27
35
0
07 Oct 2021
Offline RL With Resource Constrained Online Deployment
Jayanth Reddy Regatti
A. Deshmukh
Frank Cheng
Young Hun Jung
Abhishek Gupta
Ürün Dogan
OffRL
13
2
0
07 Oct 2021
On The Transferability of Deep-Q Networks
M. Sabatelli
Pierre Geurts
31
2
0
06 Oct 2021
Collective eXplainable AI: Explaining Cooperative Strategies and Agent Contribution in Multiagent Reinforcement Learning with Shapley Values
Alexandre Heuillet
Fabien Couthouis
Natalia Díaz Rodríguez
19
57
0
04 Oct 2021
MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research
Mikayel Samvelyan
Robert Kirk
Vitaly Kurin
Jack Parker-Holder
Minqi Jiang
Eric Hambro
Fabio Petroni
Heinrich Küttler
Edward Grefenstette
Tim Rocktäschel
OffRL
238
89
0
27 Sep 2021
On Bonus-Based Exploration Methods in the Arcade Learning Environment
Adrien Ali Taïga
W. Fedus
Marlos C. Machado
Aaron Courville
Marc G. Bellemare
16
58
0
22 Sep 2021
MEPG: A Minimalist Ensemble Policy Gradient Framework for Deep Reinforcement Learning
Qiang He
Yuxun Qu
Chen Gong
Xinwen Hou
OffRL
16
10
0
22 Sep 2021
Membership Inference Attacks Against Temporally Correlated Data in Deep Reinforcement Learning
Maziar Gomrokchi
Susan Amin
Hossein Aboutalebi
Alexander Wong
Doina Precup
MIACV
AAML
42
3
0
08 Sep 2021