arXiv: 1805.09801
Meta-Gradient Reinforcement Learning

24 May 2018
Zhongwen Xu
H. V. Hasselt
David Silver

Papers citing "Meta-Gradient Reinforcement Learning"

50 / 203 papers shown
Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning
Shangtong Zhang
Bo Liu
Shimon Whiteson
29
38
0
22 Apr 2020
Meta-Learning in Neural Networks: A Survey
Timothy M. Hospedales
Antreas Antoniou
P. Micaelli
Amos Storkey
OOD
61
1,935
0
11 Apr 2020
Online Meta-Learning for Multi-Source and Semi-Supervised Domain Adaptation
Da Li
Timothy M. Hospedales
22
102
0
09 Apr 2020
Work in Progress: Temporally Extended Auxiliary Tasks
Craig Sherstan
Bilal Kartal
Pablo Hernandez-Leal
Matthew E. Taylor
10
1
0
01 Apr 2020
Agent57: Outperforming the Atari Human Benchmark
Adria Puigdomenech Badia
Bilal Piot
Steven Kapturowski
Pablo Sprechmann
Alex Vitvitskyi
Daniel Guo
Charles Blundell
OffRL
18
509
0
30 Mar 2020
Online Meta-Critic Learning for Off-Policy Actor-Critic Methods
Wei Zhou
Yiying Li
Yongxin Yang
Huaimin Wang
Timothy M. Hospedales
OffRL
30
46
0
11 Mar 2020
Meta-learning curiosity algorithms
Ferran Alet
Martin Schneider
Tomas Lozano-Perez
L. Kaelbling
25
63
0
11 Mar 2020
Finding online neural update rules by learning to remember
Karol Gregor
CLL
39
268
0
06 Mar 2020
A Self-Tuning Actor-Critic Algorithm
Tom Zahavy
Zhongwen Xu
Vivek Veeriah
Matteo Hessel
Junhyuk Oh
H. V. Hasselt
David Silver
Satinder Singh
18
13
0
28 Feb 2020
Never Give Up: Learning Directed Exploration Strategies
Adria Puigdomenech Badia
Pablo Sprechmann
Alex Vitvitskyi
Daniel Guo
Bilal Piot
...
O. Tieleman
Martín Arjovsky
Alexander Pritzel
Andrew Bolt
Charles Blundell
23
290
0
14 Feb 2020
HMRL: Hyper-Meta Learning for Sparse Reward Reinforcement Learning Problem
Yun Hua
Xiangfeng Wang
Bo Jin
Wenhao Li
Junchi Yan
Xiaofeng He
H. Zha
OffRL
13
9
0
11 Feb 2020
Reward Tweaking: Maximizing the Total Reward While Planning for Short Horizons
Chen Tessler
Shie Mannor
25
2
0
09 Feb 2020
Deep Radial-Basis Value Functions for Continuous Control
Kavosh Asadi
Neev Parikh
Ronald E. Parr
George Konidaris
Michael L. Littman
15
4
0
05 Feb 2020
Chameleon: Adaptive Code Optimization for Expedited Deep Neural Network Compilation
Byung Hoon Ahn
Prannoy Pilligundla
Amir Yazdanbakhsh
H. Esmaeilzadeh
ODL
61
80
0
23 Jan 2020
How Should an Agent Practice?
Janarthanan Rajendran
Richard L. Lewis
Vivek Veeriah
Honglak Lee
Satinder Singh
26
9
0
15 Dec 2019
Adapting Behaviour for Learning Progress
Tom Schaul
Diana Borsa
David Ding
David Szepesvari
Georg Ostrovski
Will Dabney
Simon Osindero
14
18
0
14 Dec 2019
What Can Learned Intrinsic Rewards Capture?
Zeyu Zheng
Junhyuk Oh
Matteo Hessel
Zhongwen Xu
M. Kroiss
H. V. Hasselt
David Silver
Satinder Singh
23
77
0
11 Dec 2019
BADGER: Learning to (Learn [Learning Algorithms] through Multi-Agent Communication)
Marek Rosa
O. Afanasjeva
Simon Andersson
Joseph Davidson
N. Guttenberg
Petr Hlubucek
Martin Poliak
Jaroslav Vítků
Jan Feyereisl
22
10
0
03 Dec 2019
Gamma-Nets: Generalizing Value Estimation over Timescale
Craig Sherstan
Shibhansh Dohare
J. MacGlashan
J. Günther
P. Pilarski
19
12
0
18 Nov 2019
Context-aware Active Multi-Step Reinforcement Learning
Gang Chen
Dingcheng Li
Ran Xu
6
0
0
11 Nov 2019
When MAML Can Adapt Fast and How to Assist When It Cannot
Sébastien M. R. Arnold
Shariq Iqbal
Fei Sha
17
5
0
30 Oct 2019
Meta Matrix Factorization for Federated Rating Predictions
Yujie Lin
Pengjie Ren
Zhumin Chen
Z. Ren
Dongxiao Yu
Jun Ma
Maarten de Rijke
Xiuzhen Cheng
20
115
0
22 Oct 2019
Improving Generalization in Meta Reinforcement Learning using Learned Objectives
Louis Kirsch
Sjoerd van Steenkiste
Jürgen Schmidhuber
OffRL
14
118
0
09 Oct 2019
Off-Policy Actor-Critic with Shared Experience Replay
Simon Schmitt
Matteo Hessel
Karen Simonyan
OffRL
27
68
0
25 Sep 2019
MDP Playground: An Analysis and Debug Testbed for Reinforcement Learning
Raghunandan Rajan
Jessica Lizeth Borja Diaz
Suresh Guttikonda
Fabio Ferreira
André Biedenkapp
Jan Ole von Hartz
Frank Hutter
33
3
0
17 Sep 2019
Discovery of Useful Questions as Auxiliary Tasks
Vivek Veeriah
Matteo Hessel
Zhongwen Xu
Richard L. Lewis
Janarthanan Rajendran
Junhyuk Oh
H. V. Hasselt
David Silver
Satinder Singh
LLMAG
14
86
0
10 Sep 2019
Parameterized Exploration
Jesse Clifton
Lili Wu
E. Laber
34
0
0
13 Jul 2019
General non-linear Bellman equations
H. V. Hasselt
John Quan
Matteo Hessel
Zhongwen Xu
Diana Borsa
André Barreto
16
14
0
08 Jul 2019
On Inductive Biases in Deep Reinforcement Learning
Matteo Hessel
H. V. Hasselt
Joseph Modayil
David Silver
AI4CE
25
41
0
05 Jul 2019
Hyp-RL : Hyperparameter Optimization by Reinforcement Learning
H. Jomaa
Josif Grabocka
Lars Schmidt-Thieme
25
65
0
27 Jun 2019
Experience Replay Optimization
Daochen Zha
Kwei-Herng Lai
Kaixiong Zhou
Xia Hu
OffRL
11
102
0
19 Jun 2019
Meta-Learning via Learned Loss
Sarah Bechtle
Artem Molchanov
Yevgen Chebotar
Edward Grefenstette
Ludovic Righetti
Gaurav Sukhatme
Franziska Meier
20
110
0
12 Jun 2019
Lifelong Learning with a Changing Action Set
Yash Chandak
Georgios Theocharous
Chris Nota
Philip S. Thomas
CLL
OffRL
11
29
0
05 Jun 2019
Reinforcement Learning and Adaptive Sampling for Optimized DNN Compilation
Byung Hoon Ahn
Prannoy Pilligundla
H. Esmaeilzadeh
13
20
0
30 May 2019
Beyond Exponentially Discounted Sum: Automatic Learning of Return Function
Yufei Wang
Qiwei Ye
Tie-Yan Liu
OffRL
9
15
0
28 May 2019
Learning Efficient and Effective Exploration Policies with Counterfactual Meta Policy
Ruihan Yang
Qiwei Ye
Tie-Yan Liu
20
0
0
28 May 2019
Meta Reinforcement Learning with Task Embedding and Shared Policy
Lin Lan
Zhenguo Li
X. Guan
P. Wang
OffRL
8
50
0
16 May 2019
Meta-learning of Sequential Strategies
Pedro A. Ortega
Jane X. Wang
Mark Rowland
Tim Genewein
Z. Kurth-Nelson
...
Yee Whye Teh
H. V. Hasselt
Nando de Freitas
M. Botvinick
Shane Legg
OffRL
25
96
0
08 May 2019
Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables
Kate Rakelly
Aurick Zhou
Deirdre Quillen
Chelsea Finn
Sergey Levine
OffRL
36
647
0
19 Mar 2019
Learning Feature Relevance Through Step Size Adaptation in Temporal-Difference Learning
Alex Kearney
Vivek Veeriah
Jaden B. Travnik
P. Pilarski
R. Sutton
OOD
9
13
0
08 Mar 2019
Learning to Generalize from Sparse and Underspecified Rewards
Rishabh Agarwal
Chen Liang
Dale Schuurmans
Mohammad Norouzi
OffRL
49
97
0
19 Feb 2019
Hyperbolic Discounting and Learning over Multiple Horizons
W. Fedus
Carles Gelada
Yoshua Bengio
Marc G. Bellemare
Hugo Larochelle
29
105
0
19 Feb 2019
Fast Efficient Hyperparameter Tuning for Policy Gradients
Supratik Paul
Vitaly Kurin
Shimon Whiteson
22
32
0
18 Feb 2019
Separating value functions across time-scales
Joshua Romoff
Peter Henderson
Ahmed Touati
Emma Brunskill
Joelle Pineau
Yann Ollivier
17
25
0
05 Feb 2019
Feature-Critic Networks for Heterogeneous Domain Generalization
Yiying Li
Yongxin Yang
Wei Zhou
Timothy M. Hospedales
OOD
25
253
0
31 Jan 2019
SNAS: Stochastic Neural Architecture Search
Sirui Xie
Hehui Zheng
Chunxiao Liu
Liang Lin
11
931
0
24 Dec 2018
Continual Match Based Training in Pommerman: Technical Report
Peng Peng
Liang Pang
Yufeng Yuan
Chao Gao
CLL
6
12
0
18 Dec 2018
ProMP: Proximal Meta-Policy Search
Jonas Rothfuss
Dennis Lee
I. Clavera
Tamim Asfour
Pieter Abbeel
27
209
0
16 Oct 2018
Deep Reinforcement Learning
Yuxi Li
VLM
OffRL
28
144
0
15 Oct 2018
RUDDER: Return Decomposition for Delayed Rewards
Jose A. Arjona-Medina
Michael Gillhofer
Michael Widrich
Thomas Unterthiner
Johannes Brandstetter
Sepp Hochreiter
30
212
0
20 Jun 2018