Learning to Explore with Meta-Policy Gradient

International Conference on Machine Learning (ICML), 2018
13 March 2018
Tianbing Xu
Qiang Liu
Bo Pan
Jian Peng

Papers citing "Learning to Explore with Meta-Policy Gradient"

32 / 32 papers shown
Meta-DiffuB: A Contextualized Sequence-to-Sequence Text Diffusion Model with Meta-Exploration
Neural Information Processing Systems (NeurIPS), 2024
Yun-Yen Chuang
Hung-Min Hsu
Kevin Lin
Chen-Sheng Gu
Ling Zhen Li
Ray-I Chang
Hung-yi Lee
DiffM, VLM
247
1
0
17 Oct 2024
Autonomous Driving at Unsignalized Intersections: A Review of Decision-Making Challenges and Reinforcement Learning-Based Solutions
Mohammad K. Al-Sharman
Luc Edes
Bert Sun
Vishal Jayakumar
Mohamed A. Daoud
Derek Rayside
W. Melek
250
5
0
20 Sep 2024
MESA: Cooperative Meta-Exploration in Multi-Agent Learning through Exploiting State-Action Space Structure
Zhicheng Zhang
Yancheng Liang
Yi Wu
Fei Fang
198
2
0
01 May 2024
Optimistic Meta-Gradients
Neural Information Processing Systems (NeurIPS), 2023
Sebastian Flennerhag
Tom Zahavy
Brendan O'Donoghue
Hado van Hasselt
András György
Satinder Singh
246
3
0
09 Jan 2023
A Survey of Exploration Methods in Reinforcement Learning
Susan Amin
Maziar Gomrokchi
Harsh Satija
H. V. Hoof
Doina Precup
OffRL
317
99
0
01 Sep 2021
Learning an Explicit Hyperparameter Prediction Function Conditioned on Tasks
Jun Shu
Deyu Meng
Zongben Xu
303
12
0
06 Jul 2021
Improving Context-Based Meta-Reinforcement Learning with Self-Supervised Trajectory Contrastive Learning
Bernie Wang
Si-ting Xu
Kurt Keutzer
Yang Gao
Bichen Wu
SSL, OffRL
120
8
0
10 Mar 2021
Credit Assignment with Meta-Policy Gradient for Multi-Agent Reinforcement Learning
Jianzhun Shao
Hongchang Zhang
Yuhang Jiang
Shuncheng He
Xiangyang Ji
204
5
0
24 Feb 2021
Rank the Episodes: A Simple Approach for Exploration in Procedurally-Generated Environments
International Conference on Learning Representations (ICLR), 2021
Daochen Zha
Wenye Ma
Lei Yuan
Helen Zhou
Ji Liu
249
47
0
20 Jan 2021
Locally Persistent Exploration in Continuous Control Tasks with Sparse Rewards
International Conference on Machine Learning (ICML), 2020
Susan Amin
Maziar Gomrokchi
Hossein Aboutalebi
Harsh Satija
Doina Precup
175
17
0
26 Dec 2020
Towards Continual Reinforcement Learning: A Review and Perspectives
Journal of Artificial Intelligence Research (JAIR), 2020
Khimya Khetarpal
Matthew D Riemer
Irina Rish
Doina Precup
CLL, OffRL
560
381
0
25 Dec 2020
Temporal Difference Uncertainties as a Signal for Exploration
Sebastian Flennerhag
Jane X. Wang
Pablo Sprechmann
Francesco Visin
Alexandre Galashov
Steven Kapturowski
Diana Borsa
N. Heess
André Barreto
Razvan Pascanu
OffRL
211
16
0
05 Oct 2020
OCEAN: Online Task Inference for Compositional Tasks with Context Adaptation
Hongyu Ren
Yuke Zhu
J. Leskovec
Anima Anandkumar
Animesh Garg
LRM
108
4
0
17 Aug 2020
Dual Policy Distillation
Kwei-Herng Lai
Daochen Zha
Yuening Li
Helen Zhou
OffRL
214
49
0
07 Jun 2020
Meta-Learning in Neural Networks: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020
Timothy M. Hospedales
Antreas Antoniou
P. Micaelli
Amos Storkey
OOD
767
2,421
0
11 Apr 2020
Learning Context-aware Task Reasoning for Efficient Meta-reinforcement Learning
Adaptive Agents and Multi-Agent Systems (AAMAS), 2020
Haozhe Jasper Wang
Jiale Zhou
Xuming He
OffRL, LRM
167
18
0
03 Mar 2020
What Can Learned Intrinsic Rewards Capture?
International Conference on Machine Learning (ICML), 2019
Zeyu Zheng
Junhyuk Oh
Matteo Hessel
Zhongwen Xu
M. Kroiss
H. V. Hasselt
David Silver
Satinder Singh
302
81
0
11 Dec 2019
Context-aware Active Multi-Step Reinforcement Learning
Gang Chen
Dingcheng Li
Ran Xu
120
0
0
11 Nov 2019
Single Episode Policy Transfer in Reinforcement Learning
International Conference on Learning Representations (ICLR), 2019
Jiachen Yang
Brenden K. Petersen
H. Zha
Daniel Faissol
OOD, OffRL
296
38
0
17 Oct 2019
A Review of Robot Learning for Manipulation: Challenges, Representations, and Algorithms
Journal of Machine Learning Research (JMLR), 2019
Oliver Kroemer
S. Niekum
George Konidaris
399
445
0
06 Jul 2019
Experience Replay Optimization
International Joint Conference on Artificial Intelligence (IJCAI), 2019
Daochen Zha
Kwei-Herng Lai
Kaixiong Zhou
Helen Zhou
OffRL
146
116
0
19 Jun 2019
Efficient Exploration via State Marginal Matching
Lisa Lee
Benjamin Eysenbach
Emilio Parisotto
Eric Xing
Sergey Levine
Ruslan Salakhutdinov
358
271
0
12 Jun 2019
Learning Efficient and Effective Exploration Policies with Counterfactual Meta Policy
Ruihan Yang
Qiwei Ye
Tie-Yan Liu
101
0
0
28 May 2019
Meta Reinforcement Learning with Task Embedding and Shared Policy
International Joint Conference on Artificial Intelligence (IJCAI), 2019
Lin Lan
Zhenguo Li
X. Guan
Peijie Wang
OffRL
337
52
0
16 May 2019
Multitask Soft Option Learning
Maximilian Igl
Andrew Gambardella
Jinke He
Nantas Nardelli
N. Siddharth
Wendelin Böhmer
Shimon Whiteson
350
26
0
01 Apr 2019
Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables
International Conference on Machine Learning (ICML), 2019
Kate Rakelly
Aurick Zhou
Deirdre Quillen
Chelsea Finn
Sergey Levine
OffRL
283
746
0
19 Mar 2019
Learning Hierarchical Teaching Policies for Cooperative Agents
Dong-Ki Kim
Miao Liu
Shayegan Omidshafiei
Sebastian Lopez-Cot
Matthew D Riemer
Golnaz Habibi
Gerald Tesauro
Sami Mourad
Murray Campbell
Jonathan P. How
304
7
0
07 Mar 2019
Learning to Generalize from Sparse and Underspecified Rewards
Rishabh Agarwal
Chen Liang
Dale Schuurmans
Mohammad Norouzi
OffRL
477
103
0
19 Feb 2019
Meta-Learning for Contextual Bandit Exploration
Amr Sharaf
Hal Daumé
OffRL
134
14
0
23 Jan 2019
NADPEx: An on-policy temporally consistent exploration method for deep reinforcement learning
Sirui Xie
Junning Huang
Lanxin Lei
Chunxiao Liu
Zheng Ma
Wayne Zhang
Liang Lin
112
9
0
21 Dec 2018
Learning to Learn How to Learn: Self-Adaptive Visual Navigation Using Meta-Learning
Mitchell Wortsman
Kiana Ehsani
Mohammad Rastegari
Ali Farhadi
Roozbeh Mottaghi
SSL
396
248
0
03 Dec 2018
Small Sample Learning in Big Data Era
Jun Shu
Zongben Xu
Deyu Meng
355
78
0
14 Aug 2018