Distilling Policy Distillation

6 February 2019
Wojciech M. Czarnecki
Razvan Pascanu
Simon Osindero
Siddhant M. Jayakumar
G. Swirszcz
Max Jaderberg

Papers citing "Distilling Policy Distillation"

41 / 41 papers shown
Advantage-Guided Distillation for Preference Alignment in Small Language Models
Shiping Gao
Fanqi Wan
Jiajian Guo
Xiaojun Quan
Qifan Wang
ALM
146
0
0
25 Feb 2025
MiniPLM: Knowledge Distillation for Pre-Training Language Models
Yuxian Gu
Hao Zhou
Fandong Meng
Jie Zhou
Minlie Huang
202
7
0
22 Oct 2024
Online Control-Informed Learning
Zihao Liang
Tianyu Zhou
Zehui Lu
Shaoshuai Mou
122
1
0
04 Oct 2024
TacSL: A Library for Visuotactile Sensor Simulation and Learning
Iretiayo Akinola
Jie Xu
Jan Carius
Dieter Fox
Yashraj S. Narang
129
10
0
12 Aug 2024
Proximal Policy Distillation
Giacomo Spigler
OffRL
82
1
0
21 Jul 2024
A Dual Perspective of Reinforcement Learning for Imposing Policy Constraints
Bram De Cooman
Johan A. K. Suykens
98
0
0
25 Apr 2024
Distilling Morphology-Conditioned Hypernetworks for Efficient Universal Morphology Control
Zheng Xiong
Risto Vuorio
Jacob Beck
Matthieu Zimmer
Kun Shao
Shimon Whiteson
89
1
0
09 Feb 2024
Human-Timescale Adaptation in an Open-Ended Task Space
Adaptive Agent Team
Jakob Bauer
Kate Baumli
Satinder Baveja
Feryal M. P. Behbahani
...
Jakub Sygnowski
K. Tuyls
Sarah York
Alexander Zacherl
Lei Zhang
LM&Ro, OffRL, AI4CE, LRM
139
119
0
18 Jan 2023
AcceRL: Policy Acceleration Framework for Deep Reinforcement Learning
Hongjie Zhang
OffRL
38
0
0
28 Nov 2022
Honor of Kings Arena: an Environment for Generalization in Competitive Reinforcement Learning
Hua Wei
Jingxiao Chen
Xiyang Ji
Hongyang Qin
Minwen Deng
...
Lin Liu
Lanxiao Huang
Deheng Ye
Qiang Fu
Wei Yang
81
31
0
18 Sep 2022
Learning Dynamics and Generalization in Reinforcement Learning
Clare Lyle
Mark Rowland
Will Dabney
Marta Z. Kwiatkowska
Y. Gal
OOD, OffRL
74
13
0
05 Jun 2022
Reincarnating Reinforcement Learning: Reusing Prior Computation to Accelerate Progress
Rishabh Agarwal
Max Schwarzer
Pablo Samuel Castro
Rameswar Panda
Marc G. Bellemare
OffRL, OnRL
126
66
0
03 Jun 2022
Multi-Source Transfer Learning for Deep Model-Based Reinforcement Learning
Remo Sasso
M. Sabatelli
M. Wiering
104
9
0
28 May 2022
Reinforcement Learning in Practice: Opportunities and Challenges
Yuxi Li
OffRL
69
9
0
23 Feb 2022
Retrieval-Augmented Reinforcement Learning
Anirudh Goyal
A. Friesen
Andrea Banino
T. Weber
Nan Rosemary Ke
...
Michal Valko
Simon Osindero
Timothy Lillicrap
N. Heess
Charles Blundell
OffRL
87
55
0
17 Feb 2022
Efficient Policy Space Response Oracles
Ming Zhou
Jingxiao Chen
Ying Wen
Weinan Zhang
Yaodong Yang
Yong Yu
Jun Wang
136
10
0
28 Jan 2022
Learning robust perceptive locomotion for quadrupedal robots in the wild
Takahiro Miki
Joonho Lee
Jemin Hwangbo
Lorenz Wellhausen
V. Koltun
Marco Hutter
137
716
0
20 Jan 2022
Robot Learning from Randomized Simulations: A Review
Fabio Muratore
Fabio Ramos
Greg Turk
Wenhao Yu
Michael Gienger
Jan Peters
AI4CE
119
83
0
01 Nov 2021
Offline RL With Resource Constrained Online Deployment
Jayanth Reddy Regatti
A. Deshmukh
Frank Cheng
Young Hun Jung
Abhishek Gupta
Ürün Dogan
OffRL
74
2
0
07 Oct 2021
DCUR: Data Curriculum for Teaching via Samples with Reinforcement Learning
Daniel Seita
Abhinav Gopal
Zhao Mandi
John F. Canny
OffRL, OnRL
39
0
0
15 Sep 2021
Open-Ended Learning Leads to Generally Capable Agents
Open-Ended Learning Team
Adam Stooke
Anuj Mahajan
Catarina Barros
Charlie Deck
...
Nicolas Porcel
Roberta Raileanu
Steph Hughes-Fitt
Valentin Dalibard
Wojciech M. Czarnecki
124
190
0
27 Jul 2021
SECANT: Self-Expert Cloning for Zero-Shot Generalization of Visual Policies
Linxi Fan
Guanzhi Wang
De-An Huang
Zhiding Yu
Li Fei-Fei
Yuke Zhu
Anima Anandkumar
OffRL
139
65
0
17 Jun 2021
Pattern Transfer Learning for Reinforcement Learning in Order Dispatching
Runzhe Wan
Sheng Zhang
C. Shi
Shuang Luo
R. Song
AI4TS
60
3
0
27 May 2021
Spectral Normalisation for Deep Reinforcement Learning: an Optimisation Perspective
Florin Gogianu
Tudor Berariu
Mihaela Rosca
Claudia Clopath
L. Buşoniu
Razvan Pascanu
86
56
0
11 May 2021
Human-Inspired Multi-Agent Navigation using Knowledge Distillation
Pei Xu
Ioannis Karamouzas
125
19
0
18 Mar 2021
Auto-Agent-Distiller: Towards Efficient Deep Reinforcement Learning Agents via Neural Architecture Search
Y. Fu
Zhongzhi Yu
Yongan Zhang
Yingyan Lin
83
4
0
24 Dec 2020
Robust Domain Randomised Reinforcement Learning through Peer-to-Peer Distillation
Chenyang Zhao
Timothy M. Hospedales
OOD
49
16
0
09 Dec 2020
Towards Playing Full MOBA Games with Deep Reinforcement Learning
Deheng Ye
Guibin Chen
Wen Zhang
Sheng Chen
Bo Yuan
...
Tengfei Shi
Qiang Fu
Wei Yang
Lanxiao Huang
Wei Liu
95
188
0
25 Nov 2020
Supervised Learning Achieves Human-Level Performance in MOBA Games: A Case Study of Honor of Kings
Deheng Ye
Guibin Chen
P. Zhao
Fuhao Qiu
Bo Yuan
...
Liang Wang
Tengfei Shi
Qiang Fu
Wei Yang
Lanxiao Huang
86
50
0
25 Nov 2020
Meta Automatic Curriculum Learning
Rémy Portelas
Clément Romac
Katja Hofmann
Pierre-Yves Oudeyer
66
8
0
16 Nov 2020
Knowledge Transfer in Multi-Task Deep Reinforcement Learning for Continuous Control
Zhiyuan Xu
Kun Wu
Zhengping Che
Jian Tang
Jieping Ye
CLL, OffRL
92
49
0
15 Oct 2020
Transfer Learning in Deep Reinforcement Learning: A Survey
Zhuangdi Zhu
Kaixiang Lin
Anil K. Jain
Jiayu Zhou
OffRL, LRM
123
602
0
16 Sep 2020
Transient Non-Stationarity and Generalisation in Deep Reinforcement Learning
Maximilian Igl
Gregory Farquhar
Jelena Luketina
Wendelin Boehmer
Shimon Whiteson
130
88
0
10 Jun 2020
Preventing Posterior Collapse with Levenshtein Variational Autoencoder
Serhii Havrylov
Ivan Titov
DRL
30
19
0
30 Apr 2020
Adaptive Partial Scanning Transmission Electron Microscopy with Reinforcement Learning
Jeffrey M. Ede
110
13
0
06 Apr 2020
Automatic Curriculum Learning For Deep RL: A Short Survey
Rémy Portelas
Cédric Colas
Lilian Weng
Katja Hofmann
Pierre-Yves Oudeyer
ODL
114
176
0
10 Mar 2020
Challenges and Countermeasures for Adversarial Attacks on Deep Reinforcement Learning
Inaam Ilahi
Muhammad Usama
Junaid Qadir
M. Janjua
Ala I. Al-Fuqaha
D. Hoang
Dusit Niyato
AAML
145
136
0
27 Jan 2020
Gradient Surgery for Multi-Task Learning
Tianhe Yu
Saurabh Kumar
Abhishek Gupta
Sergey Levine
Karol Hausman
Chelsea Finn
197
1,235
0
19 Jan 2020
Joint Goal and Strategy Inference across Heterogeneous Demonstrators via Reward Network Distillation
Letian Chen
Rohan R. Paleja
Muyleng Ghuy
Matthew C. Gombolay
109
39
0
02 Jan 2020
Discrete and Continuous Action Representation for Practical RL in Video Games
Olivier Delalleau
Maxim Peter
Eloi Alonso
Adrien Logut
79
53
0
23 Dec 2019
Solving Rubik's Cube with a Robot Hand
OpenAI
Ilge Akkaya
Marcin Andrychowicz
Maciek Chociej
Mateusz Litwin
...
Peter Welinder
Lilian Weng
Qiming Yuan
Wojciech Zaremba
Lei Zhang
ODL
126
1,235
0
16 Oct 2019