An Active Learning Framework for Efficient Robust Policy Search

1 January 2019
Sai Kiran Narayanaswami
N. Sudarsanam
Balaraman Ravindran
arXiv: 1901.00117

Papers citing "An Active Learning Framework for Efficient Robust Policy Search"

21 / 21 papers shown
BayesSim: adaptive domain randomization via probabilistic inference for robotics simulators
F. Ramos
Rafael Possas
Dieter Fox
22
156
0
04 Jun 2019
Learning Robust Options by Conditional Value at Risk Optimization
Takuya Hiraoka
Takahisa Imagawa
Tatsuya Mori
Takashi Onishi
Yoshimasa Tsuruoka
15
27
0
22 May 2019
Bayesian Policy Optimization for Model Uncertainty
Gilwoo Lee
Brian Hou
Aditya Mandalika
Jeongseok Lee
Sanjiban Choudhury
S. Srinivasa
66
41
0
01 Oct 2018
Model-Ensemble Trust-Region Policy Optimization
Thanard Kurutach
I. Clavera
Yan Duan
Aviv Tamar
Pieter Abbeel
27
450
0
28 Feb 2018
Sim-to-Real Transfer of Robotic Control with Dynamics Randomization
Xue Bin Peng
Marcin Andrychowicz
Wojciech Zaremba
Pieter Abbeel
76
1,355
0
18 Oct 2017
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
174
18,562
0
20 Jul 2017
Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World
Joshua Tobin
Rachel Fong
Alex Ray
Jonas Schneider
Wojciech Zaremba
Pieter Abbeel
77
2,948
0
20 Mar 2017
Robust Adversarial Reinforcement Learning
Lerrel Pinto
James Davidson
Rahul Sukthankar
Abhinav Gupta
OOD
65
848
0
08 Mar 2017
Preparing for the Unknown: Learning a Universal Policy with Online System Identification
Wenhao Yu
Jie Tan
Chenxi Liu
Greg Turk
OffRL
49
306
0
08 Feb 2017
Sample Efficient Actor-Critic with Experience Replay
Ziyun Wang
V. Bapst
N. Heess
Volodymyr Mnih
Rémi Munos
Koray Kavukcuoglu
Nando de Freitas
57
757
0
03 Nov 2016
EPOpt: Learning Robust Neural Network Policies Using Model Ensembles
Aravind Rajeswaran
Sarvjeet Ghotra
Balaraman Ravindran
Sergey Levine
46
349
0
05 Oct 2016
Progressive Neural Networks
Andrei A. Rusu
Neil C. Rabinowitz
Guillaume Desjardins
Hubert Soyer
J. Kirkpatrick
Koray Kavukcuoglu
Razvan Pascanu
R. Hadsell
CLL
AI4CE
21
2,428
0
15 Jun 2016
Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning
Emilio Parisotto
Jimmy Lei Ba
Ruslan Salakhutdinov
OffRL
46
593
0
19 Nov 2015
Policy Distillation
Andrei A. Rusu
Sergio Gomez Colmenarejo
Çağlar Gülçehre
Guillaume Desjardins
J. Kirkpatrick
Razvan Pascanu
Volodymyr Mnih
Koray Kavukcuoglu
R. Hadsell
31
685
0
19 Nov 2015
Continuous control with deep reinforcement learning
Timothy Lillicrap
Jonathan J. Hunt
Alexander Pritzel
N. Heess
Tom Erez
Yuval Tassa
David Silver
Daan Wierstra
97
13,174
0
09 Sep 2015
Upper-Confidence-Bound Algorithms for Active Learning in Multi-Armed Bandits
Alexandra Carpentier
A. Lazaric
Mohammad Ghavamzadeh
Rémi Munos
P. Auer
András Antos
15
98
0
16 Jul 2015
High-Dimensional Continuous Control Using Generalized Advantage Estimation
John Schulman
Philipp Moritz
Sergey Levine
Michael I. Jordan
Pieter Abbeel
OffRL
20
3,350
0
08 Jun 2015
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
203
6,722
0
19 Feb 2015
Optimizing the CVaR via Sampling
Aviv Tamar
Yonatan Glassner
Shie Mannor
31
186
0
15 Apr 2014
Thompson Sampling for Contextual Bandits with Linear Payoffs
Shipra Agrawal
Navin Goyal
57
993
0
15 Sep 2012
A Contextual-Bandit Approach to Personalized News Article Recommendation
Lihong Li
Wei Chu
John Langford
Robert Schapire
105
2,935
0
28 Feb 2010