Dropout Q-Functions for Doubly Efficient Reinforcement Learning

5 October 2021
Takuya Hiraoka
Takahisa Imagawa
Taisei Hashimoto
Takashi Onishi
Yoshimasa Tsuruoka
arXiv: 2110.02034

Papers citing "Dropout Q-Functions for Doubly Efficient Reinforcement Learning"

50 / 82 papers shown
Title
Dexterous Robotic Piano Playing at Scale
Le Chen
Yi Zhao
Jan Schneider
Quankai Gao
Simon Guist
Cheng Qian
Juho Kannala
Bernhard Schölkopf
Joni Pajarinen
Dieter Büchler
144
0
0
04 Nov 2025
XQC: Well-conditioned Optimization Accelerates Deep Reinforcement Learning
Daniel Palenicek
Florian Vogt
Joe Watson
Ingmar Posner
Jan Peters
124
0
0
29 Sep 2025
An Investigation of Batch Normalization in Off-Policy Actor-Critic Algorithms
Li Wang
Sudun
X. Zhang
Wenjun Wu
Lei Huang
OffRL
132
0
0
28 Sep 2025
Solving Robotics Tasks with Prior Demonstration via Exploration-Efficient Deep Reinforcement Learning
Chengyandan Shen
Christoffer Sloth
OffRL
92
0
0
04 Sep 2025
A Tutorial: An Intuitive Explanation of Offline Reinforcement Learning Theory
Fengdi Che
OffRL
108
0
0
11 Aug 2025
Scaling DRL for Decision Making: A Survey on Data, Network, and Training Budget Strategies
Yi Ma
Hongyao Tang
Chenjun Xiao
Yaodong Yang
Wei Wei
Jianye Hao
Jiye Liang
OffRL
164
0
0
05 Aug 2025
Scaling Algorithm Distillation for Continuous Control with Mamba
Samuel Beaussant
Mehdi Mounsif
155
0
0
16 Jun 2025
Scaling CrossQ with Weight Normalization
Daniel Palenicek
Florian Vogt
Jan Peters
247
1
0
04 Jun 2025
Growable and Interpretable Neural Control with Online Continual Learning for Autonomous Lifelong Locomotion Learning Machines
The International Journal of Robotics Research (IJRR), 2025
Arthicha Srisuchinnawong
Poramate Manoonpong
CLL, LRM
266
3
0
17 May 2025
Moderate Actor-Critic Methods: Controlling Overestimation Bias via Expectile Loss
Ukjo Hwang
Songnam Hong
OffRL
184
0
0
14 Apr 2025
Learning to Play Piano in the Real World
Yves-Simon Zeulner
Sandeep Selvaraj
Roberto Calandra
241
1
0
19 Mar 2025
Gait in Eight: Efficient On-Robot Learning for Omnidirectional Quadruped Locomotion
Nico Bohlinger
Jonathan Kinzel
Daniel Palenicek
Lukasz Antczak
Jan Peters
227
6
0
11 Mar 2025
Performance Comparisons of Reinforcement Learning Algorithms for Sequential Experimental Design
Yasir Zubayr Barlas
Kizito Salako
185
1
0
07 Mar 2025
Hyperspherical Normalization for Scalable Deep Reinforcement Learning
Hojoon Lee
Youngdo Lee
Takuma Seno
Donghu Kim
Peter Stone
Jaegul Choo
402
9
0
21 Feb 2025
Rapidly Adapting Policies to the Real World via Simulation-Guided Fine-Tuning
International Conference on Learning Representations (ICLR), 2025
Patrick Yin
Tyler Westenbroek
Simran Bagaria
Kevin Huang
Ching-an Cheng
Andrey Kolobov
Abhishek Gupta
375
8
0
04 Feb 2025
Offline-to-online Reinforcement Learning for Image-based Grasping with Scarce Demonstrations
Bryan Chan
Anson Leung
James Bergstra
OffRL, OnRL
422
2
0
19 Oct 2024
Traversability-Aware Legged Navigation by Learning from Real-World Visual Data
Hongbo Zhang
Zhongyu Li
Xuanqi Zeng
Laura Smith
Kyle Stachowicz
...
Zhitao Song
Weipeng Xia
Sergey Levine
Koushil Sreenath
Yao Xiao
228
4
0
14 Oct 2024
Reinforcement Learning For Quadrupedal Locomotion: Current Advancements And Future Perspectives
Maurya Gurram
Prakash Kumar Uttam
Shantipal S. Ohol
OffRL
316
1
0
14 Oct 2024
SimBa: Simplicity Bias for Scaling Up Parameters in Deep Reinforcement Learning
International Conference on Learning Representations (ICLR), 2024
Hojoon Lee
Dongyoon Hwang
Donghu Kim
Hyunseung Kim
Jun Jet Tai
K. Subramanian
Peter R. Wurman
Jaegul Choo
Peter Stone
Takuma Seno
OffRL
383
39
0
13 Oct 2024
Learning to Walk from Three Minutes of Real-World Data with Semi-structured Dynamics Models
Conference on Robot Learning (CoRL), 2024
Jacob Levy
T. Westenbroek
David Fridovich-Keil
288
14
0
11 Oct 2024
Kaleidoscope: Learnable Masks for Heterogeneous Multi-agent Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2024
Xinran Li
Ling Pan
Jun Zhang
190
5
0
11 Oct 2024
MAD-TD: Model-Augmented Data stabilizes High Update Ratio RL
International Conference on Learning Representations (ICLR), 2024
C. Voelcker
Marcel Hussing
Eric Eaton
Amir-massoud Farahmand
Igor Gilitschenski
368
10
0
11 Oct 2024
The Role of Deep Learning Regularizations on Actors in Offline RL
Denis Tarasov
Anja Surina
Çağlar Gülçehre
OffRL, AI4CE
306
2
0
11 Sep 2024
RP1M: A Large-Scale Motion Dataset for Piano Playing with Bi-Manual Dexterous Robot Hands
Conference on Robot Learning (CoRL), 2024
Yi Zhao
Le Chen
Jan Schneider
Quankai Gao
Arno Solin
Bernhard Schölkopf
Joni Pajarinen
Le Chen
164
6
0
20 Aug 2024
Scenario-based Thermal Management Parametrization Through Deep Reinforcement Learning
Thomas Rudolf
Philip Muhl
Sören Hohmann
Lutz Eckstein
182
2
0
04 Aug 2024
HiLMa-Res: A General Hierarchical Framework via Residual RL for Combining Quadrupedal Locomotion and Manipulation
Xiaoyu Huang
Qiayuan Liao
Yiming Ni
Zhongyu Li
Laura Smith
Sergey Levine
Xue Bin Peng
Koushil Sreenath
183
7
0
09 Jul 2024
Augmented Bayesian Policy Search
Mahdi Kallel
Debabrota Basu
R. Akrour
Carlo D'Eramo
169
4
0
05 Jul 2024
BricksRL: A Platform for Democratizing Robotics and Reinforcement Learning Research and Education with LEGO
Sebastian Dittert
Vincent Moens
Gianni De Fabritiis
216
1
0
25 Jun 2024
Learning-based legged locomotion; state of the art and future perspectives
Sehoon Ha
Joonho Lee
M. van de Panne
Zhaoming Xie
Wenhao Yu
Majid Khadiv
284
22
0
03 Jun 2024
OMPO: A Unified Framework for RL under Policy and Dynamics Shifts
Yu-Juan Luo
Tianying Ji
Gang Hua
Jianwei Zhang
Huazhe Xu
Xianyuan Zhan
OffRL
268
4
0
29 May 2024
Oracle-Efficient Reinforcement Learning for Max Value Ensembles
Marcel Hussing
Michael Kearns
Aaron Roth
S. B. Sengupta
Jessica Sorrell
182
0
0
27 May 2024
Bigger, Regularized, Optimistic: scaling for compute and sample-efficient continuous control
Michal Nauman
M. Ostaszewski
Krzysztof Jankowski
Piotr Miłoś
Marek Cygan
OffRL
228
61
0
25 May 2024
Which Experiences Are Influential for RL Agents? Efficiently Estimating The Influence of Experiences
Takuya Hiraoka
Guanquan Wang
Takashi Onishi
Yoshimasa Tsuruoka
234
0
0
23 May 2024
An Efficient Learning Control Framework With Sim-to-Real for String-Type Artificial Muscle-Driven Robotic Systems
IEEE/ASME Transactions on Mechatronics (TAM), 2024
Jiyue Tao
Yunsong Zhang
Sunil Kumar Rajendran
Feitian Zhang
424
0
0
17 May 2024
Smart Sampling: Self-Attention and Bootstrapping for Improved Ensembled Q-Learning
The Florida AI Research Society (FLAIRS), 2024
M. Khan
Syed Hammad Ahmed
G. Sukthankar
152
0
0
14 May 2024
AFU: Actor-Free critic Updates in off-policy RL for continuous control
Nicolas Perrin-Gilbert
OffRL
244
0
0
24 Apr 2024
Rank2Reward: Learning Shaped Reward Functions from Passive Video
Daniel Yang
Davin Tjia
Jacob Berg
Dima Damen
Pulkit Agrawal
Abhishek Gupta
OffRL
171
14
0
23 Apr 2024
Diverse Randomized Value Functions: A Provably Pessimistic Approach for Offline Reinforcement Learning
Xudong Yu
Chenjia Bai
Hongyi Guo
Changhong Wang
Zhen Wang
OffRL
273
0
0
09 Apr 2024
Learning Off-policy with Model-based Intrinsic Motivation For Active Online Exploration
Yibo Wang
Jiang Zhao
OffRL, OnRL
199
0
0
31 Mar 2024
Symmetric Q-learning: Reducing Skewness of Bellman Error in Online Reinforcement Learning
AAAI Conference on Artificial Intelligence (AAAI), 2024
Motoki Omura
Takayuki Osa
Yusuke Mukuta
Tatsuya Harada
OffRL
142
1
0
12 Mar 2024
Dissecting Deep RL with High Update Ratios: Combatting Value Divergence
Marcel Hussing
C. Voelcker
Igor Gilitschenski
Amir-massoud Farahmand
Eric Eaton
301
11
0
09 Mar 2024
A Case for Validation Buffer in Pessimistic Actor-Critic
Michal Nauman
M. Ostaszewski
Marek Cygan
214
0
0
01 Mar 2024
Overestimation, Overfitting, and Plasticity in Actor-Critic: the Bitter Lesson of Reinforcement Learning
Michal Nauman
Michal Bortkiewicz
Piotr Miłoś
Tomasz Trzciński
M. Ostaszewski
Marek Cygan
OffRL
293
40
0
01 Mar 2024
In value-based deep reinforcement learning, a pruned network is a good network
J. Obando-Ceron
Rameswar Panda
Pablo Samuel Castro
OffRL
446
31
0
19 Feb 2024
Bridging Evolutionary Algorithms and Reinforcement Learning: A Comprehensive Survey on Hybrid Algorithms
IEEE Transactions on Evolutionary Computation (IEEE Trans. Evol. Comput.), 2024
Pengyi Li
Jianye Hao
Hongyao Tang
Xian Fu
Yan Zheng
Ke Tang
279
42
0
22 Jan 2024
ReACT: Reinforcement Learning for Controller Parametrization using B-Spline Geometries
IEEE International Conference on Systems, Man and Cybernetics (SMC), 2023
Thomas Rudolf
Daniel Flögel
Tobias Schürmann
Simon Süß
S. Schwab
Sören Hohmann
AI4CE
189
1
0
10 Jan 2024
A unified uncertainty-aware exploration: Combining epistemic and aleatory uncertainty
Parvin Malekzadeh
Ming Hou
Konstantinos N. Plataniotis
UD
168
5
0
05 Jan 2024
Efficient Sparse-Reward Goal-Conditioned Reinforcement Learning with a High Replay Ratio and Regularization
Takuya Hiraoka
OffRL
237
1
0
10 Dec 2023
Handling Cost and Constraints with Off-Policy Deep Reinforcement Learning
Jared Markowitz
Jesse Silverberg
Gary Collins
OffRL
122
0
0
30 Nov 2023
Imitation Bootstrapped Reinforcement Learning
Hengyuan Hu
Suvir Mirchandani
Dorsa Sadigh
353
47
0
03 Nov 2023