Learning Pessimism for Robust and Efficient Off-Policy Reinforcement Learning

7 October 2021
Edoardo Cetin
Oya Celiktutan
OffRL
arXiv: 2110.03375

Papers citing "Learning Pessimism for Robust and Efficient Off-Policy Reinforcement Learning"

12 / 12 papers shown
Double Actor-Critic with TD Error-Driven Regularization in Reinforcement Learning
Haohui Chen
Zhiyong Chen
Aoxiang Liu
Wentuo Fang
OffRL
20
0
0
28 Sep 2024
Bigger, Regularized, Optimistic: scaling for compute and sample-efficient continuous control
Michal Nauman
M. Ostaszewski
Krzysztof Jankowski
Piotr Miłoś
Marek Cygan
OffRL
29
16
0
25 May 2024
Simple Ingredients for Offline Reinforcement Learning
Edoardo Cetin
Andrea Tirinzoni
Matteo Pirotta
A. Lazaric
Yann Ollivier
Ahmed Touati
OffRL
24
2
0
19 Mar 2024
A Case for Validation Buffer in Pessimistic Actor-Critic
Michal Nauman
M. Ostaszewski
Marek Cygan
29
0
0
01 Mar 2024
Overestimation, Overfitting, and Plasticity in Actor-Critic: the Bitter Lesson of Reinforcement Learning
Michal Nauman
Michal Bortkiewicz
Piotr Miłoś
Tomasz Trzciński
M. Ostaszewski
Marek Cygan
OffRL
22
16
0
01 Mar 2024
On the Theory of Risk-Aware Agents: Bridging Actor-Critic and Economics
Michal Nauman
Marek Cygan
19
1
0
30 Oct 2023
Seizing Serendipity: Exploiting the Value of Past Success in Off-Policy Actor-Critic
Tianying Ji
Yuping Luo
Fuchun Sun
Xianyuan Zhan
Jianwei Zhang
Huazhe Xu
OffRL
OnRL
31
14
0
05 Jun 2023
Policy Gradient With Serial Markov Chain Reasoning
Edoardo Cetin
Oya Celiktutan
BDL
LRM
11
2
0
13 Oct 2022
Stabilizing Off-Policy Deep Reinforcement Learning from Pixels
Edoardo Cetin
Philip J. Ball
Steve Roberts
Oya Celiktutan
25
35
0
03 Jul 2022
Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning
Nicolai Dorka
Tim Welschehold
Joschka Boedecker
Wolfram Burgard
OffRL
12
9
0
24 Nov 2021
Softmax Deep Double Deterministic Policy Gradients
Ling Pan
Qingpeng Cai
Longbo Huang
72
86
0
19 Oct 2020
Controlling Overestimation Bias with Truncated Mixture of Continuous Distributional Quantile Critics
Arsenii Kuznetsov
Pavel Shvechikov
Alexander Grishin
Dmitry Vetrov
131
184
0
08 May 2020