How to Learn a Useful Critic? Model-based Action-Gradient-Estimator Policy Optimization

29 April 2020 · arXiv:2004.14309
P. D'Oro
Wojciech Jaśkowski
    OffRL

Papers citing "How to Learn a Useful Critic? Model-based Action-Gradient-Estimator Policy Optimization"

6 / 6 papers shown
Learning a Diffusion Model Policy from Rewards via Q-Score Matching
Michael Psenka, Alejandro Escontrela, Pieter Abbeel, Yi-An Ma
DiffM · 17 Feb 2025

Compatible Gradient Approximations for Actor-Critic Algorithms
Baturay Saglam, Dionysis Kalogerias
02 Sep 2024

Off-Policy RL Algorithms Can be Sample-Efficient for Continuous Control via Sample Multiple Reuse
Jiafei Lyu, Le Wan, Zongqing Lu, Xiu Li
OffRL · 29 May 2023

Is Model Ensemble Necessary? Model-based RL via a Single Model with Lipschitz Regularized Value Function
Ruijie Zheng, Xiyao Wang, Huazhe Xu, Furong Huang
02 Feb 2023

The Primacy Bias in Deep Reinforcement Learning
Evgenii Nikishin, Max Schwarzer, P. D'Oro, Pierre-Luc Bacon, Aaron C. Courville
OnRL · 16 May 2022

A case for new neural network smoothness constraints
Mihaela Rosca, T. Weber, A. Gretton, S. Mohamed
AAML · 14 Dec 2020