Intentionally-underestimated Value Function at Terminal State for Temporal-difference Learning with Mis-designed Reward

Results in Control and Optimization (RCO), 2023

24 August 2023

Taisuke Kobayashi

arXiv:2308.12772

Papers citing "Intentionally-underestimated Value Function at Terminal State for Temporal-difference Learning with Mis-designed Reward"

4 / 4 papers shown

Grower-in-the-Loop Interactive Reinforcement Learning for Greenhouse Climate Control

Eldert van Henten

29 May 2025

Guided by Guardrails: Control Barrier Functions as Safety Instructors for Robotic Learning

Giovanni Beltrame

24 May 2025

Improvements of Dark Experience Replay and Reservoir Sampling towards Better Balance between Consolidation and Plasticity

Taisuke Kobayashi

29 Apr 2025

Revisiting Experience Replayable Conditions

Taisuke Kobayashi

15 Feb 2024