2308.12772
Cited By
Intentionally-underestimated Value Function at Terminal State for Temporal-difference Learning with Mis-designed Reward
Results in Control and Optimization (RCO), 2023
24 August 2023
Taisuke Kobayashi
Papers citing "Intentionally-underestimated Value Function at Terminal State for Temporal-difference Learning with Mis-designed Reward"
4 / 4 papers shown
Grower-in-the-Loop Interactive Reinforcement Learning for Greenhouse Climate Control
Maxiu Xiao
Jianglin Lan
Jingxing Yu
Eldert van Henten
Qiuju Xie
Congcong Sun
OffRL
AI4CE
302
0
0
29 May 2025
Guided by Guardrails: Control Barrier Functions as Safety Instructors for Robotic Learning
Maeva Guerrier
Karthik Soma
Hassan Fouad
Giovanni Beltrame
229
1
0
24 May 2025
Improvements of Dark Experience Replay and Reservoir Sampling towards Better Balance between Consolidation and Plasticity
Taisuke Kobayashi
CLL
185
0
0
29 Apr 2025
Revisiting Experience Replayable Conditions
Taisuke Kobayashi
340
4
0
15 Feb 2024