arXiv:2508.17000

KL-Regularised Q-Learning: A Token-level Action-Value perspective on Online RLHF

23 August 2025

Jason Ross Brown

Edward James Young

Sergio Bacallado

Papers citing "KL-Regularised Q-Learning: A Token-level Action-Value perspective on Online RLHF"

No papers found