2501.09080
Average-Reward Soft Actor-Critic
15 January 2025
Jacob Adamczyk
Volodymyr Makarenko
Stas Tiomkin
R. Kulkarni
OOD
Papers citing "Average-Reward Soft Actor-Critic"
Entropy-guided sequence weighting for efficient exploration in RL-based LLM fine-tuning
Abdullah Vanlioglu
28 Mar 2025