ResearchTrend.AI


arXiv: 2503.06810

Mitigating Preference Hacking in Policy Optimization with Pessimism

10 March 2025
Dhawal Gupta
Adam Fisch
Christoph Dann
Alekh Agarwal

Papers citing "Mitigating Preference Hacking in Policy Optimization with Pessimism"

No papers.