Offline Safe Policy Optimization From Heterogeneous Feedback

24 December 2025
Ze Gong
Pradeep Varakantham
Akshat Kumar

Papers citing "Offline Safe Policy Optimization From Heterogeneous Feedback"

No papers found