Square$χ$PO: Differentially Private and Robust $χ^2$-Preference Optimization in Offline Direct Alignment

27 May 2025

Papers citing "Square$χ$PO: Differentially Private and Robust $χ^2$-Preference Optimization in Offline Direct Alignment"


A Unified Theoretical Analysis of Private and Robust Offline Alignment: from RLHF to DPO; Xingyu Zhou, Yulian Wu, Francesco Orabona; OffRL; 100, 1, 0; 21 May 2025
Foundations of Large Language Models; Tong Xiao, Jingbo Zhu; 3DGS, AILaw, VLM; 134, 4, 0; 16 Jan 2025
The Central Role of the Loss Function in Reinforcement Learning; Kaiwen Wang, Nathan Kallus, Wen Sun; OffRL; 292, 10, 0; 19 Sep 2024