arXiv:2503.00539
Cited By
Distributionally Robust Reinforcement Learning with Human Feedback
Debmalya Mandal, Paulius Sasnauskas, Goran Radanović
1 March 2025
Papers citing "Distributionally Robust Reinforcement Learning with Human Feedback" (6 papers)
- General Intelligence-based Fragmentation (GIF): A framework for peak-labeled spectra simulation
  Margaret R. Martin, Soha Hassoun (11 Nov 2025)
- Lightweight Robust Direct Preference Optimization
  Cheol Woo Kim, Shresth Verma, Mauricio Tec, Milind Tambe (27 Oct 2025)
- Understanding the Performance Gap in Preference Learning: A Dichotomy of RLHF and DPO
  Ruizhe Shi, Minhak Song, Runlong Zhou, Zihan Zhang, Maryam Fazel, S. S. Du (26 May 2025)
- Robust Reinforcement Learning from Human Feedback for Large Language Models Fine-Tuning
  Kai Ye, Hongyi Zhou, Jin Zhu, Francesco Quinzan, C. Shi (3 Apr 2025)
- Uncertainty-aware Reward Model: Teaching Reward Models to Know What is Unknown
  Xingzhou Lou, Dong Yan, Wei Shen, Yuzi Yan, Jian Xie, Junge Zhang (1 Oct 2024)
- Right Now, Wrong Then: Non-Stationary Direct Preference Optimization under Preference Drift
  Seongho Son, William Bankes, Sayak Ray Chowdhury, Brooks Paige, Ilija Bogunovic (26 Jul 2024)