SEE-DPO: Self Entropy Enhanced Direct Preference Optimization

6 November 2024
Shivanshu Shekhar, Shreyas Singh, Tong Zhang
arXiv:2411.04712

Papers citing "SEE-DPO: Self Entropy Enhanced Direct Preference Optimization"

5 / 5 papers shown
Diffusion-SDPO: Safeguarded Direct Preference Optimization for Diffusion Models
Minghao Fu, Guo-Hua Wang, Tianyu Cui, Qing-Guo Chen, Zhao Xu, Weihua Luo, Kaifu Zhang
05 Nov 2025
Sem-DPO: Mitigating Semantic Inconsistency in Preference Optimization for Prompt Engineering
Anas Mohamed, A. Khan, Xinran Wang, Ahmad Faraz Khan, Shuwen Ge, Saman Bahzad Khan, Ayaan Ahmad, Ali Anwar
27 Jul 2025
LookAlike: Consistent Distractor Generation in Math MCQs
Workshop on Innovative Use of NLP for Building Educational Applications (UNBEA), 2025
Nisarg Parikh, Nigel Fernandez, Alexander Scarlatos, Simon Woodhead, Andrew Lan
03 May 2025
ROCM: RLHF on consistency models
Shivanshu Shekhar, Tong Zhang
08 Mar 2025
Self-Rewarding Language Models
Weizhe Yuan, Richard Yuanzhe Pang, Kyunghyun Cho, Xian Li, Sainbayar Sukhbaatar, Jing Xu, Jason Weston
18 Jan 2024