ResearchTrend.AI


2D-DPO: Scaling Direct Preference Optimization with 2-Dimensional Supervision

25 October 2024
Shilong Li, Yancheng He, Hui Huang, Xingyuan Bu, J. Liu, Hangyu Guo, Weixun Wang, Jihao Gu, Wenbo Su, Bo Zheng
arXiv:2410.19720

Papers citing "2D-DPO: Scaling Direct Preference Optimization with 2-Dimensional Supervision"

Showing 2 of 2 papers.
DREAM: Disentangling Risks to Enhance Safety Alignment in Multimodal Large Language Models
J. Liu, Hangyu Guo, Ranjie Duan, Xingyuan Bu, Yancheng He, ..., Yingshui Tan, Yanan Wu, Jihao Gu, Y. Li, J. Zhu
MLLM · 58 / 0 / 0 · 25 Apr 2025
Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?
Yancheng He, Shilong Li, J. Liu, Weixun Wang, Xingyuan Bu, ..., Zhongyuan Peng, Z. Zhang, Zhicheng Zheng, Wenbo Su, Bo Zheng
ELM, LRM · 60 / 6 / 0 · 26 Feb 2025