ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2410.12138
  4. Cited By
Preference Optimization with Multi-Sample Comparisons

Preference Optimization with Multi-Sample Comparisons

16 October 2024
Chaoqi Wang
Zhuokai Zhao
Chen Zhu
Karthik Abinav Sankararaman
Michal Valko
Xuefei Cao
Zhaorun Chen
Madian Khabsa
Yuxin Chen
Hao Ma
Sinong Wang
ArXivPDFHTML

Papers citing "Preference Optimization with Multi-Sample Comparisons"

6 / 6 papers shown
Title
Anyprefer: An Agentic Framework for Preference Data Synthesis
Anyprefer: An Agentic Framework for Preference Data Synthesis
Yiyang Zhou
Z. Wang
Tianle Wang
Shangyu Xing
Peng Xia
...
Chetan Bansal
Weitong Zhang
Ying Wei
Mohit Bansal
Huaxiu Yao
52
0
0
27 Apr 2025
MJ-VIDEO: Fine-Grained Benchmarking and Rewarding Video Preferences in Video Generation
MJ-VIDEO: Fine-Grained Benchmarking and Rewarding Video Preferences in Video Generation
Haibo Tong
Zhaoyang Wang
Z. Chen
Haonian Ji
Shi Qiu
...
Peng Xia
Mingyu Ding
Rafael Rafailov
Chelsea Finn
Huaxiu Yao
EGVM
VGen
67
2
0
03 Feb 2025
Diverse Preference Optimization
Diverse Preference Optimization
Jack Lanchantin
Angelica Chen
S. Dhuliawala
Ping Yu
Jason Weston
Sainbayar Sukhbaatar
Ilia Kulikov
80
3
0
30 Jan 2025
Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment
Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment
Chaoqi Wang
Zhuokai Zhao
Yibo Jiang
Zhaorun Chen
Chen Zhu
...
Jiayi Liu
Lizhu Zhang
Xiangjun Fan
Hao Ma
Sinong Wang
62
3
0
17 Jan 2025
An Overview and Discussion on Using Large Language Models for Implementation Generation of Solutions to Open-Ended Problems
An Overview and Discussion on Using Large Language Models for Implementation Generation of Solutions to Open-Ended Problems
Hashmath Shaik
Alex Doboli
OffRL
ELM
50
0
0
31 Dec 2024
REFA: Reference Free Alignment for multi-preference optimization
REFA: Reference Free Alignment for multi-preference optimization
Taneesh Gupta
Rahul Madhavan
Xuchao Zhang
Chetan Bansal
Saravan Rajmohan
74
1
0
20 Dec 2024
1