Robust Preference Optimization via Dynamic Target Margins. Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Aligning Large Language Models with Implicit Preferences from User-Generated Content. Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Multi-objective Aligned Bidword Generation Model for E-commerce Search Advertising. Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2025
daDPO: Distribution-Aware DPO for Distilling Conversational Abilities. Annual Meeting of the Association for Computational Linguistics (ACL), 2025
HSCR: Hierarchical Self-Contrastive Rewarding for Aligning Medical Vision Language Models. Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Probability-Consistent Preference Optimization for Enhanced LLM Reasoning. Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Frictional Agent Alignment Framework: Slow Down and Don't Break Things. Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Token-level Accept or Reject: A Micro Alignment Approach for Large Language Models. International Joint Conference on Artificial Intelligence (IJCAI), 2025
Optimal Transport-Based Token Weighting scheme for Enhanced Preference Optimization. Annual Meeting of the Association for Computational Linguistics (ACL), 2025
MPO: Multilingual Safety Alignment via Reward Gap Optimization. Annual Meeting of the Association for Computational Linguistics (ACL), 2025
SGDPO: Self-Guided Direct Preference Optimization for Language Model Alignment. Annual Meeting of the Association for Computational Linguistics (ACL), 2025