Disperse-Then-Merge: Pushing the Limits of Instruction Tuning via
Alignment Tax Reduction

Disperse-Then-Merge: Pushing the Limits of Instruction Tuning via Alignment Tax Reduction

22 May 2024

Rui Yan

Papers citing "Disperse-Then-Merge: Pushing the Limits of Instruction Tuning via Alignment Tax Reduction"

6 / 6 papers shown

Title
On the Loss of Context-awareness in General Instruction Fine-tuning Yihan Wang Andrew Bai Nanyun Peng Cho-Jui Hsieh 63 1 0 05 Nov 2024
Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in the Era of Large Language Models Jinliang Lu Ziliang Pang Min Xiao Yaochen Zhu Rui Xia Jiajun Zhang MoMe 27 17 0 08 Jul 2024
WARP: On the Benefits of Weight Averaged Rewarded Policies Alexandre Ramé Johan Ferret Nino Vieillard Robert Dadashi Léonard Hussenot Pierre-Louis Cedoz Pier Giuseppe Sessa Sertan Girgin Arthur Douillard Olivier Bachem 47 13 0 24 Jun 2024
Cascade Reward Sampling for Efficient Decoding-Time Alignment Bolian Li Yifan Wang A. Grama Ruqi Zhang Ruqi Zhang AI4TS 47 8 0 24 Jun 2024
Training language models to follow instructions with human feedback Long Ouyang Jeff Wu Xu Jiang Diogo Almeida Carroll L. Wainwright ... Amanda Askell Peter Welinder Paul Christiano Jan Leike Ryan J. Lowe OSLM ALM 301 11,730 0 04 Mar 2022
Identifying and Mitigating Spurious Correlations for Improving Robustness in NLP Models Tianlu Wang Rohit Sridhar Diyi Yang Xuezhi Wang AAML 108 52 0 14 Oct 2021