2403.19390
Cited By
Checkpoint Merging via Bayesian Optimization in LLM Pretraining
28 March 2024
Deyuan Liu
Zecheng Wang
Bingning Wang
Weipeng Chen
Chunshan Li
Zhiying Tu
Dianhui Chu
Bo Li
Dianbo Sui
MoMe
Papers citing "Checkpoint Merging via Bayesian Optimization in LLM Pretraining"
15 / 15 papers shown
Title
Dynamic Fisher-weighted Model Merging via Bayesian Optimization
Sanwoo Lee
Jiahao Liu
Qifan Wang
J. Wang
Xunliang Cai
Yunfang Wu
MoMe
101
0
0
26 Apr 2025
Parameter-Efficient Checkpoint Merging via Metrics-Weighted Averaging
Shi Jie Yu
Sehyun Choi
MoMe
45
0
0
23 Apr 2025
Learning Explainable Dense Reward Shapes via Bayesian Optimization
Ryan Koo
Ian Yang
Vipul Raheja
Mingyi Hong
Kwang-Sung Jun
Dongyeop Kang
26
0
0
22 Apr 2025
Never Start from Scratch: Expediting On-Device LLM Personalization via Explainable Model Selection
Haoming Wang
Boyuan Yang
Xiangyu Yin
Wei Gao
28
0
0
15 Apr 2025
Model Assembly Learning with Heterogeneous Layer Weight Merging
Yi-Kai Zhang
Jin Wang
Xu-Xiang Zhong
De-Chuan Zhan
Han-Jia Ye
MoMe
47
0
0
27 Mar 2025
Ensemble Learning for Large Language Models in Text and Code Generation: A Survey
Mari Ashiga
Wei Jie
Fan Wu
Vardan K. Voskanyan
Fateme Dinmohammadi
P. Brookes
Jingzhi Gong
Zheng Wang
38
0
0
13 Mar 2025
Extrapolation Merging: Keep Improving With Extrapolation and Merging
Yiguan Lin
Bin Xu
Yinghao Li
Yang Gao
MoMe
57
1
0
05 Mar 2025
Scalable Model Merging with Progressive Layer-wise Distillation
Jing Xu
Jiazheng Li
J. Zhang
MoMe
FedML
85
0
0
18 Feb 2025
Optimal Brain Iterative Merging: Mitigating Interference in LLM Merging
Zhixiang Wang
Zhenyu Mao
Yixuan Qiao
Yunfang Wu
Biye Li
MoMe
73
0
0
17 Feb 2025
Bayesian Concept Bottleneck Models with LLM Priors
Jean Feng
Avni Kothari
Luke Zier
Chandan Singh
Yan Shuo Tan
18
2
0
21 Oct 2024
Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in the Era of Large Language Models
Jinliang Lu
Ziliang Pang
Min Xiao
Yaochen Zhu
Rui Xia
Jiajun Zhang
MoMe
34
18
0
08 Jul 2024
Pruning via Merging: Compressing LLMs via Manifold Alignment Based Layer Merging
Deyuan Liu
Zhanyue Qin
Hairu Wang
Zhao Yang
Zecheng Wang
...
Zhao Lv
Zhiying Tu
Dianhui Chu
Bo Li
Dianbo Sui
17
2
0
24 Jun 2024
OLMo: Accelerating the Science of Language Models
Dirk Groeneveld
Iz Beltagy
Pete Walsh
Akshita Bhagia
Rodney Michael Kinney
...
Jesse Dodge
Kyle Lo
Luca Soldaini
Noah A. Smith
Hanna Hajishirzi
OSLM
130
355
0
01 Feb 2024
WARM: On the Benefits of Weight Averaged Reward Models
Alexandre Ramé
Nino Vieillard
Léonard Hussenot
Robert Dadashi
Geoffrey Cideron
Olivier Bachem
Johan Ferret
102
93
0
22 Jan 2024
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
243
1,815
0
17 Sep 2019