2406.16330
Pruning via Merging: Compressing LLMs via Manifold Alignment Based Layer Merging
24 June 2024
Deyuan Liu
Zhan Qin
Han Wang
Zhao Yang
Zecheng Wang
Fangying Rong
Qingbin Liu
Yanchao Hao
Xi Chen
Cunhang Fan
Zhao Lv
Zhiying Tu
Dianhui Chu
Bo Li
Dianbo Sui
Papers citing "Pruning via Merging: Compressing LLMs via Manifold Alignment Based Layer Merging"
When Fewer Layers Break More Chains: Layer Pruning Harms Test-Time Scaling in LLMs
Keyu Wang
Tian Lyu
Guinan Su
Jonas Geiping
L. Yin
Marco Canini
Shiwei Liu
LRM
89
0
0
25 Oct 2025
Layer as Puzzle Pieces: Compressing Large Language Models through Layer Concatenation
Fei Wang
Li Shen
Liang Ding
Chao Xue
Ye Liu
Changxing Ding
92
0
0
17 Oct 2025
BaldWhisper: Faster Whisper with Head Shearing and Layer Merging
Yaya Sy
Christophe Cerisara
Irina Illina
40
0
0
06 Oct 2025
PUMA: Layer-Pruned Language Model for Efficient Unified Multimodal Retrieval with Modality-Adaptive Learning
Yibo Lyu
Rui Shao
Gongwei Chen
Yijie Zhu
Weili Guan
Liqiang Nie
157
7
0
10 Jul 2025
Reassessing Layer Pruning in LLMs: New Insights and Methods
Yao Lu
Hao Cheng
Yujie Fang
Zeyu Wang
Jiaheng Wei
Dongwei Xu
Qi Xuan
Xiaoniu Yang
Zhaowei Zhu
303
14
0
23 Nov 2024
MPruner: Optimizing Neural Network Size with CKA-Based Mutual Information Pruning
Seungbeom Hu
ChanJun Park
Andrew Ferraiuolo
Sang-Ki Ko
Jinwoo Kim
Haein Song
Jieung Kim
262
2
0
24 Aug 2024
DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling
Pala Tej Deep
Rishabh Bhardwaj
Soujanya Poria
MoMe
272
45
0
17 Jun 2024
MetaGPT: Merging Large Language Models Using Model Exclusive Task Arithmetic
Yuyan Zhou
Liang Song
Bingning Wang
Weipeng Chen
MoMe
299
39
0
17 Jun 2024
MindMerger: Efficient Boosting LLM Reasoning in non-English Languages
Zixian Huang
Wenhao Zhu
Gong Cheng
Lei Li
Fei Yuan
LRM
192
24
0
27 May 2024
Jamba: A Hybrid Transformer-Mamba Language Model
Opher Lieber
Barak Lenz
Hofit Bata
Gal Cohen
Jhonathan Osin
...
Nir Ratner
N. Rozen
Erez Shwartz
Mor Zusman
Y. Shoham
345
315
0
28 Mar 2024
Checkpoint Merging via Bayesian Optimization in LLM Pretraining
Deyuan Liu
Zecheng Wang
Bingning Wang
Weipeng Chen
Chunshan Li
Zhiying Tu
Dianhui Chu
Bo Li
Dianbo Sui
MoMe
231
26
0
28 Mar 2024
The Unreasonable Ineffectiveness of the Deeper Layers
Andrey Gromov
Kushal Tirumala
Hassan Shapourian
Paolo Glorioso
Daniel A. Roberts
380
152
0
26 Mar 2024
What Makes Quantization for Large Language Models Hard? An Empirical Study from the Lens of Perturbation
AAAI Conference on Artificial Intelligence (AAAI), 2024
Zhuocheng Gong
Jiahao Liu
Jingang Wang
Xunliang Cai
Dongyan Zhao
Rui Yan
MQ
117
18
0
11 Mar 2024
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect
Xin Men
Mingyu Xu
Qingyu Zhang
Bingning Wang
Hongyu Lin
Yaojie Lu
Xianpei Han
Weipeng Chen
256
233
0
06 Mar 2024
Evaluating Quantized Large Language Models
Shiyao Li
Xuefei Ning
Luning Wang
Tengxuan Liu
Xiangsheng Shi
Shengen Yan
Guohao Dai
Huazhong Yang
Yu Wang
MQ
219
75
0
28 Feb 2024
SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks
Jiwon Song
Kyungseok Oh
Taesu Kim
Hyungjun Kim
Yulhwa Kim
Jae-Joon Kim
408
59
0
14 Feb 2024
Knowledge Fusion of Large Language Models
Fanqi Wan
Xinting Huang
Deng Cai
Xiaojun Quan
Wei Bi
Shuming Shi
MoMe
206
98
0
19 Jan 2024
Mixtral of Experts
Albert Q. Jiang
Alexandre Sablayrolles
Antoine Roux
A. Mensch
Blanche Savary
...
Théophile Gervet
Thibaut Lavril
Thomas Wang
Timothée Lacroix
William El Sayed
MoE
LLMAG
451
1,506
0
08 Jan 2024
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Albert Gu
Tri Dao
Mamba
482
4,843
0
01 Dec 2023
Language and Task Arithmetic with Parameter-Efficient Layers for Zero-Shot Summarization
Alexandra Chronopoulou
Jonas Pfeiffer
Joshua Maynez
Xinyi Wang
Sebastian Ruder
Priyanka Agrawal
MoMe
190
22
0
15 Nov 2023
Mistral 7B
Albert Q. Jiang
Alexandre Sablayrolles
A. Mensch
Chris Bamford
Devendra Singh Chaplot
...
Teven Le Scao
Thibaut Lavril
Thomas Wang
Timothée Lacroix
William El Sayed
MoE
LRM
334
2,867
0
10 Oct 2023
AdaMerging: Adaptive Model Merging for Multi-Task Learning
International Conference on Learning Representations (ICLR), 2023
Enneng Yang
Zhenyi Wang
Li Shen
Shiwei Liu
Guibing Guo
Xingwei Wang
Dacheng Tao
MoMe
248
175
0
04 Oct 2023
Deep Model Fusion: A Survey
Weishi Li
Yong Peng
Miao Zhang
Liang Ding
Han Hu
Li Shen
FedML
MoMe
257
85
0
27 Sep 2023
A Survey on Model Compression for Large Language Models
Transactions of the Association for Computational Linguistics (TACL), 2023
Xunyu Zhu
Jian Li
Yong Liu
Can Ma
Weiping Wang
294
335
0
15 Aug 2023
QLoRA: Efficient Finetuning of Quantized LLMs
Neural Information Processing Systems (NeurIPS), 2023
Tim Dettmers
Artidoro Pagnoni
Ari Holtzman
Luke Zettlemoyer
ALM
483
3,525
0
23 May 2023
Task Arithmetic in the Tangent Space: Improved Editing of Pre-Trained Models
Neural Information Processing Systems (NeurIPS), 2023
Guillermo Ortiz-Jiménez
Alessandro Favero
P. Frossard
MoMe
531
171
0
22 May 2023
LLM-Pruner: On the Structural Pruning of Large Language Models
Neural Information Processing Systems (NeurIPS), 2023
Xinyin Ma
Gongfan Fang
Xinchao Wang
569
635
0
19 May 2023
LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron
Thibaut Lavril
Gautier Izacard
Xavier Martinet
Marie-Anne Lachaux
...
Faisal Azhar
Aurelien Rodriguez
Armand Joulin
Edouard Grave
Guillaume Lample
ALM
PILM
2.6K
17,240
0
27 Feb 2023
SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot
International Conference on Machine Learning (ICML), 2023
Elias Frantar
Dan Alistarh
VLM
405
991
0
02 Jan 2023
Editing Models with Task Arithmetic
International Conference on Learning Representations (ICLR), 2022
Gabriel Ilharco
Marco Tulio Ribeiro
Mitchell Wortsman
Suchin Gururangan
Ludwig Schmidt
Hannaneh Hajishirzi
Ali Farhadi
KELM
MoMe
MU
989
711
0
08 Dec 2022
SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
International Conference on Machine Learning (ICML), 2022
Guangxuan Xiao
Ji Lin
Mickael Seznec
Hao Wu
Julien Demouth
Song Han
MQ
641
1,150
0
18 Nov 2022
GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
Elias Frantar
Saleh Ashkboos
Torsten Hoefler
Dan Alistarh
MQ
407
1,445
0
31 Oct 2022
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
Tim Dettmers
M. Lewis
Younes Belkada
Luke Zettlemoyer
MQ
366
797
0
15 Aug 2022
PaLM: Scaling Language Modeling with Pathways
Journal of Machine Learning Research (JMLR), 2022
Aakanksha Chowdhery
Sharan Narang
Jacob Devlin
Maarten Bosma
Gaurav Mishra
...
Kathy Meier-Hellstern
Douglas Eck
J. Dean
Slav Petrov
Noah Fiedel
PILM
LRM
1.1K
7,285
0
05 Apr 2022
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
International Conference on Machine Learning (ICML), 2022
Mitchell Wortsman
Gabriel Ilharco
S. Gadre
Rebecca Roelofs
Raphael Gontijo-Lopes
...
Hongseok Namkoong
Ali Farhadi
Y. Carmon
Simon Kornblith
Ludwig Schmidt
MoMe
590
1,251
1
10 Mar 2022
Measuring Massive Multitask Language Understanding
International Conference on Learning Representations (ICLR), 2020
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
Basel Alomair
Jacob Steinhardt
ELM
RALM
1.3K
6,285
0
07 Sep 2020
Compression of Deep Learning Models for Text: A Survey
ACM Transactions on Knowledge Discovery from Data (TKDD), 2020
Manish Gupta
Puneet Agrawal
VLM
MedIm
AI4CE
437
132
0
12 Aug 2020
Language Models are Few-Shot Learners
Neural Information Processing Systems (NeurIPS), 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
1.9K
51,131
0
28 May 2020
Compressing Large-Scale Transformer-Based Models: A Case Study on BERT
Transactions of the Association for Computational Linguistics (TACL), 2020
Prakhar Ganesh
Yao Chen
Xin Lou
Mohammad Ali Khan
Yifan Yang
Hassan Sajjad
Preslav Nakov
Deming Chen
Marianne Winslett
AI4CE
337
213
0
27 Feb 2020
PIQA: Reasoning about Physical Commonsense in Natural Language
AAAI Conference on Artificial Intelligence (AAAI), 2019
Yonatan Bisk
Rowan Zellers
Ronan Le Bras
Jianfeng Gao
Yejin Choi
OOD
LRM
1.1K
2,423
0
26 Nov 2019
BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions
North American Chapter of the Association for Computational Linguistics (NAACL), 2019
Christopher Clark
Kenton Lee
Ming-Wei Chang
Tom Kwiatkowski
Michael Collins
Kristina Toutanova
607
1,979
0
24 May 2019
HellaSwag: Can a Machine Really Finish Your Sentence?
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Rowan Zellers
Ari Holtzman
Yonatan Bisk
Ali Farhadi
Yejin Choi
521
3,346
0
19 May 2019
UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction
Leland McInnes
John Healy
James Melville
833
10,995
0
09 Feb 2018
A Survey of Model Compression and Acceleration for Deep Neural Networks
Yu Cheng
Duo Wang
Pan Zhou
Zhang Tao
665
1,176
0
23 Oct 2017
RACE: Large-scale ReAding Comprehension Dataset From Examinations
Guokun Lai
Qizhe Xie
Hanxiao Liu
Yiming Yang
Eduard H. Hovy
ELM
712
1,498
0
15 Apr 2017
Pruning Filters for Efficient ConvNets
Hao Li
Asim Kadav
Igor Durdanovic
H. Samet
H. Graf
3DPC
440
3,914
0
31 Aug 2016
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
Song Han
Huizi Mao
W. Dally
3DGS
890
9,454
0
01 Oct 2015
Distilling the Knowledge in a Neural Network
Geoffrey E. Hinton
Oriol Vinyals
J. Dean
FedML
753
22,105
0
09 Mar 2015