SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks

14 February 2024

Papers citing "SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks"


Title
A Sliding Layer Merging Method for Efficient Depth-Wise Pruning in LLMs
Xuan Ding, Rui Sun, Yunjian Zhang, Xiu Yan, Yueqi Zhou, Kaihao Huang, Suzhong Fu, Chuanlong Xie, Yao Zhu
41 0 0
26 Feb 2025

MoDeGPT: Modular Decomposition for Large Language Model Compression
Chi-Heng Lin, Shangqian Gao, James Seale Smith, Abhishek Patel, Shikhar Tuli, Yilin Shen, Hongxia Jin, Yen-Chang Hsu
59 6 0
19 Aug 2024

SliceGPT: Compress Large Language Models by Deleting Rows and Columns
Saleh Ashkboos, Maximilian L. Croci, Marcelo Gennari do Nascimento, Torsten Hoefler, James Hensman
VLM 114 143 0
26 Jan 2024