Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2310.01382
Cited By
Compressing LLMs: The Truth is Rarely Pure and Never Simple
2 October 2023
Ajay Jaiswal
Zhe Gan
Xianzhi Du
Bowen Zhang
Zhangyang Wang
Yinfei Yang
MQ
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Compressing LLMs: The Truth is Rarely Pure and Never Simple"
10 / 10 papers shown
Title
Stability in Single-Peaked Strategic Resource Selection Games
Henri Zeiler
9
0
0
09 May 2025
Radio: Rate-Distortion Optimization for Large Language Model Compression
Sean I. Young
MQ
12
0
0
05 May 2025
Efficient Shapley Value-based Non-Uniform Pruning of Large Language Models
Chuan Sun
Han Yu
Lizhen Cui
18
0
0
03 May 2025
iServe: An Intent-based Serving System for LLMs
Dimitrios Liakopoulos
Tianrui Hu
Prasoon Sinha
N. Yadwadkar
VLM
44
0
0
08 Jan 2025
Navigating Extremes: Dynamic Sparsity in Large Output Spaces
Nasib Ullah
Erik Schultheis
Mike Lasby
Yani Andrew Ioannou
Rohit Babbar
21
0
0
05 Nov 2024
WAGLE: Strategic Weight Attribution for Effective and Modular Unlearning in Large Language Models
Jinghan Jia
Jiancheng Liu
Yihua Zhang
Parikshit Ram
Nathalie Baracaldo
Sijia Liu
MU
28
2
0
23 Oct 2024
Instant Soup: Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models
A. Jaiswal
Shiwei Liu
Tianlong Chen
Ying Ding
Zhangyang Wang
VLM
29
22
0
18 Jun 2023
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
Ying Sheng
Lianmin Zheng
Binhang Yuan
Zhuohan Li
Max Ryabinin
...
Joseph E. Gonzalez
Percy Liang
Christopher Ré
Ion Stoica
Ce Zhang
135
208
0
13 Mar 2023
I-BERT: Integer-only BERT Quantization
Sehoon Kim
A. Gholami
Z. Yao
Michael W. Mahoney
Kurt Keutzer
MQ
75
332
0
05 Jan 2021
The Lottery Ticket Hypothesis for Pre-trained BERT Networks
Tianlong Chen
Jonathan Frankle
Shiyu Chang
Sijia Liu
Yang Zhang
Zhangyang Wang
Michael Carbin
139
375
0
23 Jul 2020
1