Pruning vs Quantization: Which is Better?

Pruning vs Quantization: Which is Better?

6 July 2023

Tijmen Blankevoort

Papers citing "Pruning vs Quantization: Which is Better?"

10 / 10 papers shown

Title
Efficient LLM Inference using Dynamic Input Pruning and Cache-Aware Masking Marco Federici Davide Belli M. V. Baalen Amir Jalalirad Andrii Skliar Bence Major Markus Nagel Paul N. Whatmough 76 0 0 02 Dec 2024
On the Impact of White-box Deployment Strategies for Edge AI on Latency and Model Performance Jaskirat Singh Bram Adams Ahmed E. Hassan VLM 43 0 0 01 Nov 2024
Predicting Probabilities of Error to Combine Quantization and Early Exiting: QuEE Florence Regol Joud Chataoui Bertrand Charpentier Mark J. Coates Pablo Piantanida Stephan Gunnemann 45 0 0 20 Jun 2024
Effective Interplay between Sparsity and Quantization: From Theory to Practice Simla Burcu Harma Ayan Chakraborty Elizaveta Kostenok Danila Mishin Dongho Ha ... Martin Jaggi Ming Liu Yunho Oh Suvinay Subramanian Amir Yazdanbakhsh MQ 44 5 0 31 May 2024
Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs Jordan Dotzel Yuzong Chen Bahaa Kotb Sushma Prasad Gang Wu Sheng Li Mohamed S. Abdelfattah Zhiru Zhang 31 8 0 06 May 2024
OPQ: Compressing Deep Neural Networks with One-shot Pruning-Quantization Peng Hu Xi Peng Hongyuan Zhu M. Aly Jie Lin MQ 44 59 0 23 May 2022
Overcoming Oscillations in Quantization-Aware Training Markus Nagel Marios Fournarakis Yelysei Bondarenko Tijmen Blankevoort MQ 111 101 0 21 Mar 2022
Cyclical Pruning for Sparse Neural Networks Suraj Srinivas Andrey Kuzmin Markus Nagel M. V. Baalen Andrii Skliar Tijmen Blankevoort 25 13 0 02 Feb 2022
What is the State of Neural Network Pruning? Davis W. Blalock Jose Javier Gonzalez Ortiz Jonathan Frankle John Guttag 191 1,027 0 06 Mar 2020
ImageNet Large Scale Visual Recognition Challenge Olga Russakovsky Jia Deng Hao Su J. Krause S. Satheesh ... A. Karpathy A. Khosla Michael S. Bernstein Alexander C. Berg Li Fei-Fei VLM ObjD 296 39,198 0 01 Sep 2014