ResearchTrend.AI
Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorization
arXiv:1902.01917 · 5 February 2019
Eldad Meller, Alexander Finkelstein, Uri Almog, Mark Grobman
Community: MQ

Papers citing "Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorization" (17 papers)
  • "On the Impact of White-box Deployment Strategies for Edge AI on Latency and Model Performance" by Jaskirat Singh, Bram Adams, Ahmed E. Hassan (VLM, 01 Nov 2024)
  • "On the Impact of Black-box Deployment Strategies for Edge AI on Latency and Model Performance" by Jaskirat Singh, Emad Fallahzadeh, Bram Adams, Ahmed E. Hassan (MQ, 25 Mar 2024)
  • "Shedding the Bits: Pushing the Boundaries of Quantization with Minifloats on FPGAs" by Shivam Aggarwal, Hans Jakob Damsgaard, Alessandro Pappalardo, Giuseppe Franco, Thomas B. Preußer, Michaela Blott, Tulika Mitra (MQ, 21 Nov 2023)
  • "Quantization Aware Factorization for Deep Neural Network Compression" by Daria Cherniuk, Stanislav Abukhovich, Anh-Huy Phan, Ivan V. Oseledets, A. Cichocki, Julia Gusak (MQ, 08 Aug 2023)
  • "Quantizable Transformers: Removing Outliers by Helping Attention Heads Do Nothing" by Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort (MQ, 22 Jun 2023)
  • "QFT: Post-training quantization via fast joint finetuning of all degrees of freedom" by Alexander Finkelstein, Ella Fuchs, Idan Tal, Mark Grobman, Niv Vosco, Eldad Meller (MQ, 05 Dec 2022)
  • "MinUn: Accurate ML Inference on Microcontrollers" by Shikhar Jaiswal, R. Goli, Aayan Kumar, Vivek Seshadri, Rahul Sharma (29 Oct 2022)
  • "Quantune: Post-training Quantization of Convolutional Neural Networks using Extreme Gradient Boosting for Fast Deployment" by Jemin Lee, Misun Yu, Yongin Kwon, Teaho Kim (MQ, 10 Feb 2022)
  • "An Embedding of ReLU Networks and an Analysis of their Identifiability" by Pierre Stock, Rémi Gribonval (20 Jul 2021)
  • "A White Paper on Neural Network Quantization" by Markus Nagel, Marios Fournarakis, Rana Ali Amjad, Yelysei Bondarenko, M. V. Baalen, Tijmen Blankevoort (MQ, 15 Jun 2021)
  • "Do All MobileNets Quantize Poorly? Gaining Insights into the Effect of Quantization on Depthwise Separable Convolutional Networks Through the Eyes of Multi-scale Distributional Dynamics" by S. Yun, Alexander Wong (MQ, 24 Apr 2021)
  • "Automated Backend-Aware Post-Training Quantization" by Ziheng Jiang, Animesh Jain, An Liu, Josh Fromm, Chengqian Ma, Tianqi Chen, Luis Ceze (MQ, 27 Mar 2021)
  • "Automatic heterogeneous quantization of deep neural networks for low-latency inference on the edge for particle detectors" by C. Coelho, Aki Kuusela, Shane Li, Zhuang Hao, T. Aarrestad, Vladimir Loncar, J. Ngadiuba, M. Pierini, Adrian Alan Pol, S. Summers (MQ, 15 Jun 2020)
  • "Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming" by Itay Hubara, Yury Nahshan, Y. Hanani, Ron Banner, Daniel Soudry (MQ, 14 Jun 2020)
  • "ZeroQ: A Novel Zero Shot Quantization Framework" by Yaohui Cai, Z. Yao, Zhen Dong, A. Gholami, Michael W. Mahoney, Kurt Keutzer (MQ, 01 Jan 2020)
  • "Data-Free Quantization Through Weight Equalization and Bias Correction" by Markus Nagel, M. V. Baalen, Tijmen Blankevoort, Max Welling (MQ, 11 Jun 2019)
  • "Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights" by Aojun Zhou, Anbang Yao, Yiwen Guo, Lin Xu, Yurong Chen (MQ, 10 Feb 2017)