Quantizing deep convolutional networks for efficient inference: A whitepaper

21 June 2018

Papers citing "Quantizing deep convolutional networks for efficient inference: A whitepaper"

50 / 513 papers shown

AI on the Edge: Rethinking AI-based IoT Applications Using Specialized Edge Architectures Qianlin Liang Prashant J. Shenoy David Irwin 126 21 0 27 Mar 2020
Compiling Neural Networks for a Computational Memory Accelerator K. Kourtis M. Dazzi Nikolas Ioannou Tobias Grosser Abu Sebastian E. Eleftheriou 103 5 0 05 Mar 2020
Searching for Winograd-aware Quantized Networks Conference on Machine Learning and Systems (MLSys), 2020 Javier Fernandez-Marques P. Whatmough Andrew Mundy Matthew Mattina MQ 116 40 0 25 Feb 2020
Post-training Quantization with Multiple Points: Mixed Precision without Mixed Precision AAAI Conference on Artificial Intelligence (AAAI), 2020 Xingchao Liu Mao Ye Dengyong Zhou Qiang Liu MQ 239 51 0 20 Feb 2020
Neural Network Compression Framework for fast model inference Alexander Kozlov Ivan Lazarevich Vasily Shamporov N. Lyalyushkin Yury Gorbachev 253 38 0 20 Feb 2020
SYMOG: learning symmetric mixture of Gaussian modes for improved fixed-point quantization Neurocomputing (Neurocomputing), 2020 Lukas Enderich Fabian Timm Wolfram Burgard MQ 88 6 0 19 Feb 2020
Robust Quantization: One Model to Rule Them All Neural Information Processing Systems (NeurIPS), 2020 Moran Shkolnik Brian Chmiel Ron Banner Gil Shomron Yury Nahshan A. Bronstein U. Weiser OOD MQ 187 88 0 18 Feb 2020
Gradient $\ell_1$ Regularization for Quantization Robustness International Conference on Learning Representations (ICLR), 2020 Milad Alizadeh Arash Behboodi M. V. Baalen Christos Louizos Tijmen Blankevoort Max Welling MQ 148 8 0 18 Feb 2020
Post-Training Piecewise Linear Quantization for Deep Neural Networks European Conference on Computer Vision (ECCV), 2020 Jun Fang Ali Shafiee Hamzah Abdel-Aziz D. Thorsley Georgios Georgiadis Joseph Hassoun MQ 341 170 0 31 Jan 2020
Quantisation and Pruning for Neural Network Compression and Regularisation Kimessha Paupamah Steven D. James Richard Klein 75 25 0 14 Jan 2020
ZeroQ: A Novel Zero Shot Quantization Framework Computer Vision and Pattern Recognition (CVPR), 2020 Yaohui Cai Z. Yao Zhen Dong A. Gholami Michael W. Mahoney Kurt Keutzer MQ 221 452 0 01 Jan 2020
Towards Unified INT8 Training for Convolutional Neural Network Computer Vision and Pattern Recognition (CVPR), 2019 Feng Zhu Yazhe Niu F. Yu Xianglong Liu Yanfei Wang Zhelong Li Xiuqi Yang Junjie Yan MQ 203 172 0 29 Dec 2019
Towards Efficient Training for Neural Network Quantization Qing Jin Linjie Yang Zhenyu A. Liao MQ 223 42 0 21 Dec 2019
Taxonomy and Evaluation of Structured Compression of Convolutional Neural Networks Andrey Kuzmin Markus Nagel Saurabh Pitre Sandeep Pendyam Tijmen Blankevoort Max Welling 113 27 0 20 Dec 2019
Learned Variable-Rate Image Compression with Residual Divisive Normalization IEEE International Conference on Multimedia and Expo (ICME), 2019 Mohammad Akbari Jie Liang Jingning Han Chengjie Tu 106 26 0 11 Dec 2019
The Knowledge Within: Methods for Data-Free Model Compression Computer Vision and Pattern Recognition (CVPR), 2019 Matan Haroush Itay Hubara Elad Hoffer Daniel Soudry 198 113 0 03 Dec 2019
QKD: Quantization-aware Knowledge Distillation Jangho Kim Brandon Smart Jinwon Lee Chirag I. Patel Nojun Kwak MQ 193 72 0 28 Nov 2019
Loss Aware Post-training Quantization Machine-mediated learning (ML), 2019 Yury Nahshan Brian Chmiel Chaim Baskin Evgenii Zheltonozhskii Ron Banner A. Bronstein A. Mendelson MQ 297 185 0 17 Nov 2019
Scientific Image Restoration Anywhere Annual Workshop on Large-scale Experiment-in-the-Loop Computing (ALEC), 2019 V. Abeykoon Zhengchun Liu R. Kettimuthu Geoffrey C. Fox Ian Foster 159 19 0 12 Nov 2019
HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks Neural Information Processing Systems (NeurIPS), 2019 Zhen Dong Z. Yao Yaohui Cai Daiyaan Arfeen A. Gholami Michael W. Mahoney Kurt Keutzer MQ 177 324 0 10 Nov 2019
Post-Training 4-bit Quantization on Embedding Tables Hui Guan Andrey Malevich Jiyan Yang Jongsoo Park Hector Yuen MQ 143 43 0 05 Nov 2019
Training DNN IoT Applications for Deployment On Analog NVM Crossbars IEEE International Joint Conference on Neural Network (IJCNN), 2019 F. García-Redondo Shidhartha Das G. Rosendale 213 5 0 30 Oct 2019
Secure Evaluation of Quantized Neural Networks IACR Cryptology ePrint Archive (IACR ePrint), 2019 Anders Dalskov Daniel E. Escudero Marcel Keller 288 148 0 28 Oct 2019
Neural Network Distiller: A Python Package For DNN Compression Research Neta Zmora Guy Jacob Lev Zlotnik Bar Elharar Gal Novik 111 75 0 27 Oct 2019
Deep Learning at the Edge Sahar Voghoei N. Tonekaboni Jason G. Wallace H. Arabnia 320 47 0 22 Oct 2019
Automatic Generation of Multi-precision Multi-arithmetic CNN Accelerators for FPGAs International Conference on Field-Programmable Technology (ICFPT), 2019 Yiren Zhao Xitong Gao Xuan Guo Junyi Liu Erwei Wang Robert D. Mullins P. Cheung George A. Constantinides Chengzhong Xu MQ 89 31 0 21 Oct 2019
AI Benchmark: All About Deep Learning on Smartphones in 2019 Andrey D. Ignatov Radu Timofte Andrei Kulik Seungsoo Yang Ke Wang Felix Baum Max Wu Lirong Xu Luc Van Gool ELM 143 236 0 15 Oct 2019
Bit Efficient Quantization for Deep Neural Networks Prateeth Nayak David C. Zhang S. Chai MQ 137 44 0 07 Oct 2019
QuaRL: Quantization for Fast and Environmentally Sustainable Reinforcement Learning Srivatsan Krishnan Maximilian Lam Sharad Chitlangia Zishen Wan Gabriel Barth-Maron Aleksandra Faust Vijay Janapa Reddi MQ 203 33 0 02 Oct 2019
NGEMM: Optimizing GEMM for Deep Learning via Compiler-based Techniques Yiyuan Ma Li-Wen Chang Yang Chen Kefeng Deng Amit Agarwal Emad Barsoum Abe Taha MQ 98 9 0 01 Oct 2019
Learning to Seek: Autonomous Source Seeking with Deep Reinforcement Learning Onboard a Nano Drone Microcontroller Bardienus P. Duisterhof Srivatsan Krishnan Jonathan J. Cruz Colby R. Banbury William Fu Aleksandra Faust Guido de Croon Vijay Janapa Reddi 295 29 0 25 Sep 2019
Training Deep Neural Networks Using Posit Number System ACM Symposium on Cloud Computing (SoCC), 2019 Jinming Lu Siyuan Lu Zhisheng Wang Chao Fang Jun Lin Zhongfeng Wang Li Du MQ 97 15 0 06 Sep 2019
Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks IEEE International Conference on Computer Vision (ICCV), 2019 Yazhe Niu Xianglong Liu Shenghu Jiang Tian-Hao Li Peng Hu Jiazhen Lin F. Yu Junjie Yan MQ 232 507 0 14 Aug 2019
Cheetah: Mixed Low-Precision Hardware & Software Co-Design Framework for DNNs on the Edge H. F. Langroudi Zachariah Carmichael David Pastuch Dhireesha Kudithipudi 135 24 0 06 Aug 2019
Scalable Multi Corpora Neural Language Models for ASR Interspeech (Interspeech), 2019 A. Raju Denis Filimonov Gautam Tiwari Guitang Lan Ariya Rastrow 98 26 0 02 Jul 2019
Visual Wake Words Dataset Aakanksha Chowdhery Pete Warden Jonathon Shlens Andrew G. Howard Rocky Rhodes VLM 152 111 0 12 Jun 2019
Table-Based Neural Units: Fully Quantizing Networks for Multiply-Free Inference Michele Covell David Marwood S. Baluja Nick Johnston MQ 110 7 0 11 Jun 2019
Data-Free Quantization Through Weight Equalization and Bias Correction IEEE International Conference on Computer Vision (ICCV), 2019 Markus Nagel M. V. Baalen Tijmen Blankevoort Max Welling MQ 275 580 0 11 Jun 2019
Fighting Quantization Bias With Bias Alexander Finkelstein Uri Almog Mark Grobman MQ 154 62 0 07 Jun 2019
Memory-Driven Mixed Low Precision Quantization For Enabling Deep Network Inference On Microcontrollers Conference on Machine Learning and Systems (MLSys), 2019 Manuele Rusci Alessandro Capotondi Luca Benini MQ 157 83 0 30 May 2019
Searching for MobileNetV3 IEEE International Conference on Computer Vision (ICCV), 2019 Andrew G. Howard Mark Sandler Grace Chu Liang-Chieh Chen Bo Chen ... Yukun Zhu Ruoming Pang Vijay Vasudevan Quoc V. Le Hartwig Adam 1.1K 8,168 0 06 May 2019
Full-stack Optimization for Accelerating CNNs with FPGA Validation Bradley McDanel Shanghang Zhang H. T. Kung Xin Dong MQ 74 2 0 01 May 2019
HAWQ: Hessian AWare Quantization of Neural Networks with Mixed-Precision IEEE International Conference on Computer Vision (ICCV), 2019 Zhen Dong Z. Yao A. Gholami Michael W. Mahoney Kurt Keutzer MQ 181 597 0 29 Apr 2019
Relay: A High-Level Compiler for Deep Learning Jared Roesch Steven Lyubomirsky Marisa Kirisame Logan Weber Josh Pollock Luis Vega Ziheng Jiang Tianqi Chen T. Moreau Zachary Tatlock 134 21 0 17 Apr 2019
Trained Quantization Thresholds for Accurate and Efficient Fixed-Point Inference of Deep Neural Networks Conference on Machine Learning and Systems (MLSys), 2019 Sambhav R. Jain Albert Gural Michael Wu Chris Dick MQ 292 163 0 19 Mar 2019
Learning low-precision neural networks without Straight-Through Estimator(STE) International Joint Conference on Artificial Intelligence (IJCAI), 2019 Z. G. Liu Matthew Mattina MQ 194 37 0 04 Mar 2019
AutoQ: Automated Kernel-Wise Neural Network Quantization Qian Lou Feng Guo Lantao Liu Minje Kim Lei Jiang MQ 234 109 0 15 Feb 2019
Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorization Eldad Meller Alexander Finkelstein Uri Almog Mark Grobman MQ 153 93 0 05 Feb 2019
Information-Theoretic Understanding of Population Risk Improvement with Model Compression Yuheng Bu Weihao Gao Shaofeng Zou Venugopal V. Veeravalli MedIm 96 18 0 27 Jan 2019
Efficient Winograd Convolution via Integer Arithmetic Lingchuan Meng J. Brothers 136 29 0 07 Jan 2019