Quantizing deep convolutional networks for efficient inference: A whitepaper

21 June 2018

Papers citing "Quantizing deep convolutional networks for efficient inference: A whitepaper"

50 / 464 papers shown

Title
Binary Neural Networks: A Survey Haotong Qin Ruihao Gong Xianglong Liu Xiao Bai Jingkuan Song N. Sebe MQ 50 457 0 31 Mar 2020
AI on the Edge: Rethinking AI-based IoT Applications Using Specialized Edge Architectures Qianlin Liang Prashant J. Shenoy David E. Irwin 9 17 0 27 Mar 2020
Compiling Neural Networks for a Computational Memory Accelerator K. Kourtis M. Dazzi Nikolas Ioannou Tobias Grosser A. Sebastian E. Eleftheriou 12 5 0 05 Mar 2020
Searching for Winograd-aware Quantized Networks Javier Fernandez-Marques P. Whatmough Andrew Mundy Matthew Mattina MQ 11 40 0 25 Feb 2020
Post-training Quantization with Multiple Points: Mixed Precision without Mixed Precision Xingchao Liu Mao Ye Dengyong Zhou Qiang Liu MQ 8 42 0 20 Feb 2020
Neural Network Compression Framework for fast model inference Alexander Kozlov Ivan Lazarevich Vasily Shamporov N. Lyalyushkin Yury Gorbachev 23 35 0 20 Feb 2020
SYMOG: learning symmetric mixture of Gaussian modes for improved fixed-point quantization Lukas Enderich Fabian Timm Wolfram Burgard MQ 14 6 0 19 Feb 2020
Robust Quantization: One Model to Rule Them All Moran Shkolnik Brian Chmiel Ron Banner Gil Shomron Yury Nahshan A. Bronstein U. Weiser OOD MQ 14 75 0 18 Feb 2020
Gradient $\ell_1$ Regularization for Quantization Robustness Milad Alizadeh Arash Behboodi M. V. Baalen Christos Louizos Tijmen Blankevoort Max Welling MQ 12 8 0 18 Feb 2020
Post-Training Piecewise Linear Quantization for Deep Neural Networks Jun Fang Ali Shafiee Hamzah Abdel-Aziz D. Thorsley Georgios Georgiadis Joseph Hassoun MQ 12 144 0 31 Jan 2020
Quantisation and Pruning for Neural Network Compression and Regularisation Kimessha Paupamah Steven D. James Richard Klein 9 23 0 14 Jan 2020
ZeroQ: A Novel Zero Shot Quantization Framework Yaohui Cai Z. Yao Zhen Dong A. Gholami Michael W. Mahoney Kurt Keutzer MQ 30 389 0 01 Jan 2020
Towards Unified INT8 Training for Convolutional Neural Network Feng Zhu Ruihao Gong F. Yu Xianglong Liu Yanfei Wang Zhelong Li Xiuqi Yang Junjie Yan MQ 27 151 0 29 Dec 2019
Towards Efficient Training for Neural Network Quantization Qing Jin Linjie Yang Zhenyu A. Liao MQ 11 42 0 21 Dec 2019
Taxonomy and Evaluation of Structured Compression of Convolutional Neural Networks Andrey Kuzmin Markus Nagel Saurabh Pitre Sandeep Pendyam Tijmen Blankevoort Max Welling 9 27 0 20 Dec 2019
Learned Variable-Rate Image Compression with Residual Divisive Normalization Mohammad Akbari Jie Liang Jingning Han Chengjie Tu 19 25 0 11 Dec 2019
The Knowledge Within: Methods for Data-Free Model Compression Matan Haroush Itay Hubara Elad Hoffer Daniel Soudry 18 105 0 03 Dec 2019
QKD: Quantization-aware Knowledge Distillation Jangho Kim Yash Bhalgat Jinwon Lee Chirag I. Patel Nojun Kwak MQ 16 63 0 28 Nov 2019
Loss Aware Post-training Quantization Yury Nahshan Brian Chmiel Chaim Baskin Evgenii Zheltonozhskii Ron Banner A. Bronstein A. Mendelson MQ 26 163 0 17 Nov 2019
Scientific Image Restoration Anywhere V. Abeykoon Zhengchun Liu R. Kettimuthu Geoffrey C. Fox Ian T. Foster 16 19 0 12 Nov 2019
HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks Zhen Dong Z. Yao Yaohui Cai Daiyaan Arfeen A. Gholami Michael W. Mahoney Kurt Keutzer MQ 26 274 0 10 Nov 2019
Post-Training 4-bit Quantization on Embedding Tables Hui Guan Andrey Malevich Jiyan Yang Jongsoo Park Hector Yuen MQ 11 32 0 05 Nov 2019
Training DNN IoT Applications for Deployment On Analog NVM Crossbars F. García-Redondo Shidhartha Das G. Rosendale 17 5 0 30 Oct 2019
Secure Evaluation of Quantized Neural Networks Anders Dalskov Daniel E. Escudero Marcel Keller 12 137 0 28 Oct 2019
Neural Network Distiller: A Python Package For DNN Compression Research Neta Zmora Guy Jacob Lev Zlotnik Bar Elharar Gal Novik 17 73 0 27 Oct 2019
Deep Learning at the Edge Sahar Voghoei N. Tonekaboni Jason G. Wallace H. Arabnia 11 41 0 22 Oct 2019
Automatic Generation of Multi-precision Multi-arithmetic CNN Accelerators for FPGAs Yiren Zhao Xitong Gao Xuan Guo Junyi Liu Erwei Wang Robert D. Mullins P. Cheung G. Constantinides Chengzhong Xu MQ 19 31 0 21 Oct 2019
AI Benchmark: All About Deep Learning on Smartphones in 2019 Andrey D. Ignatov Radu Timofte Andrei Kulik Seungsoo Yang Ke Wang Felix Baum Max Wu Lirong Xu Luc Van Gool ELM 13 218 0 15 Oct 2019
Bit Efficient Quantization for Deep Neural Networks Prateeth Nayak David C. Zhang S. Chai MQ 25 43 0 07 Oct 2019
QuaRL: Quantization for Fast and Environmentally Sustainable Reinforcement Learning Srivatsan Krishnan Maximilian Lam Sharad Chitlangia Zishen Wan Gabriel Barth-Maron Aleksandra Faust Vijay Janapa Reddi MQ 21 22 0 02 Oct 2019
NGEMM: Optimizing GEMM for Deep Learning via Compiler-based Techniques Wenlei Bao Li-Wen Chang Yang Chen Kefeng Deng Amit Agarwal Emad Barsoum Abe Taha MQ 11 7 0 01 Oct 2019
Learning to Seek: Autonomous Source Seeking with Deep Reinforcement Learning Onboard a Nano Drone Microcontroller Bardienus P. Duisterhof Srivatsan Krishnan Jonathan J. Cruz Colby R. Banbury William Fu Aleksandra Faust Guido de Croon Vijay Janapa Reddi 18 25 0 25 Sep 2019
Training Deep Neural Networks Using Posit Number System Jinming Lu Siyuan Lu Zhisheng Wang Chao Fang Jun Lin Zhongfeng Wang Li Du MQ 19 13 0 06 Sep 2019
Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks Ruihao Gong Xianglong Liu Shenghu Jiang Tian-Hao Li Peng Hu Jiazhen Lin F. Yu Junjie Yan MQ 21 445 0 14 Aug 2019
Cheetah: Mixed Low-Precision Hardware & Software Co-Design Framework for DNNs on the Edge H. F. Langroudi Zachariah Carmichael David Pastuch Dhireesha Kudithipudi 14 24 0 06 Aug 2019
Scalable Multi Corpora Neural Language Models for ASR A. Raju Denis Filimonov Gautam Tiwari Guitang Lan Ariya Rastrow 11 26 0 02 Jul 2019
Visual Wake Words Dataset Aakanksha Chowdhery Pete Warden Jonathon Shlens Andrew G. Howard Rocky Rhodes VLM 16 98 0 12 Jun 2019
Table-Based Neural Units: Fully Quantizing Networks for Multiply-Free Inference Michele Covell David Marwood S. Baluja Nick Johnston MQ 11 7 0 11 Jun 2019
Data-Free Quantization Through Weight Equalization and Bias Correction Markus Nagel M. V. Baalen Tijmen Blankevoort Max Welling MQ 19 499 0 11 Jun 2019
Fighting Quantization Bias With Bias Alexander Finkelstein Uri Almog Mark Grobman MQ 14 56 0 07 Jun 2019
Memory-Driven Mixed Low Precision Quantization For Enabling Deep Network Inference On Microcontrollers Manuele Rusci Alessandro Capotondi Luca Benini MQ 17 74 0 30 May 2019
Searching for MobileNetV3 Andrew G. Howard Mark Sandler Grace Chu Liang-Chieh Chen Bo Chen ... Yukun Zhu Ruoming Pang Vijay Vasudevan Quoc V. Le Hartwig Adam 41 6,600 0 06 May 2019
Full-stack Optimization for Accelerating CNNs with FPGA Validation Bradley McDanel S. Zhang H. T. Kung Xin Dong MQ 14 2 0 01 May 2019
HAWQ: Hessian AWare Quantization of Neural Networks with Mixed-Precision Zhen Dong Z. Yao A. Gholami Michael W. Mahoney Kurt Keutzer MQ 19 513 0 29 Apr 2019
Relay: A High-Level Compiler for Deep Learning Jared Roesch Steven Lyubomirsky Marisa Kirisame Logan Weber Josh Pollock Luis Vega Ziheng Jiang Tianqi Chen T. Moreau Zachary Tatlock 20 21 0 17 Apr 2019
Trained Quantization Thresholds for Accurate and Efficient Fixed-Point Inference of Deep Neural Networks Sambhav R. Jain Albert Gural Michael Wu Chris Dick MQ 13 147 0 19 Mar 2019
Learning low-precision neural networks without Straight-Through Estimator(STE) Z. G. Liu Matthew Mattina MQ 19 34 0 04 Mar 2019
AutoQ: Automated Kernel-Wise Neural Network Quantization Qian Lou Feng Guo Lantao Liu Minje Kim Lei Jiang MQ 16 97 0 15 Feb 2019
Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorization Eldad Meller Alexander Finkelstein Uri Almog Mark Grobman MQ 16 85 0 05 Feb 2019
Information-Theoretic Understanding of Population Risk Improvement with Model Compression Yuheng Bu Weihao Gao Shaofeng Zou V. Veeravalli MedIm 8 15 0 27 Jan 2019