Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

15 December 2017

Papers citing "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference"

Showing 50 of 1,255 citing papers

Boosted and Differentially Private Ensembles of Decision Trees Richard Nock Wilko Henecka 6 2 0 26 Jan 2020
The Two-Pass Softmax Algorithm Marat Dukhan Artsiom Ablavatski TPM 11 8 0 13 Jan 2020
Least squares binary quantization of neural networks Hadi Pouransari Zhucheng Tu Oncel Tuzel MQ 17 32 0 09 Jan 2020
Resource-Efficient Neural Networks for Embedded Systems Wolfgang Roth Günther Schindler Lukas Pfeifenberger Robert Peharz Sebastian Tschiatschek Holger Fröning Franz Pernkopf Zoubin Ghahramani 26 47 0 07 Jan 2020
Sparse Weight Activation Training Md Aamir Raihan Tor M. Aamodt 32 72 0 07 Jan 2020
Fractional Skipping: Towards Finer-Grained Dynamic CNN Inference Jianghao Shen Y. Fu Yue Wang Pengfei Xu Zhangyang Wang Yingyan Lin MQ 22 44 0 03 Jan 2020
Lightweight Residual Densely Connected Convolutional Neural Network Fahimeh Fooladgar S. Kasaei 14 13 0 02 Jan 2020
ZeroQ: A Novel Zero Shot Quantization Framework Yaohui Cai Z. Yao Zhen Dong A. Gholami Michael W. Mahoney Kurt Keutzer MQ 30 389 0 01 Jan 2020
Mixed-Precision Quantized Neural Network with Progressively Decreasing Bitwidth For Image Classification and Object Detection Tianshu Chu Qin Luo Jie-jin Yang Xiaolin Huang MQ 16 6 0 29 Dec 2019
Towards Unified INT8 Training for Convolutional Neural Network Feng Zhu Ruihao Gong F. Yu Xianglong Liu Yanfei Wang Zhelong Li Xiuqi Yang Junjie Yan MQ 27 150 0 29 Dec 2019
Towards Efficient Training for Neural Network Quantization Qing Jin Linjie Yang Zhenyu A. Liao MQ 11 42 0 21 Dec 2019
AdaBits: Neural Network Quantization with Adaptive Bit-Widths Qing Jin Linjie Yang Zhenyu A. Liao MQ 16 123 0 20 Dec 2019
Predicting detection filters for small footprint open-vocabulary keyword spotting Théodore Bluche Thibault Gisselbrecht ObjD 18 19 0 16 Dec 2019
The Knowledge Within: Methods for Data-Free Model Compression Matan Haroush Itay Hubara Elad Hoffer Daniel Soudry 20 105 0 03 Dec 2019
ReD-CaNe: A Systematic Methodology for Resilience Analysis and Design of Capsule Networks under Approximations Alberto Marchisio Vojtěch Mrázek Muhammad Abdullah Hanif Muhammad Shafique AAML 6 15 0 02 Dec 2019
Semi-Relaxed Quantization with DropBits: Training Low-Bit Neural Networks via Bit-wise Regularization J. H. Lee Jihun Yun Sung Ju Hwang Eunho Yang MQ 15 0 0 29 Nov 2019
GhostNet: More Features from Cheap Operations Kai Han Yunhe Wang Qi Tian Jianyuan Guo Chunjing Xu Chang Xu 20 2,575 0 27 Nov 2019
Structured Multi-Hashing for Model Compression Elad Eban Yair Movshovitz-Attias Hao Wu Mark Sandler Andrew Poon Yerlan Idelbayev M. A. Carreira-Perpiñán 9 18 0 25 Nov 2019
Quantization Networks Jiwei Yang Xu Shen Jun Xing Xinmei Tian Houqiang Li Bing Deng Jianqiang Huang Xiansheng Hua MQ 25 338 0 21 Nov 2019
REVAMP$^2$T: Real-time Edge Video Analytics for Multi-camera Privacy-aware Pedestrian Tracking Christopher Neff Matías Mendieta Shrey Mohan Mohammadreza Baharani Samuel Rogers Hamed Tabkhi 16 56 0 20 Nov 2019
CUP: Cluster Pruning for Compressing Deep Neural Networks Rahul Duggal Cao Xiao R. Vuduc Jimeng Sun 3DPC VLM 16 22 0 19 Nov 2019
Distributed Low Precision Training Without Mixed Precision Zehua Cheng Weiyan Wang Yan Pan Thomas Lukasiewicz MQ 18 5 0 18 Nov 2019
Selective sampling for accelerating training of deep neural networks Berry Weinstein Shai Fine Y. Hel-Or 11 3 0 16 Nov 2019
DupNet: Towards Very Tiny Quantized CNN with Improved Accuracy for Face Detection Hongxing Gao Wei Tao Dongchao Wen Junjie Liu Tse-Wei Chen Kinya Osa Masami Kato CVBM 19 5 0 13 Nov 2019
Knowledge Representing: Efficient, Sparse Representation of Prior Knowledge for Knowledge Distillation Junjie Liu Dongchao Wen Hongxing Gao Wei Tao Tse-Wei Chen Kinya Osa Masami Kato 22 21 0 13 Nov 2019
What Do Compressed Deep Neural Networks Forget? Sara Hooker Aaron Courville Gregory Clark Yann N. Dauphin Andrea Frome 17 181 0 13 Nov 2019
Scientific Image Restoration Anywhere V. Abeykoon Zhengchun Liu R. Kettimuthu Geoffrey C. Fox Ian T. Foster 19 19 0 12 Nov 2019
HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks Zhen Dong Z. Yao Yaohui Cai Daiyaan Arfeen A. Gholami Michael W. Mahoney Kurt Keutzer MQ 26 274 0 10 Nov 2019
Optimizing Deep Learning Inference on Embedded Systems Through Adaptive Model Selection Vicent Sanz Marco Ben Taylor Z. Wang Y. Elkhatib 22 60 0 09 Nov 2019
A Simplified Fully Quantized Transformer for End-to-end Speech Recognition Alex Bie Bharat Venkitesh João Monteiro Md. Akmal Haidar Mehdi Rezagholizadeh MQ 24 27 0 09 Nov 2019
On-Device Machine Learning: An Algorithms and Learning Theory Perspective Sauptik Dhar Junyao Guo Jiayi Liu S. Tripathi Unmesh Kurup Mohak Shah 17 141 0 02 Nov 2019
Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers Xishan Zhang Shaoli Liu Rui Zhang Chang-Shu Liu Di Huang ... Jiaming Guo Yu Kang Qi Guo Zidong Du Yunji Chen MQ 13 6 0 01 Nov 2019
On Distributed Quantization for Classification Osama A. Hanna Yahya H. Ezzeldin Tara Sadjadpour Christina Fragouli Suhas Diggavi MQ 14 14 0 01 Nov 2019
In-Place Zero-Space Memory Protection for CNN Hui Guan Lin Ning Zhen Lin Xipeng Shen Huiyang Zhou Seung-Hwan Lim 11 28 0 31 Oct 2019
Secure Evaluation of Quantized Neural Networks Anders Dalskov Daniel E. Escudero Marcel Keller 12 137 0 28 Oct 2019
Neural Network Distiller: A Python Package For DNN Compression Research Neta Zmora Guy Jacob Lev Zlotnik Bar Elharar Gal Novik 17 73 0 27 Oct 2019
Reversible designs for extreme memory cost reduction of CNN training T. Hascoet Q. Febvre Y. Ariki T. Takiguchi 3DV 6 2 0 24 Oct 2019
Fully Quantized Transformer for Machine Translation Gabriele Prato Ella Charlaix Mehdi Rezagholizadeh MQ 13 68 0 17 Oct 2019
Neural Network Design for Energy-Autonomous AI Applications using Temporal Encoding S. Mileiko Thanasin Bunnam F. Xia R. Shafik Alex Yakovlev Shidhartha Das 14 0 0 15 Oct 2019
AI Benchmark: All About Deep Learning on Smartphones in 2019 Andrey D. Ignatov Radu Timofte Andrei Kulik Seungsoo Yang Ke Wang Felix Baum Max Wu Lirong Xu Luc Van Gool ELM 13 218 0 15 Oct 2019
Q8BERT: Quantized 8Bit BERT Ofir Zafrir Guy Boudoukh Peter Izsak Moshe Wasserblat MQ 9 500 0 14 Oct 2019
Automatic Neural Network Compression by Sparsity-Quantization Joint Learning: A Constrained Optimization-based Approach Haichuan Yang Shupeng Gui Yuhao Zhu Ji Liu MQ 20 5 0 14 Oct 2019
OverQ: Opportunistic Outlier Quantization for Neural Network Accelerators Ritchie Zhao Jordan Dotzel Zhanqiu Hu Preslav Ivanov Christopher De Sa Zhiru Zhang MQ 22 1 0 13 Oct 2019
EDEN: Enabling Energy-Efficient, High-Performance Deep Neural Network Inference Using Approximate DRAM Skanda Koppula Lois Orosa A. G. Yaglikçi Roknoddin Azizi Taha Shahroodi Konstantinos Kanellopoulos O. Mutlu 19 105 0 12 Oct 2019
QPyTorch: A Low-Precision Arithmetic Simulation Framework Tianyi Zhang Zhiqiu Lin Guandao Yang Christopher De Sa MQ 21 64 0 09 Oct 2019
Bit Efficient Quantization for Deep Neural Networks Prateeth Nayak David C. Zhang S. Chai MQ 25 43 0 07 Oct 2019
Neural networks on microcontrollers: saving memory at inference via operator reordering Edgar Liberis Nicholas D. Lane 11 46 0 02 Oct 2019
NGEMM: Optimizing GEMM for Deep Learning via Compiler-based Techniques Wenlei Bao Li-Wen Chang Yang Chen Kefeng Deng Amit Agarwal Emad Barsoum Abe Taha MQ 11 7 0 01 Oct 2019
Automated design of error-resilient and hardware-efficient deep neural networks Christoph Schorn T. Elsken Sebastian Vogel Armin Runge A. Guntoro G. Ascheid AAML 17 32 0 30 Sep 2019
AdaptivFloat: A Floating-point based Data Type for Resilient Deep Learning Inference Thierry Tambe En-Yu Yang Zishen Wan Yuntian Deng Vijay Janapa Reddi Alexander M. Rush David Brooks Gu-Yeon Wei MQ 11 21 0 29 Sep 2019