Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

15 December 2017

Papers citing "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference"

50 / 1,255 papers shown

Training convolutional neural networks with cheap convolutions and online distillation Jiao Xie Shaohui Lin Yichen Zhang Linkai Luo 19 12 0 28 Sep 2019
Optimizing Speech Recognition For The Edge Yuan Shangguan Jian Li Qiao Liang R. Álvarez Ian McGraw 20 64 0 26 Sep 2019
Balanced Binary Neural Networks with Gated Residual Mingzhu Shen Xianglong Liu Ruihao Gong Kai Han MQ 9 33 0 26 Sep 2019
Structured Binary Neural Networks for Image Recognition Bohan Zhuang Chunhua Shen Mingkui Tan Peng Chen Lingqiao Liu Ian Reid MQ 22 17 0 22 Sep 2019
Density Encoding Enables Resource-Efficient Randomly Connected Neural Networks Denis Kleyko Mansour Kheffache E. P. Frady U. Wiklund Evgeny Osipov 19 45 0 19 Sep 2019
CrypTFlow: Secure TensorFlow Inference Nishant Kumar Mayank Rathee Nishanth Chandran Divya Gupta Aseem Rastogi Rahul Sharma 96 235 0 16 Sep 2019
Neural Machine Translation with 4-Bit Precision and Beyond Alham Fikri Aji Kenneth Heafield MQ 8 7 0 13 Sep 2019
Differentiable Mask for Pruning Convolutional and Recurrent Networks R. Ramakrishnan Eyyub Sari V. Nia VLM 32 15 0 10 Sep 2019
PULP-NN: Accelerating Quantized Neural Networks on Parallel Ultra-Low-Power RISC-V Processors Angelo Garofalo Manuele Rusci Francesco Conti D. Rossi Luca Benini MQ 6 134 0 29 Aug 2019
Real-time Person Re-identification at the Edge: A Mixed Precision Approach Mohammadreza Baharani Shrey Mohan Hamed Tabkhi 24 10 0 19 Aug 2019
Adaptative Inference Cost With Convolutional Neural Mixture Models Adria Ruiz Jakob Verbeek VLM 24 22 0 19 Aug 2019
Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks Ruihao Gong Xianglong Liu Shenghu Jiang Tian-Hao Li Peng Hu Jiazhen Lin F. Yu Junjie Yan MQ 21 445 0 14 Aug 2019
Effective Training of Convolutional Neural Networks with Low-bitwidth Weights and Activations Bohan Zhuang Jing Liu Mingkui Tan Lingqiao Liu Ian Reid Chunhua Shen MQ 26 44 0 10 Aug 2019
Cheetah: Mixed Low-Precision Hardware & Software Co-Design Framework for DNNs on the Edge H. F. Langroudi Zachariah Carmichael David Pastuch Dhireesha Kudithipudi 14 24 0 06 Aug 2019
Deep Learning Training on the Edge with Low-Precision Posits H. F. Langroudi Zachariah Carmichael Dhireesha Kudithipudi MQ 16 14 0 30 Jul 2019
Similarity-Preserving Knowledge Distillation Frederick Tung Greg Mori 39 957 0 23 Jul 2019
Batch-Shaping for Learning Conditional Channel Gated Networks B. Bejnordi Tijmen Blankevoort Max Welling AI4CE 20 76 0 15 Jul 2019
Neural Epitome Search for Architecture-Agnostic Network Compression Daquan Zhou Xiaojie Jin Qibin Hou Kaixin Wang Jianchao Yang Jiashi Feng 21 13 0 12 Jul 2019
Template-Based Posit Multiplication for Training and Inferring in Neural Networks Raul Murillo Alberto A. Del Barrio Guillermo Botella Juan 11 16 0 09 Jul 2019
Data-Independent Neural Pruning via Coresets Ben Mussay Margarita Osadchy Vladimir Braverman Samson Zhou Dan Feldman 6 60 0 09 Jul 2019
QUOTIENT: Two-Party Secure Neural Network Training and Prediction Nitin Agrawal Ali Shahin Shamsabadi Matt J. Kusner Adria Gascon 22 212 0 08 Jul 2019
Weight Normalization based Quantization for Deep Neural Network Compression Wenhong Cai Wu-Jun Li 16 14 0 01 Jul 2019
GAN-Knowledge Distillation for one-stage Object Detection Wanwei Wang Jin ke Yu Fan Zong ObjD 14 28 0 20 Jun 2019
A One-step Pruning-recovery Framework for Acceleration of Convolutional Neural Networks Dong Wang Lei Zhou Xiao Bai Jun Zhou 9 2 0 18 Jun 2019
Visual Wake Words Dataset Aakanksha Chowdhery Pete Warden Jonathon Shlens Andrew G. Howard Rocky Rhodes VLM 16 98 0 12 Jun 2019
Table-Based Neural Units: Fully Quantizing Networks for Multiply-Free Inference Michele Covell David Marwood S. Baluja Nick Johnston MQ 11 7 0 11 Jun 2019
Data-Free Quantization Through Weight Equalization and Bias Correction Markus Nagel M. V. Baalen Tijmen Blankevoort Max Welling MQ 19 499 0 11 Jun 2019
DiCENet: Dimension-wise Convolutions for Efficient Networks Sachin Mehta Hannaneh Hajishirzi Mohammad Rastegari 27 43 0 08 Jun 2019
Fighting Quantization Bias With Bias Alexander Finkelstein Uri Almog Mark Grobman MQ 14 56 0 07 Jun 2019
Addressing Limited Weight Resolution in a Fully Optical Neuromorphic Reservoir Computing Readout Chonghuai Ma Floris Laporte J. Dambre P. Bienstman 6 10 0 06 Jun 2019
DeepShift: Towards Multiplication-Less Neural Networks Mostafa Elhoushi Zihao Chen F. Shafiq Ye Tian Joey Yiwei Li MQ 33 97 0 30 May 2019
Memory-Driven Mixed Low Precision Quantization For Enabling Deep Network Inference On Microcontrollers Manuele Rusci Alessandro Capotondi Luca Benini MQ 17 74 0 30 May 2019
RecNets: Channel-wise Recurrent Convolutional Neural Networks George Retsinas Athena Elafrou G. Goumas Petros Maragos 13 2 0 28 May 2019
CompactNet: Platform-Aware Automatic Optimization for Convolutional Neural Networks Weicheng Li Rui Wang Zhongzhi Luan Di Huang Zidong Du Yunji Chen D. Qian 12 1 0 28 May 2019
OICSR: Out-In-Channel Sparsity Regularization for Compact Deep Neural Networks Jiashi Li Q. Qi Jingyu Wang Ce Ge Yujian Betterest Li Zhangzhang Yue Haifeng Sun BDL CML 19 53 0 28 May 2019
Seeing Convolution Through the Eyes of Finite Transformation Semigroup Theory: An Abstract Algebraic Interpretation of Convolutional Neural Networks Andrew Hryniowski A. Wong 6 0 0 26 May 2019
Feature Map Transform Coding for Energy-Efficient CNN Inference Brian Chmiel Chaim Baskin Ron Banner Evgenii Zheltonozhskii Yevgeny Yermolin Alex Karbachevsky A. Bronstein A. Mendelson 12 24 0 26 May 2019
EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis Chaoqi Wang Roger C. Grosse Sanja Fidler Guodong Zhang 21 121 0 15 May 2019
EdgeSegNet: A Compact Network for Semantic Segmentation Z. Q. Lin Brendan Chwyl A. Wong SSeg 17 9 0 10 May 2019
Seesaw-Net: Convolution Neural Network With Uneven Group Convolution Jintao Zhang BDL 20 7 0 09 May 2019
Searching for MobileNetV3 Andrew G. Howard Mark Sandler Grace Chu Liang-Chieh Chen Bo Chen ... Yukun Zhu Ruoming Pang Vijay Vasudevan Quoc V. Le Hartwig Adam 41 6,600 0 06 May 2019
Creating Lightweight Object Detectors with Model Compression for Deployment on Edge Devices Yiwu Yao Weiqiang Yang Haoqi Zhu 21 0 0 06 May 2019
Parity Models: A General Framework for Coding-Based Resilience in ML Inference J. Kosaian K. V. Rashmi Shivaram Venkataraman 6 14 0 02 May 2019
Full-stack Optimization for Accelerating CNNs with FPGA Validation Bradley McDanel S. Zhang H. T. Kung Xin Dong MQ 14 2 0 01 May 2019
HAWQ: Hessian AWare Quantization of Neural Networks with Mixed-Precision Zhen Dong Z. Yao A. Gholami Michael W. Mahoney Kurt Keutzer MQ 19 513 0 29 Apr 2019
Towards Efficient Model Compression via Learned Global Ranking Ting-Wu Chin Ruizhou Ding Cha Zhang Diana Marculescu 16 170 0 28 Apr 2019
Towards Learning of Filter-Level Heterogeneous Compression of Convolutional Neural Networks Y. Zur Chaim Baskin Evgenii Zheltonozhskii Brian Chmiel Itay Evron A. Bronstein A. Mendelson MQ 26 7 0 22 Apr 2019
Defensive Quantization: When Efficiency Meets Robustness Ji Lin Chuang Gan Song Han MQ 34 201 0 17 Apr 2019
Towards Real-Time Automatic Portrait Matting on Mobile Devices Seokjun Seo Seungwoo Choi Martin Kersner Beomjun Shin Hyungsuk Yoon Hyeongmin Byun S. Ha 3DH 6 3 0 08 Apr 2019
Progressive Stochastic Binarization of Deep Networks David Hartmann Michael Wand MQ 12 1 0 03 Apr 2019