ResearchTrend.AI

Quantizing deep convolutional networks for efficient inference: A whitepaper [MQ]
Raghuraman Krishnamoorthi
21 June 2018


Papers citing "Quantizing deep convolutional networks for efficient inference: A whitepaper"

Showing 50 of 464 citing papers.

Hybrid and Non-Uniform quantization methods using retro synthesis data for efficient inference [MQ]
Gvsl Tej Pratap, R. Kumar
26 Dec 2020

Adaptive Precision Training for Resource Constrained Devices
Tian Huang, Tao Luo, Joey Tianyi Zhou
23 Dec 2020

DAQ: Channel-Wise Distribution-Aware Quantization for Deep Image Super-Resolution Networks [OOD, SupR, MQ]
Chee Hong, Heewon Kim, Sungyong Baik, Junghun Oh, Kyoung Mu Lee
21 Dec 2020

Layer Pruning via Fusible Residual Convolutional Block for Deep Neural Networks [3DPC]
Pengtao Xu, Jian Cao, Fanhua Shang, Wenyu Sun, Pu Li
29 Nov 2020

HAWQV3: Dyadic Neural Network Quantization [MQ]
Z. Yao, Zhen Dong, Zhangcheng Zheng, A. Gholami, Jiali Yu, ..., Leyuan Wang, Qijing Huang, Yida Wang, Michael W. Mahoney, Kurt Keutzer
20 Nov 2020

MixMix: All You Need for Data-Free Compression Are Feature and Data Mixing [MQ]
Yuhang Li, Feng Zhu, Ruihao Gong, Mingzhu Shen, Xin Dong, F. Yu, Shaoqing Lu, Shi Gu
19 Nov 2020

Layer-Wise Data-Free CNN Compression [MQ]
Maxwell Horton, Yanzi Jin, Ali Farhadi, Mohammad Rastegari
18 Nov 2020

Subtensor Quantization for Mobilenets [MQ]
Thu Dinh, A. Melnikov, Vasilios Daskalopoulos, S. Chai
04 Nov 2020

Methods for Pruning Deep Neural Networks [3DPC]
S. Vadera, Salem Ameen
31 Oct 2020

INT8 Winograd Acceleration for Conv1D Equipped ASR Models Deployed on Mobile Devices [MQ]
Yiwu Yao, Yuchao Li, Chengyu Wang, Tianhang Yu, Houjiang Chen, ..., Jun Yang, Jun Huang, Wei Lin, Hui Shu, Chengfei Lv
28 Oct 2020

TensorFlow Lite Micro: Embedded Machine Learning on TinyML Systems
R. David, Jared Duke, Advait Jain, Vijay Janapa Reddi, Nat Jeffries, ..., Meghna Natraj, Shlomi Regev, Rocky Rhodes, Tiezhen Wang, Pete Warden
17 Oct 2020

Post-Training BatchNorm Recalibration
Gil Shomron, U. Weiser
12 Oct 2020

Compressing Deep Convolutional Neural Networks by Stacking Low-dimensional Binary Convolution Filters [MQ]
Weichao Lan, Liang Lan
06 Oct 2020

DeepDyve: Dynamic Verification for Deep Neural Networks [AAML]
Yu Li, Min Li, Bo Luo, Ye Tian, Qiang Xu
21 Sep 2020

An FPGA Accelerated Method for Training Feed-forward Neural Networks Using Alternating Direction Method of Multipliers and LSMR
Seyedeh Niusha Alavi Foumani, Ce Guo, Wayne Luk
06 Sep 2020

Optimal Quantization for Batch Normalization in Neural Network Deployments and Beyond [MQ]
Dachao Lin, Peiqin Sun, Guangzeng Xie, Shuchang Zhou, Zhihua Zhang
30 Aug 2020

Channel-wise Hessian Aware trace-Weighted Quantization of Neural Networks [MQ]
Xu Qian, Victor Li, Darren Crews
19 Aug 2020

Weight Equalizing Shift Scaler-Coupled Post-training Quantization [MQ]
Jihun Oh, Sangjeong Lee, Meejeong Park, Pooni Walagaurav, K. Kwon
13 Aug 2020

Leveraging Automated Mixed-Low-Precision Quantization for tiny edge microcontrollers [MQ]
Manuele Rusci, Marco Fariselli, Alessandro Capotondi, Luca Benini
12 Aug 2020

Degree-Quant: Quantization-Aware Training for Graph Neural Networks [GNN, MQ]
Shyam A. Tailor, Javier Fernandez-Marques, Nicholas D. Lane
11 Aug 2020

Hardware-Centric AutoML for Mixed-Precision Quantization [MQ]
Kuan-Chieh Jackson Wang, Zhijian Liu, Yujun Lin, Ji Lin, Song Han
11 Aug 2020

PROFIT: A Novel Training Method for sub-4-bit MobileNet Models [MQ]
Eunhyeok Park, S. Yoo
11 Aug 2020

Differentiable Joint Pruning and Quantization for Hardware Efficiency [MQ]
Ying Wang, Yadong Lu, Tijmen Blankevoort
20 Jul 2020

FracBits: Mixed Precision Quantization via Fractional Bit-Widths [MQ]
Linjie Yang, Qing Jin
04 Jul 2020

Hardware Acceleration of Sparse and Irregular Tensor Computations of ML Models: A Survey and Insights
Shail Dave, Riyadh Baghdadi, Tony Nowatzki, Sasikanth Avancha, Aviral Shrivastava, Baoxin Li
02 Jul 2020

EasyQuant: Post-training Quantization via Scale Optimization [MQ]
Di Wu, Qingming Tang, Yongle Zhao, Ming Zhang, Ying Fu, Debing Zhang
30 Jun 2020

Bit Error Robustness for Energy-Efficient DNN Accelerators [MQ]
David Stutz, Nandhini Chandramoorthy, Matthias Hein, Bernt Schiele
24 Jun 2020

Efficient Execution of Quantized Deep Learning Models: A Compiler Approach [MQ]
Animesh Jain, Shoubhik Bhattacharya, Masahiro Masuda, Vin Sharma, Yida Wang
18 Jun 2020

FrostNet: Towards Quantization-Aware Network Architecture Search [MQ]
Taehoon Kim, Y. Yoo, Jihoon Yang
17 Jun 2020

Quantization of Acoustic Model Parameters in Automatic Speech Recognition Framework [MQ]
Amrutha Prasad, P. Motlícek, S. Madikeri
16 Jun 2020

Neural gradients are near-lognormal: improved quantized and sparse training [MQ]
Brian Chmiel, Liad Ben-Uri, Moran Shkolnik, Elad Hoffer, Ron Banner, Daniel Soudry
15 Jun 2020

Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming [MQ]
Itay Hubara, Yury Nahshan, Y. Hanani, Ron Banner, Daniel Soudry
14 Jun 2020

CoDeNet: Efficient Deployment of Input-Adaptive Object Detection on Embedded FPGAs [ObjD]
Zhen Dong, Dequan Wang, Qijing Huang, Yizhao Gao, Yaohui Cai, Tian Li, Bichen Wu, Kurt Keutzer, J. Wawrzynek
12 Jun 2020

Automated Design Space Exploration for optimised Deployment of DNN on Arm Cortex-A CPUs
Miguel de Prado, Andrew Mundy, Rabia Saeed, Maurizo Denna, Nuria Pazos, Luca Benini
09 Jun 2020

Generative Design of Hardware-aware DNNs [MQ]
Sheng-Chun Kao, Arun Ramamurthy, T. Krishna
06 Jun 2020

An Overview of Neural Network Compression [AI4CE]
James O'Neill
05 Jun 2020

Accelerating Neural Network Inference by Overflow Aware Quantization [MQ]
Hongwei Xie, Shuo Zhang, Huanghao Ding, Yafei Song, Baitao Shao, Conggang Hu, Lingyi Cai, Mingyang Li
27 May 2020

Position-based Scaled Gradient for Model Quantization and Pruning [MQ]
Jangho Kim, Kiyoon Yoo, Nojun Kwak
22 May 2020

VecQ: Minimal Loss DNN Model Compression With Vectorized Weight Quantization [MQ]
Cheng Gong, Yao Chen, Ye Lu, Tao Li, Cong Hao, Deming Chen
18 May 2020

Bayesian Bits: Unifying Quantization and Pruning [MQ]
M. V. Baalen, Christos Louizos, Markus Nagel, Rana Ali Amjad, Ying Wang, Tijmen Blankevoort, Max Welling
14 May 2020

Data-Free Network Quantization With Adversarial Knowledge Distillation [MQ]
Yoojin Choi, Jihwan P. Choi, Mostafa El-Khamy, Jungwon Lee
08 May 2020

Compact retail shelf segmentation for mobile deployment
Pratyush Kumar, Muktabh Mayank Srivastava
27 Apr 2020

Lite Transformer with Long-Short Range Attention
Zhanghao Wu, Zhijian Liu, Ji Lin, Yujun Lin, Song Han
24 Apr 2020

Up or Down? Adaptive Rounding for Post-Training Quantization [MQ]
Markus Nagel, Rana Ali Amjad, M. V. Baalen, Christos Louizos, Tijmen Blankevoort
22 Apr 2020

A Data and Compute Efficient Design for Limited-Resources Deep Learning [MedIm]
Mirgahney Mohamed, Gabriele Cesa, Taco S. Cohen, Max Welling
21 Apr 2020

Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation [MQ]
Hao Wu, Patrick Judd, Xiaojie Zhang, Mikhail Isaev, Paulius Micikevicius
20 Apr 2020

LSQ+: Improving low-bit quantization through learnable offsets and better initialization [MQ]
Yash Bhalgat, Jinwon Lee, Markus Nagel, Tijmen Blankevoort, Nojun Kwak
20 Apr 2020

Non-Blocking Simultaneous Multithreading: Embracing the Resiliency of Deep Neural Networks
Gil Shomron, U. Weiser
17 Apr 2020

Training with Quantization Noise for Extreme Model Compression [MQ]
Angela Fan, Pierre Stock, Benjamin Graham, Edouard Grave, Remi Gribonval, Hervé Jégou, Armand Joulin
15 Apr 2020

CNN2Gate: Toward Designing a General Framework for Implementation of Convolutional Neural Networks on FPGA
Alireza Ghaffari, Yvon Savaria
06 Apr 2020