Quantizing deep convolutional networks for efficient inference: A whitepaper

21 June 2018
Raghuraman Krishnamoorthi
    MQ

Papers citing "Quantizing deep convolutional networks for efficient inference: A whitepaper"

50 / 513 papers shown
Title
AI on the Edge: Rethinking AI-based IoT Applications Using Specialized Edge Architectures
Qianlin Liang
Prashant J. Shenoy
David Irwin
126
21
0
27 Mar 2020
Compiling Neural Networks for a Computational Memory Accelerator
K. Kourtis
M. Dazzi
Nikolas Ioannou
Tobias Grosser
Abu Sebastian
E. Eleftheriou
103
5
0
05 Mar 2020
Searching for Winograd-aware Quantized Networks
Conference on Machine Learning and Systems (MLSys), 2020
Javier Fernandez-Marques
P. Whatmough
Andrew Mundy
Matthew Mattina
MQ
116
40
0
25 Feb 2020
Post-training Quantization with Multiple Points: Mixed Precision without Mixed Precision
AAAI Conference on Artificial Intelligence (AAAI), 2020
Xingchao Liu
Mao Ye
Dengyong Zhou
Qiang Liu
MQ
239
51
0
20 Feb 2020
Neural Network Compression Framework for fast model inference
Alexander Kozlov
Ivan Lazarevich
Vasily Shamporov
N. Lyalyushkin
Yury Gorbachev
253
38
0
20 Feb 2020
SYMOG: learning symmetric mixture of Gaussian modes for improved fixed-point quantization
Neurocomputing (Neurocomputing), 2020
Lukas Enderich
Fabian Timm
Wolfram Burgard
MQ
88
6
0
19 Feb 2020
Robust Quantization: One Model to Rule Them All
Neural Information Processing Systems (NeurIPS), 2020
Moran Shkolnik
Brian Chmiel
Ron Banner
Gil Shomron
Yury Nahshan
A. Bronstein
U. Weiser
OOD
MQ
187
88
0
18 Feb 2020
Gradient $\ell_1$ Regularization for Quantization Robustness
International Conference on Learning Representations (ICLR), 2020
Milad Alizadeh
Arash Behboodi
M. V. Baalen
Christos Louizos
Tijmen Blankevoort
Max Welling
MQ
148
8
0
18 Feb 2020
Post-Training Piecewise Linear Quantization for Deep Neural Networks
European Conference on Computer Vision (ECCV), 2020
Jun Fang
Ali Shafiee
Hamzah Abdel-Aziz
D. Thorsley
Georgios Georgiadis
Joseph Hassoun
MQ
341
170
0
31 Jan 2020
Quantisation and Pruning for Neural Network Compression and Regularisation
Kimessha Paupamah
Steven D. James
Richard Klein
75
25
0
14 Jan 2020
ZeroQ: A Novel Zero Shot Quantization Framework
Computer Vision and Pattern Recognition (CVPR), 2020
Yaohui Cai
Z. Yao
Zhen Dong
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
221
452
0
01 Jan 2020
Towards Unified INT8 Training for Convolutional Neural Network
Computer Vision and Pattern Recognition (CVPR), 2019
Feng Zhu
Yazhe Niu
F. Yu
Xianglong Liu
Yanfei Wang
Zhelong Li
Xiuqi Yang
Junjie Yan
MQ
203
172
0
29 Dec 2019
Towards Efficient Training for Neural Network Quantization
Qing Jin
Linjie Yang
Zhenyu A. Liao
MQ
223
42
0
21 Dec 2019
Taxonomy and Evaluation of Structured Compression of Convolutional Neural Networks
Andrey Kuzmin
Markus Nagel
Saurabh Pitre
Sandeep Pendyam
Tijmen Blankevoort
Max Welling
113
27
0
20 Dec 2019
Learned Variable-Rate Image Compression with Residual Divisive Normalization
IEEE International Conference on Multimedia and Expo (ICME), 2019
Mohammad Akbari
Jie Liang
Jingning Han
Chengjie Tu
106
26
0
11 Dec 2019
The Knowledge Within: Methods for Data-Free Model Compression
Computer Vision and Pattern Recognition (CVPR), 2019
Matan Haroush
Itay Hubara
Elad Hoffer
Daniel Soudry
198
113
0
03 Dec 2019
QKD: Quantization-aware Knowledge Distillation
Jangho Kim
Brandon Smart
Jinwon Lee
Chirag I. Patel
Nojun Kwak
MQ
193
72
0
28 Nov 2019
Loss Aware Post-training Quantization
Machine-mediated learning (ML), 2019
Yury Nahshan
Brian Chmiel
Chaim Baskin
Evgenii Zheltonozhskii
Ron Banner
A. Bronstein
A. Mendelson
MQ
297
185
0
17 Nov 2019
Scientific Image Restoration Anywhere
Annual Workshop on Large-scale Experiment-in-the-Loop Computing (ALEC), 2019
V. Abeykoon
Zhengchun Liu
R. Kettimuthu
Geoffrey C. Fox
Ian Foster
159
19
0
12 Nov 2019
HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks
Neural Information Processing Systems (NeurIPS), 2019
Zhen Dong
Z. Yao
Yaohui Cai
Daiyaan Arfeen
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
177
324
0
10 Nov 2019
Post-Training 4-bit Quantization on Embedding Tables
Hui Guan
Andrey Malevich
Jiyan Yang
Jongsoo Park
Hector Yuen
MQ
143
43
0
05 Nov 2019
Training DNN IoT Applications for Deployment On Analog NVM Crossbars
IEEE International Joint Conference on Neural Network (IJCNN), 2019
F. García-Redondo
Shidhartha Das
G. Rosendale
213
5
0
30 Oct 2019
Secure Evaluation of Quantized Neural Networks
IACR Cryptology ePrint Archive (IACR ePrint), 2019
Anders Dalskov
Daniel E. Escudero
Marcel Keller
288
148
0
28 Oct 2019
Neural Network Distiller: A Python Package For DNN Compression Research
Neta Zmora
Guy Jacob
Lev Zlotnik
Bar Elharar
Gal Novik
111
75
0
27 Oct 2019
Deep Learning at the Edge
Sahar Voghoei
N. Tonekaboni
Jason G. Wallace
H. Arabnia
320
47
0
22 Oct 2019
Automatic Generation of Multi-precision Multi-arithmetic CNN Accelerators for FPGAs
International Conference on Field-Programmable Technology (ICFPT), 2019
Yiren Zhao
Xitong Gao
Xuan Guo
Junyi Liu
Erwei Wang
Robert D. Mullins
P. Cheung
George A. Constantinides
Chengzhong Xu
MQ
89
31
0
21 Oct 2019
AI Benchmark: All About Deep Learning on Smartphones in 2019
Andrey D. Ignatov
Radu Timofte
Andrei Kulik
Seungsoo Yang
Ke Wang
Felix Baum
Max Wu
Lirong Xu
Luc Van Gool
ELM
143
236
0
15 Oct 2019
Bit Efficient Quantization for Deep Neural Networks
Prateeth Nayak
David C. Zhang
S. Chai
MQ
137
44
0
07 Oct 2019
QuaRL: Quantization for Fast and Environmentally Sustainable Reinforcement Learning
Srivatsan Krishnan
Maximilian Lam
Sharad Chitlangia
Zishen Wan
Gabriel Barth-Maron
Aleksandra Faust
Vijay Janapa Reddi
MQ
203
33
0
02 Oct 2019
NGEMM: Optimizing GEMM for Deep Learning via Compiler-based Techniques
Yiyuan Ma
Li-Wen Chang
Yang Chen
Kefeng Deng
Amit Agarwal
Emad Barsoum
Abe Taha
MQ
98
9
0
01 Oct 2019
Learning to Seek: Autonomous Source Seeking with Deep Reinforcement Learning Onboard a Nano Drone Microcontroller
Bardienus P. Duisterhof
Srivatsan Krishnan
Jonathan J. Cruz
Colby R. Banbury
William Fu
Aleksandra Faust
Guido de Croon
Vijay Janapa Reddi
295
29
0
25 Sep 2019
Training Deep Neural Networks Using Posit Number System
ACM Symposium on Cloud Computing (SoCC), 2019
Jinming Lu
Siyuan Lu
Zhisheng Wang
Chao Fang
Jun Lin
Zhongfeng Wang
Li Du
MQ
97
15
0
06 Sep 2019
Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks
IEEE International Conference on Computer Vision (ICCV), 2019
Yazhe Niu
Xianglong Liu
Shenghu Jiang
Tian-Hao Li
Peng Hu
Jiazhen Lin
F. Yu
Junjie Yan
MQ
232
507
0
14 Aug 2019
Cheetah: Mixed Low-Precision Hardware & Software Co-Design Framework for DNNs on the Edge
H. F. Langroudi
Zachariah Carmichael
David Pastuch
Dhireesha Kudithipudi
135
24
0
06 Aug 2019
Scalable Multi Corpora Neural Language Models for ASR
Interspeech (Interspeech), 2019
A. Raju
Denis Filimonov
Gautam Tiwari
Guitang Lan
Ariya Rastrow
98
26
0
02 Jul 2019
Visual Wake Words Dataset
Aakanksha Chowdhery
Pete Warden
Jonathon Shlens
Andrew G. Howard
Rocky Rhodes
VLM
152
111
0
12 Jun 2019
Table-Based Neural Units: Fully Quantizing Networks for Multiply-Free Inference
Michele Covell
David Marwood
S. Baluja
Nick Johnston
MQ
110
7
0
11 Jun 2019
Data-Free Quantization Through Weight Equalization and Bias Correction
IEEE International Conference on Computer Vision (ICCV), 2019
Markus Nagel
M. V. Baalen
Tijmen Blankevoort
Max Welling
MQ
275
580
0
11 Jun 2019
Fighting Quantization Bias With Bias
Alexander Finkelstein
Uri Almog
Mark Grobman
MQ
154
62
0
07 Jun 2019
Memory-Driven Mixed Low Precision Quantization For Enabling Deep Network Inference On Microcontrollers
Conference on Machine Learning and Systems (MLSys), 2019
Manuele Rusci
Alessandro Capotondi
Luca Benini
MQ
157
83
0
30 May 2019
Searching for MobileNetV3
IEEE International Conference on Computer Vision (ICCV), 2019
Andrew G. Howard
Mark Sandler
Grace Chu
Liang-Chieh Chen
Bo Chen
...
Yukun Zhu
Ruoming Pang
Vijay Vasudevan
Quoc V. Le
Hartwig Adam
1.1K
8,168
0
06 May 2019
Full-stack Optimization for Accelerating CNNs with FPGA Validation
Bradley McDanel
Shanghang Zhang
H. T. Kung
Xin Dong
MQ
74
2
0
01 May 2019
HAWQ: Hessian AWare Quantization of Neural Networks with Mixed-Precision
IEEE International Conference on Computer Vision (ICCV), 2019
Zhen Dong
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
181
597
0
29 Apr 2019
Relay: A High-Level Compiler for Deep Learning
Jared Roesch
Steven Lyubomirsky
Marisa Kirisame
Logan Weber
Josh Pollock
Luis Vega
Ziheng Jiang
Tianqi Chen
T. Moreau
Zachary Tatlock
134
21
0
17 Apr 2019
Trained Quantization Thresholds for Accurate and Efficient Fixed-Point Inference of Deep Neural Networks
Conference on Machine Learning and Systems (MLSys), 2019
Sambhav R. Jain
Albert Gural
Michael Wu
Chris Dick
MQ
292
163
0
19 Mar 2019
Learning low-precision neural networks without Straight-Through Estimator (STE)
International Joint Conference on Artificial Intelligence (IJCAI), 2019
Z. G. Liu
Matthew Mattina
MQ
194
37
0
04 Mar 2019
AutoQ: Automated Kernel-Wise Neural Network Quantization
Qian Lou
Feng Guo
Lantao Liu
Minje Kim
Lei Jiang
MQ
234
109
0
15 Feb 2019
Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorization
Eldad Meller
Alexander Finkelstein
Uri Almog
Mark Grobman
MQ
153
93
0
05 Feb 2019
Information-Theoretic Understanding of Population Risk Improvement with Model Compression
Yuheng Bu
Weihao Gao
Shaofeng Zou
Venugopal V. Veeravalli
MedIm
96
18
0
27 Jan 2019
Efficient Winograd Convolution via Integer Arithmetic
Lingchuan Meng
J. Brothers
136
29
0
07 Jan 2019