
Quantizing deep convolutional networks for efficient inference: A whitepaper

21 June 2018
Raghuraman Krishnamoorthi
    MQ

Papers citing "Quantizing deep convolutional networks for efficient inference: A whitepaper"

50 / 464 papers shown
Binary Neural Networks: A Survey
Haotong Qin
Ruihao Gong
Xianglong Liu
Xiao Bai
Jingkuan Song
N. Sebe
MQ
50
457
0
31 Mar 2020
AI on the Edge: Rethinking AI-based IoT Applications Using Specialized Edge Architectures
Qianlin Liang
Prashant J. Shenoy
David E. Irwin
9
17
0
27 Mar 2020
Compiling Neural Networks for a Computational Memory Accelerator
K. Kourtis
M. Dazzi
Nikolas Ioannou
Tobias Grosser
A. Sebastian
E. Eleftheriou
12
5
0
05 Mar 2020
Searching for Winograd-aware Quantized Networks
Javier Fernandez-Marques
P. Whatmough
Andrew Mundy
Matthew Mattina
MQ
11
40
0
25 Feb 2020
Post-training Quantization with Multiple Points: Mixed Precision without Mixed Precision
Xingchao Liu
Mao Ye
Dengyong Zhou
Qiang Liu
MQ
8
42
0
20 Feb 2020
Neural Network Compression Framework for fast model inference
Alexander Kozlov
Ivan Lazarevich
Vasily Shamporov
N. Lyalyushkin
Yury Gorbachev
23
35
0
20 Feb 2020
SYMOG: learning symmetric mixture of Gaussian modes for improved fixed-point quantization
Lukas Enderich
Fabian Timm
Wolfram Burgard
MQ
14
6
0
19 Feb 2020
Robust Quantization: One Model to Rule Them All
Moran Shkolnik
Brian Chmiel
Ron Banner
Gil Shomron
Yury Nahshan
A. Bronstein
U. Weiser
OOD
MQ
14
75
0
18 Feb 2020
Gradient $\ell_1$ Regularization for Quantization Robustness
Milad Alizadeh
Arash Behboodi
M. V. Baalen
Christos Louizos
Tijmen Blankevoort
Max Welling
MQ
12
8
0
18 Feb 2020
Post-Training Piecewise Linear Quantization for Deep Neural Networks
Jun Fang
Ali Shafiee
Hamzah Abdel-Aziz
D. Thorsley
Georgios Georgiadis
Joseph Hassoun
MQ
12
144
0
31 Jan 2020
Quantisation and Pruning for Neural Network Compression and Regularisation
Kimessha Paupamah
Steven D. James
Richard Klein
9
23
0
14 Jan 2020
ZeroQ: A Novel Zero Shot Quantization Framework
Yaohui Cai
Z. Yao
Zhen Dong
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
30
389
0
01 Jan 2020
Towards Unified INT8 Training for Convolutional Neural Network
Feng Zhu
Ruihao Gong
F. Yu
Xianglong Liu
Yanfei Wang
Zhelong Li
Xiuqi Yang
Junjie Yan
MQ
27
151
0
29 Dec 2019
Towards Efficient Training for Neural Network Quantization
Qing Jin
Linjie Yang
Zhenyu A. Liao
MQ
11
42
0
21 Dec 2019
Taxonomy and Evaluation of Structured Compression of Convolutional Neural Networks
Andrey Kuzmin
Markus Nagel
Saurabh Pitre
Sandeep Pendyam
Tijmen Blankevoort
Max Welling
9
27
0
20 Dec 2019
Learned Variable-Rate Image Compression with Residual Divisive Normalization
Mohammad Akbari
Jie Liang
Jingning Han
Chengjie Tu
19
25
0
11 Dec 2019
The Knowledge Within: Methods for Data-Free Model Compression
Matan Haroush
Itay Hubara
Elad Hoffer
Daniel Soudry
18
105
0
03 Dec 2019
QKD: Quantization-aware Knowledge Distillation
Jangho Kim
Yash Bhalgat
Jinwon Lee
Chirag I. Patel
Nojun Kwak
MQ
16
63
0
28 Nov 2019
Loss Aware Post-training Quantization
Yury Nahshan
Brian Chmiel
Chaim Baskin
Evgenii Zheltonozhskii
Ron Banner
A. Bronstein
A. Mendelson
MQ
26
163
0
17 Nov 2019
Scientific Image Restoration Anywhere
V. Abeykoon
Zhengchun Liu
R. Kettimuthu
Geoffrey C. Fox
Ian T. Foster
16
19
0
12 Nov 2019
HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks
Zhen Dong
Z. Yao
Yaohui Cai
Daiyaan Arfeen
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
26
274
0
10 Nov 2019
Post-Training 4-bit Quantization on Embedding Tables
Hui Guan
Andrey Malevich
Jiyan Yang
Jongsoo Park
Hector Yuen
MQ
11
32
0
05 Nov 2019
Training DNN IoT Applications for Deployment On Analog NVM Crossbars
F. García-Redondo
Shidhartha Das
G. Rosendale
17
5
0
30 Oct 2019
Secure Evaluation of Quantized Neural Networks
Anders Dalskov
Daniel E. Escudero
Marcel Keller
12
137
0
28 Oct 2019
Neural Network Distiller: A Python Package For DNN Compression Research
Neta Zmora
Guy Jacob
Lev Zlotnik
Bar Elharar
Gal Novik
17
73
0
27 Oct 2019
Deep Learning at the Edge
Sahar Voghoei
N. Tonekaboni
Jason G. Wallace
H. Arabnia
11
41
0
22 Oct 2019
Automatic Generation of Multi-precision Multi-arithmetic CNN Accelerators for FPGAs
Yiren Zhao
Xitong Gao
Xuan Guo
Junyi Liu
Erwei Wang
Robert D. Mullins
P. Cheung
G. Constantinides
Chengzhong Xu
MQ
19
31
0
21 Oct 2019
AI Benchmark: All About Deep Learning on Smartphones in 2019
Andrey D. Ignatov
Radu Timofte
Andrei Kulik
Seungsoo Yang
Ke Wang
Felix Baum
Max Wu
Lirong Xu
Luc Van Gool
ELM
13
218
0
15 Oct 2019
Bit Efficient Quantization for Deep Neural Networks
Prateeth Nayak
David C. Zhang
S. Chai
MQ
25
43
0
07 Oct 2019
QuaRL: Quantization for Fast and Environmentally Sustainable Reinforcement Learning
Srivatsan Krishnan
Maximilian Lam
Sharad Chitlangia
Zishen Wan
Gabriel Barth-Maron
Aleksandra Faust
Vijay Janapa Reddi
MQ
21
22
0
02 Oct 2019
NGEMM: Optimizing GEMM for Deep Learning via Compiler-based Techniques
Wenlei Bao
Li-Wen Chang
Yang Chen
Kefeng Deng
Amit Agarwal
Emad Barsoum
Abe Taha
MQ
11
7
0
01 Oct 2019
Learning to Seek: Autonomous Source Seeking with Deep Reinforcement Learning Onboard a Nano Drone Microcontroller
Bardienus P. Duisterhof
Srivatsan Krishnan
Jonathan J. Cruz
Colby R. Banbury
William Fu
Aleksandra Faust
Guido de Croon
Vijay Janapa Reddi
18
25
0
25 Sep 2019
Training Deep Neural Networks Using Posit Number System
Jinming Lu
Siyuan Lu
Zhisheng Wang
Chao Fang
Jun Lin
Zhongfeng Wang
Li Du
MQ
19
13
0
06 Sep 2019
Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks
Ruihao Gong
Xianglong Liu
Shenghu Jiang
Tian-Hao Li
Peng Hu
Jiazhen Lin
F. Yu
Junjie Yan
MQ
21
445
0
14 Aug 2019
Cheetah: Mixed Low-Precision Hardware & Software Co-Design Framework for DNNs on the Edge
H. F. Langroudi
Zachariah Carmichael
David Pastuch
Dhireesha Kudithipudi
14
24
0
06 Aug 2019
Scalable Multi Corpora Neural Language Models for ASR
A. Raju
Denis Filimonov
Gautam Tiwari
Guitang Lan
Ariya Rastrow
11
26
0
02 Jul 2019
Visual Wake Words Dataset
Aakanksha Chowdhery
Pete Warden
Jonathon Shlens
Andrew G. Howard
Rocky Rhodes
VLM
16
98
0
12 Jun 2019
Table-Based Neural Units: Fully Quantizing Networks for Multiply-Free Inference
Michele Covell
David Marwood
S. Baluja
Nick Johnston
MQ
11
7
0
11 Jun 2019
Data-Free Quantization Through Weight Equalization and Bias Correction
Markus Nagel
M. V. Baalen
Tijmen Blankevoort
Max Welling
MQ
19
499
0
11 Jun 2019
Fighting Quantization Bias With Bias
Alexander Finkelstein
Uri Almog
Mark Grobman
MQ
14
56
0
07 Jun 2019
Memory-Driven Mixed Low Precision Quantization For Enabling Deep Network Inference On Microcontrollers
Manuele Rusci
Alessandro Capotondi
Luca Benini
MQ
17
74
0
30 May 2019
Searching for MobileNetV3
Andrew G. Howard
Mark Sandler
Grace Chu
Liang-Chieh Chen
Bo Chen
...
Yukun Zhu
Ruoming Pang
Vijay Vasudevan
Quoc V. Le
Hartwig Adam
41
6,600
0
06 May 2019
Full-stack Optimization for Accelerating CNNs with FPGA Validation
Bradley McDanel
S. Zhang
H. T. Kung
Xin Dong
MQ
14
2
0
01 May 2019
HAWQ: Hessian AWare Quantization of Neural Networks with Mixed-Precision
Zhen Dong
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
19
513
0
29 Apr 2019
Relay: A High-Level Compiler for Deep Learning
Jared Roesch
Steven Lyubomirsky
Marisa Kirisame
Logan Weber
Josh Pollock
Luis Vega
Ziheng Jiang
Tianqi Chen
T. Moreau
Zachary Tatlock
20
21
0
17 Apr 2019
Trained Quantization Thresholds for Accurate and Efficient Fixed-Point Inference of Deep Neural Networks
Sambhav R. Jain
Albert Gural
Michael Wu
Chris Dick
MQ
13
147
0
19 Mar 2019
Learning low-precision neural networks without Straight-Through Estimator(STE)
Z. G. Liu
Matthew Mattina
MQ
19
34
0
04 Mar 2019
AutoQ: Automated Kernel-Wise Neural Network Quantization
Qian Lou
Feng Guo
Lantao Liu
Minje Kim
Lei Jiang
MQ
16
97
0
15 Feb 2019
Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorization
Eldad Meller
Alexander Finkelstein
Uri Almog
Mark Grobman
MQ
16
85
0
05 Feb 2019
Information-Theoretic Understanding of Population Risk Improvement with Model Compression
Yuheng Bu
Weihao Gao
Shaofeng Zou
V. Veeravalli
MedIm
8
15
0
27 Jan 2019