ResearchTrend.AI

Quantizing deep convolutional networks for efficient inference: A whitepaper
21 June 2018
Raghuraman Krishnamoorthi [MQ]
arXiv: 1806.08342

Papers citing "Quantizing deep convolutional networks for efficient inference: A whitepaper"

50 / 465 papers shown

  • Bias Loss for Mobile Neural Networks (23 Jul 2021). L. Abrahamyan, Valentin Ziatchin, Yiming Chen, Nikos Deligiannis.
  • A High-Performance Adaptive Quantization Approach for Edge CNN Applications (18 Jul 2021). Hsu-Hsun Chin, R. Tsay, Hsin-I Wu. [MQ]
  • LANA: Latency Aware Network Acceleration (12 Jul 2021). Pavlo Molchanov, Jimmy Hall, Hongxu Yin, Jan Kautz, Nicolò Fusi, Arash Vahdat.
  • Q-SpiNN: A Framework for Quantizing Spiking Neural Networks (05 Jul 2021). Rachmad Vidya Wicaksana Putra, Muhammad Shafique. [MQ]
  • Popcorn: Paillier Meets Compression For Efficient Oblivious Neural Network Inference (05 Jul 2021). Jun Wang, Chao Jin, S. Meftah, Khin Mi Mi Aung. [UQCV]
  • PQK: Model Compression via Pruning, Quantization, and Knowledge Distillation (25 Jun 2021). Jang-Hyun Kim, Simyung Chang, Nojun Kwak.
  • Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better (16 Jun 2021). Gaurav Menghani. [VLM, MedIm]
  • A White Paper on Neural Network Quantization (15 Jun 2021). Markus Nagel, Marios Fournarakis, Rana Ali Amjad, Yelysei Bondarenko, M. V. Baalen, Tijmen Blankevoort. [MQ]
  • Neuroevolution-Enhanced Multi-Objective Optimization for Mixed-Precision Quantization (14 Jun 2021). Santiago Miret, Vui Seng Chua, Mattias Marder, Mariano Phielipp, Nilesh Jain, Somdeb Majumdar.
  • Rethinking Transfer Learning for Medical Image Classification (09 Jun 2021). Le Peng, Hengyue Liang, Gaoxiang Luo, Taihui Li, Ju Sun. [VLM, LM&MA]
  • Towards Efficient Full 8-bit Integer DNN Online Training on Resource-limited Devices without Batch Normalization (27 May 2021). Yukuan Yang, Xiaowei Chi, Lei Deng, Tianyi Yan, Feng Gao, Guoqi Li. [MQ]
  • Low-Precision Hardware Architectures Meet Recommendation Model Inference at Scale (26 May 2021). Zhaoxia Deng, Jongsoo Park, P. T. P. Tang, Haixin Liu, ..., S. Nadathur, Changkyu Kim, Maxim Naumov, S. Naghshineh, M. Smelyanskiy.
  • Post-Training Sparsity-Aware Quantization (23 May 2021). Gil Shomron, F. Gabbay, Samer Kurzum, U. Weiser. [MQ]
  • BatchQuant: Quantized-for-all Architecture Search with Robust Quantizer (19 May 2021). Haoping Bai, Mengsi Cao, Ping-Chia Huang, Jiulong Shan. [MQ]
  • Rethinking "Batch" in BatchNorm (17 May 2021). Yuxin Wu, Justin Johnson. [BDL]
  • Is In-Domain Data Really Needed? A Pilot Study on Cross-Domain Calibration for Network Quantization (16 May 2021). Haichao Yu, Linjie Yang, Humphrey Shi. [OOD, MQ]
  • Lightweight Compression of Intermediate Neural Network Features for Collaborative Intelligence (15 May 2021). R. Cohen, Hyomin Choi, Ivan V. Bajić.
  • Lightweight compression of neural network feature tensors for collaborative intelligence (12 May 2021). R. Cohen, Hyomin Choi, Ivan V. Bajić.
  • In-Hindsight Quantization Range Estimation for Quantized Training (10 May 2021). Marios Fournarakis, Markus Nagel. [MQ]
  • Stealthy Backdoors as Compression Artifacts (30 Apr 2021). Yulong Tian, Fnu Suya, Fengyuan Xu, David E. Evans.
  • Memory-Efficient Deep Learning Inference in Trusted Execution Environments (30 Apr 2021). Jean-Baptiste Truong, W. Gallagher, Tian Guo, R. Walls.
  • Do All MobileNets Quantize Poorly? Gaining Insights into the Effect of Quantization on Depthwise Separable Convolutional Networks Through the Eyes of Multi-scale Distributional Dynamics (24 Apr 2021). S. Yun, Alexander Wong. [MQ]
  • Differentiable Model Compression via Pseudo Quantization Noise (20 Apr 2021). Alexandre Défossez, Yossi Adi, Gabriel Synnaeve. [DiffM, MQ]
  • Random and Adversarial Bit Error Robustness: Energy-Efficient and Secure DNN Accelerators (16 Apr 2021). David Stutz, Nandhini Chandramoorthy, Matthias Hein, Bernt Schiele. [AAML, MQ]
  • TENT: Efficient Quantization of Neural Networks on the tiny Edge with Tapered FixEd PoiNT (06 Apr 2021). H. F. Langroudi, Vedant Karia, Tej Pandit, Dhireesha Kudithipudi. [MQ]
  • RCT: Resource Constrained Training for Edge AI (26 Mar 2021). Tian Huang, Tao Luo, Ming Yan, Joey Tianyi Zhou, Rick Siow Mong Goh.
  • Learned Gradient Compression for Distributed Deep Learning (16 Mar 2021). L. Abrahamyan, Yiming Chen, Giannis Bekoulis, Nikos Deligiannis.
  • Quantization-Guided Training for Compact TinyML Models (10 Mar 2021). Sedigh Ghamari, Koray Ozcan, Thu Dinh, A. Melnikov, Juan Carvajal, Jan Ernst, S. Chai. [MQ]
  • unzipFPGA: Enhancing FPGA-based CNN Engines with On-the-Fly Weights Generation (09 Mar 2021). Stylianos I. Venieris, Javier Fernandez-Marques, Nicholas D. Lane.
  • Reliability-Aware Quantization for Anti-Aging NPUs (08 Mar 2021). Sami Salamin, Georgios Zervakis, Ourania Spantidi, Iraklis Anagnostopoulos, J. Henkel, H. Amrouch.
  • Compiler Toolchains for Deep Learning Workloads on Embedded Platforms (08 Mar 2021). Max Sponner, Bernd Waschneck, Akash Kumar. [MQ]
  • COIN: COmpression with Implicit Neural representations (03 Mar 2021). Emilien Dupont, Adam Goliñski, Milad Alizadeh, Yee Whye Teh, Arnaud Doucet.
  • On the Effects of Quantisation on Model Uncertainty in Bayesian Neural Networks (22 Feb 2021). Martin Ferianc, Partha P. Maji, Matthew Mattina, Miguel R. D. Rodrigues. [UQCV, BDL]
  • PLAM: a Posit Logarithm-Approximate Multiplier (18 Feb 2021). Raul Murillo, Alberto A. Del Barrio, Guillermo Botella, Min Soo Kim, Hyunjin Kim, N. Bagherzadeh. [TPM]
  • FAT: Learning Low-Bitwidth Parametric Representation via Frequency-Aware Transformation (15 Feb 2021). Chaofan Tao, Rui Lin, Quan Chen, Zhaoyang Zhang, Ping Luo, Ngai Wong. [MQ]
  • Confounding Tradeoffs for Neural Network Quantization (12 Feb 2021). Sahaj Garg, Anirudh Jain, Joe Lou, Mitchell Nahmias. [MQ]
  • Dynamic Precision Analog Computing for Neural Networks (12 Feb 2021). Sahaj Garg, Joe Lou, Anirudh Jain, Mitchell Nahmias.
  • VS-Quant: Per-vector Scaled Quantization for Accurate Low-Precision Neural Network Inference (08 Feb 2021). Steve Dai, Rangharajan Venkatesan, Haoxing Ren, B. Zimmer, W. Dally, Brucek Khailany. [MQ]
  • Real-time Denoising and Dereverberation with Tiny Recurrent U-Net (05 Feb 2021). Hyeong-Seok Choi, Sungjin Park, Jie Hwan Lee, Hoon Heo, Dongsuk Jeon, Kyogu Lee.
  • Fixed-point Quantization of Convolutional Neural Networks for Quantized Inference on Embedded Platforms (03 Feb 2021). Rishabh Goyal, Joaquin Vanschoren, V. V. Acht, S. Nijssen. [MQ]
  • Rethinking Floating Point Overheads for Mixed Precision DNN Accelerators (27 Jan 2021). Hamzah Abdel-Aziz, Ali Shafiee, J. Shin, A. Pedram, Joseph Hassoun. [MQ]
  • Pruning and Quantization for Deep Neural Network Acceleration: A Survey (24 Jan 2021). Tailin Liang, C. Glossner, Lei Wang, Shaobo Shi, Xiaotong Zhang. [MQ]
  • MinConvNets: A new class of multiplication-less Neural Networks (23 Jan 2021). Xuecan Yang, S. Chaudhuri, Laurence Likforman, L. Naviner.
  • Generative Zero-shot Network Quantization (21 Jan 2021). Xiangyu He, Qinghao Hu, Peisong Wang, Jian Cheng. [GAN, MQ]
  • Network Pruning using Adaptive Exemplar Filters (20 Jan 2021). Mingbao Lin, Rongrong Ji, Shaojie Li, Yan Wang, Yongjian Wu, Feiyue Huang, QiXiang Ye. [VLM]
  • Multi-Task Network Pruning and Embedded Optimization for Real-time Deployment in ADAS (19 Jan 2021). F. Dellinger, T. Boulay, Diego Mendoza Barrenechea, Said El-Hachimi, Isabelle Leang, Fabian Burger.
  • KDLSQ-BERT: A Quantized Bert Combining Knowledge Distillation with Learned Step Size Quantization (15 Jan 2021). Jing Jin, Cai Liang, Tiancheng Wu, Li Zou, Zhiliang Gan. [MQ]
  • NetCut: Real-Time DNN Inference Using Layer Removal (13 Jan 2021). Mehrshad Zandigohar, Deniz Erdogmus, G. Schirner.
  • A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks (12 Jan 2021). Asaf Noy, Yi Tian Xu, Y. Aflalo, Lihi Zelnik-Manor, R. L. Jin.
  • I-BERT: Integer-only BERT Quantization (05 Jan 2021). Sehoon Kim, A. Gholami, Z. Yao, Michael W. Mahoney, Kurt Keutzer. [MQ]