1806.08342
Cited By
Quantizing deep convolutional networks for efficient inference: A whitepaper
21 June 2018
Raghuraman Krishnamoorthi
MQ
Papers citing
"Quantizing deep convolutional networks for efficient inference: A whitepaper"
50 / 465 papers shown
Title
Learning Compressed Embeddings for On-Device Inference
Niketan Pansare
J. Katukuri
Aditya Arora
F. Cipollone
R. Shaik
Noyan Tokgozoglu
Chandru Venkataraman
24
14
0
18 Mar 2022
Delta Distillation for Efficient Video Processing
A. Habibian
H. Yahia
Davide Abati
E. Gavves
Fatih Porikli
11
10
0
17 Mar 2022
Hardware Approximate Techniques for Deep Neural Network Accelerators: A Survey
Giorgos Armeniakos
Georgios Zervakis
Dimitrios Soudris
J. Henkel
204
93
0
16 Mar 2022
An Empirical Study of Low Precision Quantization for TinyML
Shaojie Zhuo
Hongyu Chen
R. Ramakrishnan
Tommy Chen
Chen Feng
Yi-Rung Lin
Parker Zhang
Liang Shen
MQ
32
13
0
10 Mar 2022
AdaPT: Fast Emulation of Approximate DNN Accelerators in PyTorch
Dimitrios Danopoulos
Georgios Zervakis
K. Siozios
Dimitrios Soudris
J. Henkel
22
31
0
08 Mar 2022
Dynamic Dual Trainable Bounds for Ultra-low Precision Super-Resolution Networks
Yunshan Zhong
Mingbao Lin
Xunchao Li
Ke Li
Yunhang Shen
Fei Chao
Yongjian Wu
Rongrong Ji
MQ
21
25
0
08 Mar 2022
YONO: Modeling Multiple Heterogeneous Neural Networks on Microcontrollers
Young D. Kwon
Jagmohan Chauhan
Cecilia Mascolo
16
13
0
08 Mar 2022
Patch Similarity Aware Data-Free Quantization for Vision Transformers
Zhikai Li
Liping Ma
Mengjuan Chen
Junrui Xiao
Qingyi Gu
MQ
ViT
17
44
0
04 Mar 2022
Comprehensive Analysis of the Object Detection Pipeline on UAVs
Leon Amadeus Varga
Sebastian Koch
A. Zell
17
5
0
01 Mar 2022
LG-LSQ: Learned Gradient Linear Symmetric Quantization
Shih-Ting Lin
Zhaofang Li
Yu-Hsiang Cheng
Hao-Wen Kuo
Chih-Cheng Lu
K. Tang
MQ
31
2
0
18 Feb 2022
Post-Training Quantization for Cross-Platform Learned Image Compression
Dailan He
Zi Yang
Yuan-Hsin Chen
Qi Zhang
Hongwei Qin
Yan Wang
MQ
37
13
0
15 Feb 2022
BED: A Real-Time Object Detection System for Edge Devices
Guanchu Wang
Zaid Pervaiz Bhat
Zhimeng Jiang
Yi-Wei Chen
Daochen Zha
...
A. Niktash
Mehmet Gorkem Ulkar
O. E. Okman
Xuanting Cai
Xia Hu
11
11
0
14 Feb 2022
F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization
Qing Jin
Jian Ren
Richard Zhuang
Sumant Hanumante
Zhengang Li
Zhiyu Chen
Yanzhi Wang
Kai-Min Yang
Sergey Tulyakov
MQ
24
48
0
10 Feb 2022
Quantune: Post-training Quantization of Convolutional Neural Networks using Extreme Gradient Boosting for Fast Deployment
Jemin Lee
Misun Yu
Yongin Kwon
Teaho Kim
MQ
17
17
0
10 Feb 2022
Binary Neural Networks as a general-propose compute paradigm for on-device computer vision
Guhong Nie
Lirui Xiao
Menglong Zhu
Dongliang Chu
Yue-Hong Shen
Peng Li
Kan Yang
Li Du
Bo Chen
MQ
34
5
0
08 Feb 2022
Few-Bit Backward: Quantized Gradients of Activation Functions for Memory Footprint Reduction
Georgii Sergeevich Novikov
Daniel Bershatsky
Julia Gusak
Alex Shonenkov
Denis Dimitrov
Ivan V. Oseledets
MQ
26
17
0
01 Feb 2022
COIN++: Neural Compression Across Modalities
Emilien Dupont
H. Loya
Milad Alizadeh
Adam Goliñski
Yee Whye Teh
Arnaud Doucet
53
82
0
30 Jan 2022
Post-training Quantization for Neural Networks with Provable Guarantees
Jinjie Zhang
Yixuan Zhou
Rayan Saab
MQ
23
31
0
26 Jan 2022
Neural Network Quantization with AI Model Efficiency Toolkit (AIMET)
S. Siddegowda
Marios Fournarakis
Markus Nagel
Tijmen Blankevoort
Chirag I. Patel
Abhijit Khobare
MQ
12
31
0
20 Jan 2022
Problem-dependent attention and effort in neural networks with applications to image resolution and model selection
Chris Rohlfs
16
4
0
05 Jan 2022
BMPQ: Bit-Gradient Sensitivity Driven Mixed-Precision Quantization of DNNs from Scratch
Souvik Kundu
Shikai Wang
Qirui Sun
P. Beerel
Massoud Pedram
MQ
26
18
0
24 Dec 2021
Training Quantized Deep Neural Networks via Cooperative Coevolution
Fu Peng
Shengcai Liu
Ning Lu
Ke Tang
MQ
21
1
0
23 Dec 2021
Implicit Neural Video Compression
Yunfan Zhang
T. V. Rozendaal
Johann Brehmer
Markus Nagel
Taco S. Cohen
49
57
0
21 Dec 2021
Torch.fx: Practical Program Capture and Transformation for Deep Learning in Python
James K. Reed
Zach DeVito
Horace He
Ansley Ussery
Jason Ansel
CLIP
17
46
0
15 Dec 2021
BDFA: A Blind Data Adversarial Bit-flip Attack on Deep Neural Networks
B. Ghavami
Mani Sadati
M. Shahidzadeh
Zhenman Fang
Lesley Shannon
AAML
19
1
0
07 Dec 2021
A Generalized Zero-Shot Quantization of Deep Convolutional Neural Networks via Learned Weights Statistics
Prasen Kumar Sharma
Arun Abraham
V. N. Rajendiran
MQ
25
7
0
06 Dec 2021
TinyML Platforms Benchmarking
Anas Osman
Usman Abid
Luca Gemma
Matteo Perotto
Davide Brunelli
ELM
30
15
0
30 Nov 2021
Energy-Efficient Inference on the Edge Exploiting TinyML Capabilities for UAVs
Wamiq Raza
Anas Osman
F. Ferrini
F. D. De Natale
19
29
0
30 Nov 2021
IntraQ: Learning Synthetic Images with Intra-Class Heterogeneity for Zero-Shot Network Quantization
Yunshan Zhong
Mingbao Lin
Gongrui Nan
Jianzhuang Liu
Baochang Zhang
Yonghong Tian
Rongrong Ji
MQ
43
71
0
17 Nov 2021
Variability-Aware Training and Self-Tuning of Highly Quantized DNNs for Analog PIM
Zihao Deng
Michael Orshansky
MQ
37
6
0
11 Nov 2021
An Underexplored Dilemma between Confidence and Calibration in Quantized Neural Networks
Guoxuan Xia
Sangwon Ha
Tiago Azevedo
Partha P. Maji
UQCV
17
1
0
10 Nov 2021
ML-EXray: Visibility into ML Deployment on the Edge
Hang Qiu
Ioanna Vavelidou
Jian Li
Evgenya Pergament
Pete Warden
Sandeep P. Chinchali
Zain Asgar
Sachin Katti
14
8
0
08 Nov 2021
LiMuSE: Lightweight Multi-modal Speaker Extraction
Qinghua Liu
Yating Huang
Yunzhe Hao
Jiaming Xu
Bo Xu
35
6
0
07 Nov 2021
MQBench: Towards Reproducible and Deployable Model Quantization Benchmark
Yuhang Li
Mingzhu Shen
Jian Ma
Yan Ren
Mingxin Zhao
Qi Zhang
Ruihao Gong
F. Yu
Junjie Yan
MQ
35
49
0
05 Nov 2021
Beyond Classification: Knowledge Distillation using Multi-Object Impressions
Gaurav Kumar Nayak
Monish Keswani
Sharan Seshadri
Anirban Chakraborty
18
2
0
27 Oct 2021
NeRV: Neural Representations for Videos
Hao Chen
Bo He
Hanyu Wang
Yixuan Ren
Ser-Nam Lim
Abhinav Shrivastava
23
241
0
26 Oct 2021
Training Deep Neural Networks with Joint Quantization and Pruning of Weights and Activations
Xinyu Zhang
Ian Colbert
Ken Kreutz-Delgado
Srinjoy Das
MQ
29
11
0
15 Oct 2021
PTQ-SL: Exploring the Sub-layerwise Post-training Quantization
Zhihang Yuan
Yiqi Chen
Chenhao Xue
Chenguang Zhang
Qiankun Wang
Guangyu Sun
MQ
11
3
0
15 Oct 2021
Memory-Efficient CNN Accelerator Based on Interlayer Feature Map Compression
Zhuang Shao
Xiaoliang Chen
Li Du
Lei Chen
Yuan Du
Weihao Zhuang
Huadong Wei
Chenjia Xie
Zhongfeng Wang
13
26
0
12 Oct 2021
SoftNeuro: Fast Deep Inference using Multi-platform Optimization
Masaki Hilaga
Yasuhiro Kuroda
Hitoshi Matsuo
Tatsuya Kawaguchi
Gabriel Ogawa
Hiroshi Miyake
Yusuke Kozawa
23
1
0
12 Oct 2021
RED++ : Data-Free Pruning of Deep Neural Networks via Input Splitting and Output Merging
Edouard Yvinec
Arnaud Dapogny
Matthieu Cord
Kévin Bailly
22
15
0
30 Sep 2021
Understanding and Overcoming the Challenges of Efficient Transformer Quantization
Yelysei Bondarenko
Markus Nagel
Tijmen Blankevoort
MQ
12
133
0
27 Sep 2021
iRNN: Integer-only Recurrent Neural Network
Eyyub Sari
Vanessa Courville
V. Nia
MQ
45
4
0
20 Sep 2021
HPTQ: Hardware-Friendly Post Training Quantization
H. Habi
Reuven Peretz
Elad Cohen
Lior Dikstein
Oranit Dror
I. Diamant
Roy H. Jennings
Arnon Netzer
MQ
31
8
0
19 Sep 2021
Fine-grained Data Distribution Alignment for Post-Training Quantization
Yunshan Zhong
Mingbao Lin
Mengzhao Chen
Ke Li
Yunhang Shen
Fei Chao
Yongjian Wu
Rongrong Ji
MQ
84
19
0
09 Sep 2021
GDP: Stabilized Neural Network Pruning via Gates with Differentiable Polarization
Yi Guo
Huan Yuan
Jianchao Tan
Zhangyang Wang
Sen Yang
Ji Liu
23
46
0
06 Sep 2021
Quantization of Generative Adversarial Networks for Efficient Inference: a Methodological Study
Pavel Andreev
Alexander Fritzler
Dmitry Vetrov
MQ
19
10
0
31 Aug 2021
Auto-Split: A General Framework of Collaborative Edge-Cloud AI
Amin Banitalebi-Dehkordi
Naveen Vedula
J. Pei
Fei Xia
Lanjun Wang
Yong Zhang
22
89
0
30 Aug 2021
4-bit Quantization of LSTM-based Speech Recognition Models
A. Fasoli
Chia-Yu Chen
Mauricio Serrano
Xiao Sun
Naigang Wang
...
Xiaodong Cui
Brian Kingsbury
Wei Zhang
Zoltán Tüske
K. Gopalakrishnan
MQ
26
21
0
27 Aug 2021
Distance-aware Quantization
Dohyung Kim
Junghyup Lee
Bumsub Ham
MQ
15
28
0
16 Aug 2021