Quantizing deep convolutional networks for efficient inference: A whitepaper

21 June 2018
Raghuraman Krishnamoorthi
    MQ

Papers citing "Quantizing deep convolutional networks for efficient inference: A whitepaper"

50 / 465 papers shown
Title
Learning Compressed Embeddings for On-Device Inference
Niketan Pansare
J. Katukuri
Aditya Arora
F. Cipollone
R. Shaik
Noyan Tokgozoglu
Chandru Venkataraman
24
14
0
18 Mar 2022
Delta Distillation for Efficient Video Processing
A. Habibian
H. Yahia
Davide Abati
E. Gavves
Fatih Porikli
11
10
0
17 Mar 2022
Hardware Approximate Techniques for Deep Neural Network Accelerators: A Survey
Giorgos Armeniakos
Georgios Zervakis
Dimitrios Soudris
J. Henkel
204
93
0
16 Mar 2022
An Empirical Study of Low Precision Quantization for TinyML
Shaojie Zhuo
Hongyu Chen
R. Ramakrishnan
Tommy Chen
Chen Feng
Yi-Rung Lin
Parker Zhang
Liang Shen
MQ
32
13
0
10 Mar 2022
AdaPT: Fast Emulation of Approximate DNN Accelerators in PyTorch
Dimitrios Danopoulos
Georgios Zervakis
K. Siozios
Dimitrios Soudris
J. Henkel
22
31
0
08 Mar 2022
Dynamic Dual Trainable Bounds for Ultra-low Precision Super-Resolution Networks
Yunshan Zhong
Mingbao Lin
Xunchao Li
Ke Li
Yunhang Shen
Fei Chao
Yongjian Wu
Rongrong Ji
MQ
21
25
0
08 Mar 2022
YONO: Modeling Multiple Heterogeneous Neural Networks on Microcontrollers
Young D. Kwon
Jagmohan Chauhan
Cecilia Mascolo
16
13
0
08 Mar 2022
Patch Similarity Aware Data-Free Quantization for Vision Transformers
Zhikai Li
Liping Ma
Mengjuan Chen
Junrui Xiao
Qingyi Gu
MQ
ViT
17
44
0
04 Mar 2022
Comprehensive Analysis of the Object Detection Pipeline on UAVs
Leon Amadeus Varga
Sebastian Koch
A. Zell
17
5
0
01 Mar 2022
LG-LSQ: Learned Gradient Linear Symmetric Quantization
Shih-Ting Lin
Zhaofang Li
Yu-Hsiang Cheng
Hao-Wen Kuo
Chih-Cheng Lu
K. Tang
MQ
31
2
0
18 Feb 2022
Post-Training Quantization for Cross-Platform Learned Image Compression
Dailan He
Zi Yang
Yuan-Hsin Chen
Qi Zhang
Hongwei Qin
Yan Wang
MQ
37
13
0
15 Feb 2022
BED: A Real-Time Object Detection System for Edge Devices
Guanchu Wang
Zaid Pervaiz Bhat
Zhimeng Jiang
Yi-Wei Chen
Daochen Zha
...
A. Niktash
Mehmet Gorkem Ulkar
O. E. Okman
Xuanting Cai
Xia Hu
11
11
0
14 Feb 2022
F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization
Qing Jin
Jian Ren
Richard Zhuang
Sumant Hanumante
Zhengang Li
Zhiyu Chen
Yanzhi Wang
Kai-Min Yang
Sergey Tulyakov
MQ
24
48
0
10 Feb 2022
Quantune: Post-training Quantization of Convolutional Neural Networks using Extreme Gradient Boosting for Fast Deployment
Jemin Lee
Misun Yu
Yongin Kwon
Teaho Kim
MQ
17
17
0
10 Feb 2022
Binary Neural Networks as a general-propose compute paradigm for on-device computer vision
Guhong Nie
Lirui Xiao
Menglong Zhu
Dongliang Chu
Yue-Hong Shen
Peng Li
Kan Yang
Li Du
Bo Chen (DJI Innovations Inc)
MQ
34
5
0
08 Feb 2022
Few-Bit Backward: Quantized Gradients of Activation Functions for Memory Footprint Reduction
Georgii Sergeevich Novikov
Daniel Bershatsky
Julia Gusak
Alex Shonenkov
Denis Dimitrov
Ivan V. Oseledets
MQ
26
17
0
01 Feb 2022
COIN++: Neural Compression Across Modalities
Emilien Dupont
H. Loya
Milad Alizadeh
Adam Goliński
Yee Whye Teh
Arnaud Doucet
53
82
0
30 Jan 2022
Post-training Quantization for Neural Networks with Provable Guarantees
Jinjie Zhang
Yixuan Zhou
Rayan Saab
MQ
23
31
0
26 Jan 2022
Neural Network Quantization with AI Model Efficiency Toolkit (AIMET)
S. Siddegowda
Marios Fournarakis
Markus Nagel
Tijmen Blankevoort
Chirag I. Patel
Abhijit Khobare
MQ
12
31
0
20 Jan 2022
Problem-dependent attention and effort in neural networks with applications to image resolution and model selection
Chris Rohlfs
16
4
0
05 Jan 2022
BMPQ: Bit-Gradient Sensitivity Driven Mixed-Precision Quantization of DNNs from Scratch
Souvik Kundu
Shikai Wang
Qirui Sun
P. Beerel
Massoud Pedram
MQ
26
18
0
24 Dec 2021
Training Quantized Deep Neural Networks via Cooperative Coevolution
Fu Peng
Shengcai Liu
Ning Lu
Ke Tang
MQ
21
1
0
23 Dec 2021
Implicit Neural Video Compression
Yunfan Zhang
T. V. Rozendaal
Johann Brehmer
Markus Nagel
Taco S. Cohen
49
57
0
21 Dec 2021
Torch.fx: Practical Program Capture and Transformation for Deep Learning in Python
James K. Reed
Zach DeVito
Horace He
Ansley Ussery
Jason Ansel
CLIP
17
46
0
15 Dec 2021
BDFA: A Blind Data Adversarial Bit-flip Attack on Deep Neural Networks
B. Ghavami
Mani Sadati
M. Shahidzadeh
Zhenman Fang
Lesley Shannon
AAML
19
1
0
07 Dec 2021
A Generalized Zero-Shot Quantization of Deep Convolutional Neural Networks via Learned Weights Statistics
Prasen Kumar Sharma
Arun Abraham
V. N. Rajendiran
MQ
25
7
0
06 Dec 2021
TinyML Platforms Benchmarking
Anas Osman
Usman Abid
Luca Gemma
Matteo Perotto
Davide Brunelli
ELM
30
15
0
30 Nov 2021
Energy-Efficient Inference on the Edge Exploiting TinyML Capabilities for UAVs
Wamiq Raza
Anas Osman
F. Ferrini
F. D. De Natale
19
29
0
30 Nov 2021
IntraQ: Learning Synthetic Images with Intra-Class Heterogeneity for Zero-Shot Network Quantization
Yunshan Zhong
Mingbao Lin
Gongrui Nan
Jianzhuang Liu
Baochang Zhang
Yonghong Tian
Rongrong Ji
MQ
43
71
0
17 Nov 2021
Variability-Aware Training and Self-Tuning of Highly Quantized DNNs for Analog PIM
Zihao Deng
Michael Orshansky
MQ
37
6
0
11 Nov 2021
An Underexplored Dilemma between Confidence and Calibration in Quantized Neural Networks
Guoxuan Xia
Sangwon Ha
Tiago Azevedo
Partha P. Maji
UQCV
17
1
0
10 Nov 2021
ML-EXray: Visibility into ML Deployment on the Edge
Hang Qiu
Ioanna Vavelidou
Jian Li
Evgenya Pergament
Pete Warden
Sandeep P. Chinchali
Zain Asgar
Sachin Katti
14
8
0
08 Nov 2021
LiMuSE: Lightweight Multi-modal Speaker Extraction
Qinghua Liu
Yating Huang
Yunzhe Hao
Jiaming Xu
Bo Xu
35
6
0
07 Nov 2021
MQBench: Towards Reproducible and Deployable Model Quantization Benchmark
Yuhang Li
Mingzhu Shen
Jian Ma
Yan Ren
Mingxin Zhao
Qi Zhang
Ruihao Gong
F. Yu
Junjie Yan
MQ
35
49
0
05 Nov 2021
Beyond Classification: Knowledge Distillation using Multi-Object Impressions
Gaurav Kumar Nayak
Monish Keswani
Sharan Seshadri
Anirban Chakraborty
18
2
0
27 Oct 2021
NeRV: Neural Representations for Videos
Hao Chen
Bo He
Hanyu Wang
Yixuan Ren
Ser-Nam Lim
Abhinav Shrivastava
23
241
0
26 Oct 2021
Training Deep Neural Networks with Joint Quantization and Pruning of Weights and Activations
Xinyu Zhang
Ian Colbert
Ken Kreutz-Delgado
Srinjoy Das
MQ
29
11
0
15 Oct 2021
PTQ-SL: Exploring the Sub-layerwise Post-training Quantization
Zhihang Yuan
Yiqi Chen
Chenhao Xue
Chenguang Zhang
Qiankun Wang
Guangyu Sun
MQ
11
3
0
15 Oct 2021
Memory-Efficient CNN Accelerator Based on Interlayer Feature Map Compression
Zhuang Shao
Xiaoliang Chen
Li Du
Lei Chen
Yuan Du
Weihao Zhuang
Huadong Wei
Chenjia Xie
Zhongfeng Wang
13
26
0
12 Oct 2021
SoftNeuro: Fast Deep Inference using Multi-platform Optimization
Masaki Hilaga
Yasuhiro Kuroda
Hitoshi Matsuo
Tatsuya Kawaguchi
Gabriel Ogawa
Hiroshi Miyake
Yusuke Kozawa
23
1
0
12 Oct 2021
RED++ : Data-Free Pruning of Deep Neural Networks via Input Splitting and Output Merging
Edouard Yvinec
Arnaud Dapogny
Matthieu Cord
Kévin Bailly
22
15
0
30 Sep 2021
Understanding and Overcoming the Challenges of Efficient Transformer Quantization
Yelysei Bondarenko
Markus Nagel
Tijmen Blankevoort
MQ
12
133
0
27 Sep 2021
iRNN: Integer-only Recurrent Neural Network
Eyyub Sari
Vanessa Courville
V. Nia
MQ
45
4
0
20 Sep 2021
HPTQ: Hardware-Friendly Post Training Quantization
H. Habi
Reuven Peretz
Elad Cohen
Lior Dikstein
Oranit Dror
I. Diamant
Roy H. Jennings
Arnon Netzer
MQ
31
8
0
19 Sep 2021
Fine-grained Data Distribution Alignment for Post-Training Quantization
Yunshan Zhong
Mingbao Lin
Mengzhao Chen
Ke Li
Yunhang Shen
Fei Chao
Yongjian Wu
Rongrong Ji
MQ
84
19
0
09 Sep 2021
GDP: Stabilized Neural Network Pruning via Gates with Differentiable Polarization
Yi Guo
Huan Yuan
Jianchao Tan
Zhangyang Wang
Sen Yang
Ji Liu
23
46
0
06 Sep 2021
Quantization of Generative Adversarial Networks for Efficient Inference: a Methodological Study
Pavel Andreev
Alexander Fritzler
Dmitry Vetrov
MQ
19
10
0
31 Aug 2021
Auto-Split: A General Framework of Collaborative Edge-Cloud AI
Amin Banitalebi-Dehkordi
Naveen Vedula
J. Pei
Fei Xia
Lanjun Wang
Yong Zhang
22
89
0
30 Aug 2021
4-bit Quantization of LSTM-based Speech Recognition Models
A. Fasoli
Chia-Yu Chen
Mauricio Serrano
Xiao Sun
Naigang Wang
...
Xiaodong Cui
Brian Kingsbury
Wei Zhang
Zoltán Tüske
K. Gopalakrishnan
MQ
26
21
0
27 Aug 2021
Distance-aware Quantization
Dohyung Kim
Junghyup Lee
Bumsub Ham
MQ
15
28
0
16 Aug 2021