ResearchTrend.AI
Quantizing deep convolutional networks for efficient inference: A whitepaper

21 June 2018
Raghuraman Krishnamoorthi
    MQ

Papers citing "Quantizing deep convolutional networks for efficient inference: A whitepaper"

50 / 465 papers shown
Exploiting the Partly Scratch-off Lottery Ticket for Quantization-Aware Training
Yunshan Zhong
Gongrui Nan
Yu-xin Zhang
Fei Chao
Rongrong Ji
MQ
18
3
0
12 Nov 2022
MinUn: Accurate ML Inference on Microcontrollers
Shikhar Jaiswal
R. Goli
Aayan Kumar
Vivek Seshadri
Rahul Sharma
26
2
0
29 Oct 2022
TPU-MLIR: A Compiler For TPU Using MLIR
Pengchao Hu
Man Lu
Lei Wang
Guoyue Jiang
14
5
0
23 Oct 2022
Post-Training Quantization for Energy Efficient Realization of Deep Neural Networks
Cecilia Latotzke
Batuhan Balim
T. Gemmeke
MQ
8
2
0
14 Oct 2022
Energy-Efficient Deployment of Machine Learning Workloads on Neuromorphic Hardware
Peyton S. Chandarana
Mohammadreza Mohammadi
J. Seekings
Ramtin Zand
31
6
0
10 Oct 2022
A Closer Look at Hardware-Friendly Weight Quantization
Sungmin Bae
Piotr Zielinski
S. Chatterjee
MQ
24
0
0
07 Oct 2022
Physics-aware Differentiable Discrete Codesign for Diffractive Optical Neural Networks
Yingjie Li
Ruiyang Chen
Weilu Gao
Cunxi Yu
22
11
0
28 Sep 2022
Outlier Suppression: Pushing the Limit of Low-bit Transformer Language Models
Xiuying Wei
Yunchen Zhang
Xiangguo Zhang
Ruihao Gong
Shanghang Zhang
Qi Zhang
F. Yu
Xianglong Liu
MQ
22
145
0
27 Sep 2022
Going Further With Winograd Convolutions: Tap-Wise Quantization for Efficient Inference on 4x4 Tile
Renzo Andri
Beatrice Bussolino
A. Cipolletta
Lukas Cavigelli
Zhe Wang
MQ
26
13
0
26 Sep 2022
FoVolNet: Fast Volume Rendering using Foveated Deep Neural Networks
David Bauer
Qi Wu
Kwan-Liu Ma
3DH
28
19
0
20 Sep 2022
PSAQ-ViT V2: Towards Accurate and General Data-Free Quantization for Vision Transformers
Zhikai Li
Mengjuan Chen
Junrui Xiao
Qingyi Gu
ViT
MQ
43
33
0
13 Sep 2022
A simple approach for quantizing neural networks
J. Maly
Rayan Saab
MQ
22
11
0
07 Sep 2022
Seeking Interpretability and Explainability in Binary Activated Neural Networks
Benjamin Leblanc
Pascal Germain
FAtt
37
1
0
07 Sep 2022
XCAT -- Lightweight Quantized Single Image Super-Resolution using Heterogeneous Group Convolutions and Cross Concatenation
Mustafa Ayazoglu
Bahri Batuhan Bilecen
SupR
12
4
0
31 Aug 2022
GHN-Q: Parameter Prediction for Unseen Quantized Convolutional Architectures via Graph Hypernetworks
S. Yun
Alexander Wong
GNN
MQ
11
1
0
26 Aug 2022
Adaptation of MobileNetV2 for Face Detection on Ultra-Low Power Platform
Simon Narduzzi
Engin Turetken
Jean-Philippe Thiran
L. A. Dunbar
3DH
CVBM
11
1
0
23 Aug 2022
FP8 Quantization: The Power of the Exponent
Andrey Kuzmin
M. V. Baalen
Yuwei Ren
Markus Nagel
Jorn W. T. Peters
Tijmen Blankevoort
MQ
25
78
0
19 Aug 2022
Mixed-Precision Neural Networks: A Survey
M. Rakka
M. Fouda
Pramod P. Khargonekar
Fadi J. Kurdahi
MQ
21
11
0
11 Aug 2022
Hardly Perceptible Trojan Attack against Neural Networks with Bit Flips
Jiawang Bai
Kuofeng Gao
Dihong Gong
Shutao Xia
Zhifeng Li
W. Liu
AAML
22
27
0
27 Jul 2022
Reconciling Security and Communication Efficiency in Federated Learning
Karthik Prasad
Sayan Ghosh
Graham Cormode
Ilya Mironov
Ashkan Yousefpour
Pierre Stock
FedML
30
8
0
26 Jul 2022
Quantized Sparse Weight Decomposition for Neural Network Compression
Andrey Kuzmin
M. V. Baalen
Markus Nagel
Arash Behboodi
MQ
6
3
0
22 Jul 2022
CheckINN: Wide Range Neural Network Verification in Imandra (Extended)
Remi Desmartin
Grant Passmore
Ekaterina Komendantskaya
M. Daggitt
26
5
0
21 Jul 2022
CADyQ: Content-Aware Dynamic Quantization for Image Super-Resolution
Chee Hong
Sungyong Baik
Heewon Kim
Seungjun Nah
Kyoung Mu Lee
SupR
MQ
28
32
0
21 Jul 2022
Approximation Capabilities of Neural Networks using Morphological Perceptrons and Generalizations
William Chang
Hassan Hamad
K. Chugg
20
2
0
16 Jul 2022
DiverGet: A Search-Based Software Testing Approach for Deep Neural Network Quantization Assessment
Ahmed Haj Yahmed
Houssem Ben Braiek
Foutse Khomh
S. Bouzidi
Rania Zaatour
MQ
28
7
0
13 Jul 2022
CEG4N: Counter-Example Guided Neural Network Quantization Refinement
J. Matos
I. Bessa
Edoardo Manino
Xidan Song
Lucas C. Cordeiro
MQ
40
2
0
09 Jul 2022
I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference
Zhikai Li
Qingyi Gu
MQ
51
95
0
04 Jul 2022
Matryoshka: Stealing Functionality of Private ML Data by Hiding Models in Model
Xudong Pan
Yifan Yan
Sheng Zhang
Mi Zhang
Min Yang
27
1
0
29 Jun 2022
RevBiFPN: The Fully Reversible Bidirectional Feature Pyramid Network
Vitaliy Chiley
Vithursan Thangarasa
Abhay Gupta
Anshul Samar
Joel Hestness
D. DeCoste
48
8
0
28 Jun 2022
Quantization Robust Federated Learning for Efficient Inference on Heterogeneous Devices
Kartik Gupta
Marios Fournarakis
M. Reisser
Christos Louizos
Markus Nagel
FedML
14
14
0
22 Jun 2022
QuantFace: Towards Lightweight Face Recognition by Synthetic Data Low-bit Quantization
Fadi Boutros
Naser Damer
Arjan Kuijper
CVBM
MQ
22
37
0
21 Jun 2022
Low-Precision Stochastic Gradient Langevin Dynamics
Ruqi Zhang
A. Wilson
Chris De Sa
BDL
21
14
0
20 Jun 2022
tinySNN: Towards Memory- and Energy-Efficient Spiking Neural Networks
Rachmad Vidya Wicaksana Putra
Muhammad Shafique
19
6
0
17 Jun 2022
Canonical convolutional neural networks
Lokesh Veeramacheneni
Moritz Wolter
Reinhard Klein
Jochen Garcke
15
3
0
03 Jun 2022
Machine Learning for Microcontroller-Class Hardware: A Review
Swapnil Sayan Saha
S. Sandha
Mani B. Srivastava
24
118
0
29 May 2022
lpSpikeCon: Enabling Low-Precision Spiking Neural Network Processing for Efficient Unsupervised Continual Learning on Autonomous Agents
Rachmad Vidya Wicaksana Putra
Muhammad Shafique
26
16
0
24 May 2022
OPQ: Compressing Deep Neural Networks with One-shot Pruning-Quantization
Peng Hu
Xi Peng
Hongyuan Zhu
M. Aly
Jie Lin
MQ
41
59
0
23 May 2022
Semi-Supervised Learning for Image Classification using Compact Networks in the BioMedical Context
Adrián Inés
Andrés Díaz-Pinto
C. Domínguez
Jónathan Heras
Eloy J. Mata
Vico Pascual
19
1
0
19 May 2022
Adaptive Block Floating-Point for Analog Deep Learning Hardware
Ayon Basumallik
D. Bunandar
Nicholas Dronen
Nicholas Harris
Ludmila Levkova
Calvin McCarter
Lakshmi Nair
David Walter
David Widemann
9
6
0
12 May 2022
Lite Pose: Efficient Architecture Design for 2D Human Pose Estimation
Yihan Wang
Muyang Li
Han Cai
Wei-Ming Chen
Song Han
3DH
18
71
0
03 May 2022
RAPQ: Rescuing Accuracy for Power-of-Two Low-bit Post-training Quantization
Hongyi Yao
Pu Li
Jian Cao
Xiangcheng Liu
Chenying Xie
Bin Wang
MQ
19
12
0
26 Apr 2022
Multi-Component Optimization and Efficient Deployment of Neural-Networks on Resource-Constrained IoT Hardware
B. Sudharsan
Dineshkumar Sundaram
Pankesh Patel
J. Breslin
M. Ali
Schahram Dustdar
Albert Zomaya
R. Ranjan
13
2
0
20 Apr 2022
High Efficiency Pedestrian Crossing Prediction
Z. Zeng
26
0
0
04 Apr 2022
To Fold or Not to Fold: a Necessary and Sufficient Condition on Batch-Normalization Layers Folding
Edouard Yvinec
Arnaud Dapogny
Kévin Bailly
27
7
0
28 Mar 2022
REx: Data-Free Residual Quantization Error Expansion
Edouard Yvinec
Arnaud Dapogny
Matthieu Cord
Kévin Bailly
MQ
28
8
0
28 Mar 2022
SPIQ: Data-Free Per-Channel Static Input Quantization
Edouard Yvinec
Arnaud Dapogny
Matthieu Cord
Kévin Bailly
MQ
16
18
0
28 Mar 2022
Vision Transformer Compression with Structured Pruning and Low Rank Approximation
Ankur Kumar
ViT
15
6
0
25 Mar 2022
FxP-QNet: A Post-Training Quantizer for the Design of Mixed Low-Precision DNNs with Dynamic Fixed-Point Representation
Ahmad Shawahna
S. M. Sait
A. El-Maleh
Irfan Ahmad
MQ
18
6
0
22 Mar 2022
Overcoming Oscillations in Quantization-Aware Training
Markus Nagel
Marios Fournarakis
Yelysei Bondarenko
Tijmen Blankevoort
MQ
111
101
0
21 Mar 2022
Online Continual Learning for Embedded Devices
Tyler L. Hayes
Christopher Kanan
CLL
25
54
0
21 Mar 2022