1811.08886
HAQ: Hardware-Aware Automated Quantization with Mixed Precision
Computer Vision and Pattern Recognition (CVPR), 2019
21 November 2018
Kuan Wang
Zhijian Liu
Yujun Lin
Ji Lin
Song Han
MQ
Papers citing "HAQ: Hardware-Aware Automated Quantization with Mixed Precision"
50 / 464 papers shown
Title
MinUn: Accurate ML Inference on Microcontrollers
ACM SIGPLAN Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES), 2022
Shikhar Jaiswal
R. Goli
Aayan Kumar
Vivek Seshadri
Rahul Sharma
265
5
0
29 Oct 2022
Fast DistilBERT on CPUs
Haihao Shen
Ofir Zafrir
Bo Dong
Hengyu Meng
Xinyu Ye
Zhe Wang
Yi Ding
Hanwen Chang
Guy Boudoukh
Moshe Wasserblat
VLM
237
2
0
27 Oct 2022
Zero-Shot Learning of a Conditional Generative Adversarial Network for Data-Free Network Quantization
IEEE International Conference on Image Processing (ICIP), 2021
Yoojin Choi
Mostafa El-Khamy
Jungwon Lee
GAN
145
1
0
26 Oct 2022
Approximating Continuous Convolutions for Deep Network Compression
British Machine Vision Conference (BMVC), 2022
Theo W. Costain
V. Prisacariu
159
0
0
17 Oct 2022
ODG-Q: Robust Quantization via Online Domain Generalization
International Conference on Pattern Recognition (ICPR), 2022
Chaofan Tao
Ngai Wong
MQ
137
1
0
17 Oct 2022
FIT: A Metric for Model Sensitivity
International Conference on Learning Representations (ICLR), 2022
Ben Zandonati
Adrian Alan Pol
M. Pierini
Olya Sirkin
Tal Kopetz
MQ
232
9
0
16 Oct 2022
Deep learning model compression using network sensitivity and gradients
M. Sakthi
N. Yadla
Raj Pawate
160
2
0
11 Oct 2022
Energy-Efficient Deployment of Machine Learning Workloads on Neuromorphic Hardware
International Green and Sustainable Computing Conference (GSC), 2022
Peyton S. Chandarana
Mohammadreza Mohammadi
J. Seekings
Ramtin Zand
208
8
0
10 Oct 2022
In-situ Model Downloading to Realize Versatile Edge AI in 6G Mobile Networks
IEEE Wireless Communications (IEEE Wireless Commun.), 2022
Kaibin Huang
Hai Wu
Zhiyan Liu
Xiaojuan Qi
190
13
0
07 Oct 2022
Efficient Quantized Sparse Matrix Operations on Tensor Cores
International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2022
Shigang Li
Kazuki Osawa
Torsten Hoefler
384
45
0
14 Sep 2022
Human Activity Recognition on Microcontrollers with Quantized and Adaptive Deep Neural Networks
ACM Transactions on Embedded Computing Systems (TECS), 2022
Francesco Daghero
Luca Bompani
Chen Xie
Marco Castellano
Luca Gandolfi
A. Calimera
Enrico Macii
Massimo Poncino
Daniele Jahier Pagliari
BDL
HAI
117
33
0
02 Sep 2022
ANT: Exploiting Adaptive Numerical Data Type for Low-bit Deep Neural Network Quantization
IEEE/ACM International Symposium on Microarchitecture (MICRO), 2022
Cong Guo
Chen Zhang
Jingwen Leng
Zihan Liu
Fan Yang
Yun-Bo Liu
Minyi Guo
Yuhao Zhu
MQ
170
93
0
30 Aug 2022
SONAR: Joint Architecture and System Optimization Search
Elias Jääsaari
Michelle Ma
Ameet Talwalkar
Tianqi Chen
154
1
0
25 Aug 2022
Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning
Neural Information Processing Systems (NeurIPS), 2022
Elias Frantar
Sidak Pal Singh
Dan Alistarh
MQ
431
321
0
24 Aug 2022
Design Automation for Fast, Lightweight, and Effective Deep Learning Models: A Survey
Dalin Zhang
Kaixuan Chen
Yan Zhao
B. Yang
Li-Ping Yao
Christian S. Jensen
257
4
0
22 Aug 2022
Combining Gradients and Probabilities for Heterogeneous Approximation of Neural Networks
E. Trommer
Bernd Waschneck
Akash Kumar
119
9
0
15 Aug 2022
Mixed-Precision Neural Networks: A Survey
M. Rakka
M. Fouda
Pramod P. Khargonekar
Fadi J. Kurdahi
MQ
292
19
0
11 Aug 2022
Auto-ViT-Acc: An FPGA-Aware Automatic Acceleration Framework for Vision Transformer with Mixed-Scheme Quantization
International Conference on Field-Programmable Logic and Applications (FPL), 2022
Hao Sun
Mengshu Sun
Alec Lu
Haoyu Ma
Geng Yuan
...
Yanyu Li
M. Leeser
Zinan Lin
Xue Lin
Zhenman Fang
ViT
MQ
151
68
0
10 Aug 2022
Design of High-Throughput Mixed-Precision CNN Accelerators on FPGA
International Conference on Field-Programmable Logic and Applications (FPL), 2022
Cecilia Latotzke
Tim Ciesielski
T. Gemmeke
MQ
153
12
0
09 Aug 2022
Quantized Sparse Weight Decomposition for Neural Network Compression
Andrey Kuzmin
M. V. Baalen
Markus Nagel
Arash Behboodi
MQ
136
3
0
22 Jul 2022
CADyQ: Content-Aware Dynamic Quantization for Image Super-Resolution
European Conference on Computer Vision (ECCV), 2022
Chee Hong
Sungyong Baik
Heewon Kim
Seungjun Nah
Kyoung Mu Lee
SupR
MQ
246
39
0
21 Jul 2022
Bitwidth-Adaptive Quantization-Aware Neural Network Training: A Meta-Learning Approach
European Conference on Computer Vision (ECCV), 2022
Jiseok Youn
Jaehun Song
Hyung-Sin Kim
S. Bahk
MQ
161
10
0
20 Jul 2022
Mixed-Precision Inference Quantization: Radically Towards Faster inference speed, Lower Storage requirement, and Lower Loss
Daning Cheng
Wenguang Chen
MQ
159
0
0
20 Jul 2022
Learnable Mixed-precision and Dimension Reduction Co-design for Low-storage Activation
IEEE Workshop on Signal Processing Systems (SiPS), 2022
Yu-Shan Tai
Cheng-Yang Chang
Chieh-Fang Teng
An-Yeu Wu
196
5
0
16 Jul 2022
STI: Turbocharge NLP Inference at the Edge via Elastic Pipelining
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2022
Liwei Guo
Wonkyo Choe
F. Lin
174
21
0
11 Jul 2022
Dynamic Spatial Sparsification for Efficient Vision Transformers and Convolutional Neural Networks
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Yongming Rao
Zuyan Liu
Wenliang Zhao
Jie Zhou
Jiwen Lu
ViT
218
50
0
04 Jul 2022
I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference
IEEE International Conference on Computer Vision (ICCV), 2023
Zhikai Li
Qingyi Gu
MQ
386
146
0
04 Jul 2022
On-Device Training Under 256KB Memory
Neural Information Processing Systems (NeurIPS), 2022
Ji Lin
Ligeng Zhu
Wei-Ming Chen
Wei-Chen Wang
Chuang Gan
Song Han
MQ
409
259
0
30 Jun 2022
QUIDAM: A Framework for Quantization-Aware DNN Accelerator and Model Co-Exploration
ACM Transactions on Embedded Computing Systems (TECS), 2022
A. Inci
Siri Garudanagiri Virupaksha
Aman Jain
Ting-Wu Chin
Venkata Vivek Thallam
Ruizhou Ding
Diana Marculescu
MQ
174
5
0
30 Jun 2022
Computational Complexity Evaluation of Neural Network Applications in Signal Processing
Pedro J. Freire
S. Srivallapanondh
A. Napoli
Jaroslaw E. Prilepsky
S. Turitsyn
171
1
0
24 Jun 2022
Channel-wise Mixed-precision Assignment for DNN Inference on Constrained Edge Nodes
International Green and Sustainable Computing Conference (GSC), 2022
Matteo Risso
Luca Bompani
Luca Benini
Enrico Macii
Massimo Poncino
Daniele Jahier Pagliari
MQ
190
14
0
17 Jun 2022
Edge Inference with Fully Differentiable Quantized Mixed Precision Neural Networks
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Clemens J. S. Schaefer
Siddharth Joshi
Shane Li
Raul Blazquez
MQ
162
14
0
15 Jun 2022
SDQ: Stochastic Differentiable Quantization with Mixed Precision
International Conference on Machine Learning (ICML), 2022
Xijie Huang
Zhiqiang Shen
Shichao Li
Zechun Liu
Xianghong Hu
Jeffry Wicaksana
Eric P. Xing
Kwang-Ting Cheng
MQ
358
44
0
09 Jun 2022
NIPQ: Noise proxy-based Integrated Pseudo-Quantization
Computer Vision and Pattern Recognition (CVPR), 2022
Juncheol Shin
Junhyuk So
Sein Park
Seungyeop Kang
S. Yoo
Eunhyeok Park
166
41
0
02 Jun 2022
Machine Learning for Microcontroller-Class Hardware: A Review
IEEE Sensors Journal (IEEE Sens. J.), 2022
Swapnil Sayan Saha
S. Sandha
Mani B. Srivastava
517
169
0
29 May 2022
OPQ: Compressing Deep Neural Networks with One-shot Pruning-Quantization
AAAI Conference on Artificial Intelligence (AAAI), 2021
Peng Hu
Xi Peng
Erik Cambria
M. Aly
Jie Lin
MQ
228
73
0
23 May 2022
A Comprehensive Survey on Model Quantization for Deep Neural Networks in Image Classification
ACM Transactions on Intelligent Systems and Technology (ACM TIST), 2022
Babak Rokh
A. Azarpeyvand
Alireza Khanteymoori
MQ
385
170
0
14 May 2022
Fast Conditional Network Compression Using Bayesian HyperNetworks
Phuoc Nguyen
T. Tran
Ky Le
Sunil R. Gupta
Santu Rana
Dang Nguyen
Trong Nguyen
S. Ryan
Svetha Venkatesh
BDL
119
7
0
13 May 2022
Revisiting Random Channel Pruning for Neural Network Compression
Computer Vision and Pattern Recognition (CVPR), 2022
Yawei Li
Kamil Adamczewski
Wen Li
Shuhang Gu
Radu Timofte
Luc Van Gool
197
107
0
11 May 2022
A Collaboration Strategy in the Mining Pool for Proof-of-Neural-Architecture Consensus
Boyang Albert Li
Qing Lu
Weiwen Jiang
Taeho Jung
Yiyu Shi
140
7
0
05 May 2022
Lite Pose: Efficient Architecture Design for 2D Human Pose Estimation
Computer Vision and Pattern Recognition (CVPR), 2022
Yihan Wang
Zhekai Zhang
Han Cai
Wei-Ming Chen
Song Han
3DH
395
86
0
03 May 2022
PVNAS: 3D Neural Architecture Search with Point-Voxel Convolution
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Zhijian Liu
Haotian Tang
Shengyu Zhao
Kevin Shao
Song Han
3DPC
152
48
0
25 Apr 2022
Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications
Han Cai
Ji Lin
Yujun Lin
Zhijian Liu
Haotian Tang
Hanrui Wang
Ligeng Zhu
Song Han
230
132
0
25 Apr 2022
SplitNets: Designing Neural Architectures for Efficient Distributed Computing on Head-Mounted Systems
Computer Vision and Pattern Recognition (CVPR), 2022
Xin Dong
B. D. Salvo
Meng Li
Chiao Liu
Zhongnan Qu
H. T. Kung
Ziyun Li
3DGS
155
24
0
10 Apr 2022
LilNetX: Lightweight Networks with EXtreme Model Compression and Structured Sparsification
International Conference on Learning Representations (ICLR), 2022
Sharath Girish
Kamal Gupta
Saurabh Singh
Abhinav Shrivastava
192
12
0
06 Apr 2022
REx: Data-Free Residual Quantization Error Expansion
Neural Information Processing Systems (NeurIPS), 2022
Edouard Yvinec
Arnaud Dapogny
Matthieu Cord
Kévin Bailly
MQ
322
9
0
28 Mar 2022
FxP-QNet: A Post-Training Quantizer for the Design of Mixed Low-Precision DNNs with Dynamic Fixed-Point Representation
IEEE Access (IEEE Access), 2022
Ahmad Shawahna
S. M. Sait
A. El-Maleh
Irfan Ahmad
MQ
136
8
0
22 Mar 2022
LDP: Learnable Dynamic Precision for Efficient Deep Neural Network Training and Inference
Zhongzhi Yu
Y. Fu
Shang Wu
Mengquan Li
Haoran You
Yingyan Lin
162
2
0
15 Mar 2022
F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization
International Conference on Learning Representations (ICLR), 2022
Qing Jin
Jian Ren
Richard Zhuang
Sumant Hanumante
Zhengang Li
Zhiyu Chen
Yanzhi Wang
Kai-Min Yang
Sergey Tulyakov
MQ
319
54
0
10 Feb 2022
Quantization in Layer's Input is Matter
Daning Cheng
Wenguang Chen
MQ
115
0
0
10 Feb 2022