HAQ: Hardware-Aware Automated Quantization with Mixed Precision
Computer Vision and Pattern Recognition (CVPR), 2019
arXiv:1811.08886, 21 November 2018
Kuan Wang
Zhijian Liu
Yujun Lin
Ji Lin
Song Han
MQ
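For readers skimming the citation list below, the sketch that follows illustrates what per-layer mixed-precision weight quantization means in practice. It is only a minimal illustration, not HAQ's method: HAQ searches the per-layer bitwidth assignment automatically with a reinforcement-learning agent driven by hardware feedback, whereas here the layer names and bitwidths are hypothetical and fixed by hand.

import numpy as np

def quantize_uniform(w, bits):
    # Symmetric uniform "fake" quantization of a weight tensor to `bits` bits:
    # map to the integer grid, round, clip, then map back to floats.
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    q = np.clip(np.round(w / scale), -qmax, qmax)
    return q * scale

# Mixed precision: each layer gets its own bitwidth (hypothetical values).
rng = np.random.default_rng(0)
layer_weights = {
    "conv1": rng.standard_normal((64, 3, 3, 3)),
    "conv2": rng.standard_normal((128, 64, 3, 3)),
    "fc": rng.standard_normal((1000, 512)),
}
bitwidths = {"conv1": 8, "conv2": 4, "fc": 6}

for name, w in layer_weights.items():
    wq = quantize_uniform(w, bitwidths[name])
    print(f"{name}: {bitwidths[name]}-bit, mean abs error {np.abs(w - wq).mean():.4f}")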

Papers citing "HAQ: Hardware-Aware Automated Quantization with Mixed Precision"

50 / 464 papers shown
MinUn: Accurate ML Inference on Microcontrollers
ACM SIGPLAN Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES), 2022
Shikhar Jaiswal
R. Goli
Aayan Kumar
Vivek Seshadri
Rahul Sharma
29 Oct 2022
Fast DistilBERT on CPUs
Haihao Shen
Ofir Zafrir
Bo Dong
Hengyu Meng
Xinyu Ye
Zhe Wang
Yi Ding
Hanwen Chang
Guy Boudoukh
Moshe Wasserblat
VLM
27 Oct 2022
Zero-Shot Learning of a Conditional Generative Adversarial Network for Data-Free Network Quantization
IEEE International Conference on Image Processing (ICIP), 2021
Yoojin Choi
Mostafa El-Khamy
Jungwon Lee
GAN
26 Oct 2022
Approximating Continuous Convolutions for Deep Network Compression
British Machine Vision Conference (BMVC), 2022
Theo W. Costain
V. Prisacariu
17 Oct 2022
ODG-Q: Robust Quantization via Online Domain Generalization
International Conference on Pattern Recognition (ICPR), 2022
Chaofan Tao
Ngai Wong
MQ
17 Oct 2022
FIT: A Metric for Model Sensitivity
International Conference on Learning Representations (ICLR), 2022
Ben Zandonati
Adrian Alan Pol
M. Pierini
Olya Sirkin
Tal Kopetz
MQ
16 Oct 2022
Deep learning model compression using network sensitivity and gradients
M. Sakthi
N. Yadla
Raj Pawate
11 Oct 2022
Energy-Efficient Deployment of Machine Learning Workloads on Neuromorphic Hardware
International Green and Sustainable Computing Conference (GSC), 2022
Peyton S. Chandarana
Mohammadreza Mohammadi
J. Seekings
Ramtin Zand
10 Oct 2022
In-situ Model Downloading to Realize Versatile Edge AI in 6G Mobile Networks
IEEE Wireless Communications (IEEE Wireless Commun.), 2022
Kaibin Huang
Hai Wu
Zhiyan Liu
Xiaojuan Qi
07 Oct 2022
Efficient Quantized Sparse Matrix Operations on Tensor Cores
International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2022
Shigang Li
Kazuki Osawa
Torsten Hoefler
14 Sep 2022
Human Activity Recognition on Microcontrollers with Quantized and Adaptive Deep Neural Networks
ACM Transactions on Embedded Computing Systems (TECS), 2022
Francesco Daghero
Luca Bompani
Chen Xie
Marco Castellano
Luca Gandolfi
A. Calimera
Enrico Macii
Massimo Poncino
Daniele Jahier Pagliari
BDL, HAI
02 Sep 2022
ANT: Exploiting Adaptive Numerical Data Type for Low-bit Deep Neural Network Quantization
IEEE/ACM International Symposium on Microarchitecture (MICRO), 2022
Cong Guo
Chen Zhang
Jingwen Leng
Zihan Liu
Fan Yang
Yun-Bo Liu
Minyi Guo
Yuhao Zhu
MQ
30 Aug 2022
SONAR: Joint Architecture and System Optimization Search
Elias Jääsaari
Michelle Ma
Ameet Talwalkar
Tianqi Chen
25 Aug 2022
Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning
Neural Information Processing Systems (NeurIPS), 2022
Elias Frantar
Sidak Pal Singh
Dan Alistarh
MQ
24 Aug 2022
Design Automation for Fast, Lightweight, and Effective Deep Learning Models: A Survey
Dalin Zhang
Kaixuan Chen
Yan Zhao
B. Yang
Li-Ping Yao
Christian S. Jensen
22 Aug 2022
Combining Gradients and Probabilities for Heterogeneous Approximation of Neural Networks
E. Trommer
Bernd Waschneck
Akash Kumar
15 Aug 2022
Mixed-Precision Neural Networks: A Survey
M. Rakka
M. Fouda
Pramod P. Khargonekar
Fadi J. Kurdahi
MQ
11 Aug 2022
Auto-ViT-Acc: An FPGA-Aware Automatic Acceleration Framework for Vision Transformer with Mixed-Scheme Quantization
International Conference on Field-Programmable Logic and Applications (FPL), 2022
Hao Sun
Mengshu Sun
Alec Lu
Haoyu Ma
Geng Yuan
...
Yanyu Li
M. Leeser
Zinan Lin
Xue Lin
Zhenman Fang
ViT, MQ
10 Aug 2022
Design of High-Throughput Mixed-Precision CNN Accelerators on FPGA
International Conference on Field-Programmable Logic and Applications (FPL), 2022
Cecilia Latotzke
Tim Ciesielski
T. Gemmeke
MQ
09 Aug 2022
Quantized Sparse Weight Decomposition for Neural Network Compression
Andrey Kuzmin
M. V. Baalen
Markus Nagel
Arash Behboodi
MQ
22 Jul 2022
CADyQ: Content-Aware Dynamic Quantization for Image Super-Resolution
European Conference on Computer Vision (ECCV), 2022
Chee Hong
Sungyong Baik
Heewon Kim
Seungjun Nah
Kyoung Mu Lee
SupR, MQ
21 Jul 2022
Bitwidth-Adaptive Quantization-Aware Neural Network Training: A Meta-Learning Approach
European Conference on Computer Vision (ECCV), 2022
Jiseok Youn
Jaehun Song
Hyung-Sin Kim
S. Bahk
MQ
20 Jul 2022
Mixed-Precision Inference Quantization: Radically Towards Faster inference speed, Lower Storage requirement, and Lower Loss
Daning Cheng
Wenguang Chen
MQ
20 Jul 2022
Learnable Mixed-precision and Dimension Reduction Co-design for Low-storage Activation
IEEE Workshop on Signal Processing Systems (SiPS), 2022
Yu-Shan Tai
Cheng-Yang Chang
Chieh-Fang Teng
An-Yeu Wu
16 Jul 2022
STI: Turbocharge NLP Inference at the Edge via Elastic Pipelining
International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2022
Liwei Guo
Wonkyo Choe
F. Lin
11 Jul 2022
Dynamic Spatial Sparsification for Efficient Vision Transformers and Convolutional Neural Networks
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Yongming Rao
Zuyan Liu
Wenliang Zhao
Jie Zhou
Jiwen Lu
ViT
04 Jul 2022
I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference
IEEE International Conference on Computer Vision (ICCV), 2023
Zhikai Li
Qingyi Gu
MQ
04 Jul 2022
On-Device Training Under 256KB Memory
Neural Information Processing Systems (NeurIPS), 2022
Ji Lin
Ligeng Zhu
Wei-Ming Chen
Wei-Chen Wang
Chuang Gan
Song Han
MQ
30 Jun 2022
QUIDAM: A Framework for Quantization-Aware DNN Accelerator and Model Co-Exploration
ACM Transactions on Embedded Computing Systems (TECS), 2022
A. Inci
Siri Garudanagiri Virupaksha
Aman Jain
Ting-Wu Chin
Venkata Vivek Thallam
Ruizhou Ding
Diana Marculescu
MQ
30 Jun 2022
Computational Complexity Evaluation of Neural Network Applications in Signal Processing
Pedro J. Freire
S. Srivallapanondh
A. Napoli
Jaroslaw E. Prilepsky
S. Turitsyn
24 Jun 2022
Channel-wise Mixed-precision Assignment for DNN Inference on Constrained Edge Nodes
International Green and Sustainable Computing Conference (GSC), 2022
Matteo Risso
Luca Bompani
Luca Benini
Enrico Macii
Massimo Poncino
Daniele Jahier Pagliari
MQ
17 Jun 2022
Edge Inference with Fully Differentiable Quantized Mixed Precision Neural Networks
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Clemens J. S. Schaefer
Siddharth Joshi
Shane Li
Raul Blazquez
MQ
15 Jun 2022
SDQ: Stochastic Differentiable Quantization with Mixed Precision
International Conference on Machine Learning (ICML), 2022
Xijie Huang
Zhiqiang Shen
Shichao Li
Zechun Liu
Xianghong Hu
Jeffry Wicaksana
Eric P. Xing
Kwang-Ting Cheng
MQ
09 Jun 2022
NIPQ: Noise proxy-based Integrated Pseudo-Quantization
Computer Vision and Pattern Recognition (CVPR), 2022
Juncheol Shin
Junhyuk So
Sein Park
Seungyeop Kang
S. Yoo
Eunhyeok Park
02 Jun 2022
Machine Learning for Microcontroller-Class Hardware: A Review
IEEE Sensors Journal (IEEE Sens. J.), 2022
Swapnil Sayan Saha
S. Sandha
Mani B. Srivastava
29 May 2022
OPQ: Compressing Deep Neural Networks with One-shot Pruning-Quantization
AAAI Conference on Artificial Intelligence (AAAI), 2021
Peng Hu
Xi Peng
Erik Cambria
M. Aly
Jie Lin
MQ
23 May 2022
A Comprehensive Survey on Model Quantization for Deep Neural Networks in Image Classification
ACM Transactions on Intelligent Systems and Technology (ACM TIST), 2022
Babak Rokh
A. Azarpeyvand
Alireza Khanteymoori
MQ
14 May 2022
Fast Conditional Network Compression Using Bayesian HyperNetworks
Phuoc Nguyen
T. Tran
Ky Le
Sunil R. Gupta
Santu Rana
Dang Nguyen
Trong Nguyen
S. Ryan
Svetha Venkatesh
BDL
13 May 2022
Revisiting Random Channel Pruning for Neural Network Compression
Computer Vision and Pattern Recognition (CVPR), 2022
Yawei Li
Kamil Adamczewski
Wen Li
Shuhang Gu
Radu Timofte
Luc Van Gool
11 May 2022
A Collaboration Strategy in the Mining Pool for Proof-of-Neural-Architecture Consensus
Boyang Albert Li
Qing Lu
Weiwen Jiang
Taeho Jung
Yiyu Shi
05 May 2022
Lite Pose: Efficient Architecture Design for 2D Human Pose Estimation
Computer Vision and Pattern Recognition (CVPR), 2022
Yihan Wang
Zhekai Zhang
Han Cai
Wei-Ming Chen
Song Han
3DH
03 May 2022
PVNAS: 3D Neural Architecture Search with Point-Voxel Convolution
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Zhijian Liu
Haotian Tang
Shengyu Zhao
Kevin Shao
Song Han
3DPC
25 Apr 2022
Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications
Han Cai
Ji Lin
Chengyue Wu
Zhijian Liu
Haotian Tang
Hanrui Wang
Ligeng Zhu
Song Han
25 Apr 2022
SplitNets: Designing Neural Architectures for Efficient Distributed Computing on Head-Mounted Systems
Computer Vision and Pattern Recognition (CVPR), 2022
Xin Dong
B. D. Salvo
Meng Li
Chiao Liu
Zhongnan Qu
H. T. Kung
Ziyun Li
3DGS
10 Apr 2022
LilNetX: Lightweight Networks with EXtreme Model Compression and Structured Sparsification
International Conference on Learning Representations (ICLR), 2022
Sharath Girish
Kamal Gupta
Saurabh Singh
Abhinav Shrivastava
06 Apr 2022
REx: Data-Free Residual Quantization Error Expansion
Neural Information Processing Systems (NeurIPS), 2022
Edouard Yvinec
Arnaud Dapogny
Matthieu Cord
Kévin Bailly
MQ
28 Mar 2022
FxP-QNet: A Post-Training Quantizer for the Design of Mixed Low-Precision DNNs with Dynamic Fixed-Point Representation
IEEE Access, 2022
Ahmad Shawahna
S. M. Sait
A. El-Maleh
Irfan Ahmad
MQ
22 Mar 2022
LDP: Learnable Dynamic Precision for Efficient Deep Neural Network Training and Inference
Zhongzhi Yu
Y. Fu
Shang Wu
Mengquan Li
Haoran You
Yingyan Lin
15 Mar 2022
F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization
International Conference on Learning Representations (ICLR), 2022
Qing Jin
Jian Ren
Richard Zhuang
Sumant Hanumante
Zhengang Li
Zhiyu Chen
Yanzhi Wang
Kai-Min Yang
Sergey Tulyakov
MQ
10 Feb 2022
Quantization in Layer's Input is Matter
Daning Cheng
Wenguang Chen
MQ
10 Feb 2022