HAQ: Hardware-Aware Automated Quantization with Mixed Precision

Computer Vision and Pattern Recognition (CVPR), 2019

21 November 2018

Zhijian Liu

Song Han

Papers citing "HAQ: Hardware-Aware Automated Quantization with Mixed Precision"

50 / 464 papers shown

MinUn: Accurate ML Inference on Microcontrollers

ACM SIGPLAN Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES), 2022

284

29 Oct 2022

Fast DistilBERT on CPUs

243

27 Oct 2022

Zero-Shot Learning of a Conditional Generative Adversarial Network for Data-Free Network Quantization

IEEE International Conference on Image Processing (ICIP), 2021

153

26 Oct 2022

Approximating Continuous Convolutions for Deep Network Compression

British Machine Vision Conference (BMVC), 2022

Theo W. Costain

V. Prisacariu

174

17 Oct 2022

ODG-Q: Robust Quantization via Online Domain Generalization

International Conference on Pattern Recognition (ICPR), 2022

Chaofan Tao

Ngai Wong

156

17 Oct 2022

FIT: A Metric for Model Sensitivity

International Conference on Learning Representations (ICLR), 2022

253

16 Oct 2022

Deep learning model compression using network sensitivity and gradients

M. Sakthi

N. Yadla

Raj Pawate

168

11 Oct 2022

Energy-Efficient Deployment of Machine Learning Workloads on Neuromorphic Hardware

International Green and Sustainable Computing Conference (GSC), 2022

Peyton S. Chandarana

Mohammadreza Mohammadi

J. Seekings

Ramtin Zand

210

10 Oct 2022

In-situ Model Downloading to Realize Versatile Edge AI in 6G Mobile Networks

IEEE Wireless Communications (IEEE Wireless Commun.), 2022

Kaibin Huang

Hai Wu

Zhiyan Liu

Xiaojuan Qi

192

07 Oct 2022

Efficient Quantized Sparse Matrix Operations on Tensor Cores

International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2022

Shigang Li

Kazuki Osawa

Torsten Hoefler

415

14 Sep 2022

Human Activity Recognition on Microcontrollers with Quantized and Adaptive Deep Neural Networks

ACM Transactions on Embedded Computing Systems (TECS), 2022

Daniele Jahier Pagliari

BDL HAI

118

02 Sep 2022

ANT: Exploiting Adaptive Numerical Data Type for Low-bit Deep Neural Network Quantization

Micro (MICRO), 2022

Cong Guo

Chen Zhang

Jingwen Leng

Zihan Liu

Fan Yang

Yun-Bo Liu

Minyi Guo

Yuhao Zhu

177

30 Aug 2022

SONAR: Joint Architecture and System Optimization Search

184

25 Aug 2022

Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning

Neural Information Processing Systems (NeurIPS), 2022

Elias Frantar

Sidak Pal Singh

Dan Alistarh

436

322

24 Aug 2022

Design Automation for Fast, Lightweight, and Effective Deep Learning Models: A Survey

269

22 Aug 2022

Combining Gradients and Probabilities for Heterogeneous Approximation of Neural Networks

E. Trommer

Bernd Waschneck

Akash Kumar

145

15 Aug 2022

Mixed-Precision Neural Networks: A Survey

M. Rakka

M. Fouda

Pramod P. Khargonekar

Fadi J. Kurdahi

296

11 Aug 2022

Auto-ViT-Acc: An FPGA-Aware Automatic Acceleration Framework for Vision Transformer with Mixed-Scheme Quantization

International Conference on Field-Programmable Logic and Applications (FPL), 2022

...

160

10 Aug 2022

Design of High-Throughput Mixed-Precision CNN Accelerators on FPGA

International Conference on Field-Programmable Logic and Applications (FPL), 2022

Cecilia Latotzke

Tim Ciesielski

T. Gemmeke

175

09 Aug 2022

Quantized Sparse Weight Decomposition for Neural Network Compression

141

22 Jul 2022

CADyQ: Content-Aware Dynamic Quantization for Image Super-Resolution

European Conference on Computer Vision (ECCV), 2022

Kyoung Mu Lee

255

21 Jul 2022

Bitwidth-Adaptive Quantization-Aware Neural Network Training: A Meta-Learning Approach

European Conference on Computer Vision (ECCV), 2022

171

20 Jul 2022

Mixed-Precision Inference Quantization: Radically Towards Faster Inference Speed, Lower Storage Requirement, and Lower Loss

Daning Cheng

Wenguang Chen

181

20 Jul 2022

Learnable Mixed-precision and Dimension Reduction Co-design for Low-storage Activation

IEEE Workshop on Signal Processing Systems (SiPS), 2022

198

16 Jul 2022

STI: Turbocharge NLP Inference at the Edge via Elastic Pipelining

International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2022

Liwei Guo

Wonkyo Choe

F. Lin

194

11 Jul 2022