ResearchTrend.AI
HAQ: Hardware-Aware Automated Quantization with Mixed Precision

21 November 2018 · arXiv:1811.08886
Kuan Wang, Zhijian Liu, Yujun Lin, Ji Lin, Song Han
MQ

Papers citing "HAQ: Hardware-Aware Automated Quantization with Mixed Precision"

50 / 435 papers shown
OMPQ: Orthogonal Mixed Precision Quantization
Yuexiao Ma, Taisong Jin, Xiawu Zheng, Yan Wang, Huixia Li, Yongjian Wu, Guannan Jiang, Wei Zhang, Rongrong Ji
MQ · 16 Sep 2021

Elastic Significant Bit Quantization and Acceleration for Deep Neural Networks
Cheng Gong, Ye Lu, Kunpeng Xie, Zongming Jin, Tao Li, Yanzhi Wang
MQ · 08 Sep 2021

BioNetExplorer: Architecture-Space Exploration of Bio-Signal Processing Deep Neural Networks for Wearables
B. Prabakaran, Asima Akhtar, Semeen Rehman, Osman Hasan, Muhammad Shafique
07 Sep 2021

Cluster-Promoting Quantization with Bit-Drop for Minimizing Network Quantization Loss
J. H. Lee, Jihun Yun, S. Hwang, Eunho Yang
MQ · 05 Sep 2021

Architecture Aware Latency Constrained Sparse Neural Networks
Tianli Zhao, Qinghao Hu, Xiangyu He, Weixiang Xu, Jiaxing Wang, Cong Leng, Jian Cheng
01 Sep 2021

Efficient Visual Recognition with Deep Neural Networks: A Survey on Recent Advances and New Directions
Yang Wu, Dingheng Wang, Xiaotong Lu, Fan Yang, Guoqi Li, W. Dong, Jianbo Shi
30 Aug 2021

Auto-Split: A General Framework of Collaborative Edge-Cloud AI
Amin Banitalebi-Dehkordi, Naveen Vedula, J. Pei, Fei Xia, Lanjun Wang, Yong Zhang
30 Aug 2021

DKM: Differentiable K-Means Clustering Layer for Neural Network Compression
Minsik Cho, Keivan Alizadeh Vahid, Saurabh N. Adya, Mohammad Rastegari
28 Aug 2021

Dynamic Network Quantization for Efficient Video Inference
Ximeng Sun, Rameswar Panda, Chun-Fu Chen, A. Oliva, Rogerio Feris, Kate Saenko
23 Aug 2021

On the Acceleration of Deep Neural Network Inference using Quantized Compressed Sensing
Meshia Cédric Oveneke
MQ · 23 Aug 2021

Online Multi-Granularity Distillation for GAN Compression
Yuxi Ren, Jie Wu, Xuefeng Xiao, Jianchao Yang
16 Aug 2021

Generalizable Mixed-Precision Quantization via Attribution Rank Preservation
Ziwei Wang, Han Xiao, Jiwen Lu, Jie Zhou
MQ · 05 Aug 2021

MOHAQ: Multi-Objective Hardware-Aware Quantization of Recurrent Neural Networks
Nesma M. Rezk, Tomas Nordstrom, D. Stathis, Z. Ul-Abdin, E. Aksoy, A. Hemani
MQ · 02 Aug 2021

Pruning Ternary Quantization
Danyang Liu, Xiangshan Chen, Jie Fu, Chen-li Ma, Xue Liu
MQ · 23 Jul 2021

LANA: Latency Aware Network Acceleration
Pavlo Molchanov, Jimmy Hall, Hongxu Yin, Jan Kautz, Nicolò Fusi, Arash Vahdat
12 Jul 2021

HEMP: High-order Entropy Minimization for neural network comPression
Enzo Tartaglione, Stéphane Lathuilière, A. Fiandrotti, Marco Cagnazzo, Marco Grangetto
MQ · 12 Jul 2021

Post-Training Quantization for Vision Transformer
Zhenhua Liu, Yunhe Wang, Kai Han, Siwei Ma, Wen Gao
ViT, MQ · 27 Jun 2021

APNN-TC: Accelerating Arbitrary Precision Neural Networks on Ampere GPU Tensor Cores
Boyuan Feng, Yuke Wang, Tong Geng, Ang Li, Yufei Ding
MQ · 23 Jun 2021

Neuroevolution-Enhanced Multi-Objective Optimization for Mixed-Precision Quantization
Santiago Miret, Vui Seng Chua, Mattias Marder, Mariano Phielipp, Nilesh Jain, Somdeb Majumdar
14 Jun 2021

Sparse PointPillars: Maintaining and Exploiting Input Sparsity to Improve Runtime on Embedded Systems
Kyle Vedder, Eric Eaton
3DPC · 12 Jun 2021

Auto-NBA: Efficient and Effective Search Over the Joint Space of Networks, Bitwidths, and Accelerators
Yonggan Fu, Yongan Zhang, Yang Zhang, David D. Cox, Yingyan Lin
MQ · 11 Jun 2021

DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification
Yongming Rao, Wenliang Zhao, Benlin Liu, Jiwen Lu, Jie Zhou, Cho-Jui Hsieh
ViT · 03 Jun 2021

RED: Looking for Redundancies for Data-Free Structured Compression of Deep Neural Networks
Edouard Yvinec, Arnaud Dapogny, Matthieu Cord, Kévin Bailly
CVBM · 31 May 2021

NAAS: Neural Accelerator Architecture Search
Yujun Lin, Mengtian Yang, Song Han
27 May 2021

Low-Precision Hardware Architectures Meet Recommendation Model Inference at Scale
Zhaoxia Deng, Jongsoo Park, P. T. P. Tang, Haixin Liu, ..., S. Nadathur, Changkyu Kim, Maxim Naumov, S. Naghshineh, M. Smelyanskiy
26 May 2021

DTNN: Energy-efficient Inference with Dendrite Tree Inspired Neural Networks for Edge Vision Applications
Tao Luo, Wai Teng Tang, Matthew Kay Fei Lee, Chuping Qu, Weng-Fai Wong, Rick Siow Mong Goh
25 May 2021

BatchQuant: Quantized-for-all Architecture Search with Robust Quantizer
Haoping Bai, Mengsi Cao, Ping-Chia Huang, Jiulong Shan
MQ · 19 May 2021

Pareto-Optimal Quantized ResNet Is Mostly 4-bit
AmirAli Abdolrashidi, Lisa Wang, Shivani Agrawal, J. Malmaud, Oleg Rybakov, Chas Leichner, Lukasz Lew
MQ · 07 May 2021

On the Adversarial Robustness of Quantized Neural Networks
Micah Gorsline, James T. Smith, Cory E. Merkel
AAML · 01 May 2021

HAO: Hardware-aware neural Architecture Optimization for Efficient Inference
Zhen Dong, Yizhao Gao, Qijing Huang, J. Wawrzynek, Hayden Kwok-Hay So, Kurt Keutzer
26 Apr 2021

Spatio-Temporal Pruning and Quantization for Low-latency Spiking Neural Networks
Sayeed Shafayet Chowdhury, Isha Garg, Kaushik Roy
26 Apr 2021

InstantNet: Automated Generation and Deployment of Instantaneously Switchable-Precision Networks
Yonggan Fu, Zhongzhi Yu, Yongan Zhang, Yifan Jiang, Chaojian Li, Yongyuan Liang, Mingchao Jiang, Zhangyang Wang, Yingyan Lin
22 Apr 2021

Differentiable Model Compression via Pseudo Quantization Noise
Alexandre Défossez, Yossi Adi, Gabriel Synnaeve
DiffM, MQ · 20 Apr 2021

Coarse-to-Fine Searching for Efficient Generative Adversarial Networks
Jiahao Wang, Han Shu, Weihao Xia, Yujiu Yang, Yunhe Wang
GAN · 19 Apr 2021

TENT: Efficient Quantization of Neural Networks on the tiny Edge with Tapered FixEd PoiNT
H. F. Langroudi, Vedant Karia, Tej Pandit, Dhireesha Kudithipudi
MQ · 06 Apr 2021

LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference
Ben Graham, Alaaeldin El-Nouby, Hugo Touvron, Pierre Stock, Armand Joulin, Hervé Jégou, Matthijs Douze
ViT · 02 Apr 2021

Network Quantization with Element-wise Gradient Scaling
Junghyup Lee, Dohyung Kim, Bumsub Ham
MQ · 02 Apr 2021

Training Multi-bit Quantized and Binarized Networks with A Learnable Symmetric Quantizer
Phuoc Pham, J. Abraham, Jaeyong Chung
MQ · 01 Apr 2021

Bit-Mixer: Mixed-precision networks with runtime bit-width selection
Adrian Bulat, Georgios Tzimiropoulos
MQ · 31 Mar 2021

RCT: Resource Constrained Training for Edge AI
Tian Huang, Tao Luo, Ming Yan, Joey Tianyi Zhou, Rick Siow Mong Goh
26 Mar 2021

n-hot: Efficient bit-level sparsity for powers-of-two neural network quantization
Yuiko Sakuma, Hiroshi Sumihiro, Jun Nishikawa, Toshiki Nakamura, Ryoji Ikegaya
MQ · 22 Mar 2021

Data-free mixed-precision quantization using novel sensitivity metric
Donghyun Lee, M. Cho, Seungwon Lee, Joonho Song, Changkyu Choi
MQ · 18 Mar 2021

Environmental Sound Classification on the Edge: A Pipeline for Deep Acoustic Networks on Extremely Resource-Constrained Devices
Md Mohaimenuzzaman, Christoph Bergmeir, I. West, B. Meyer
05 Mar 2021

Anycost GANs for Interactive Image Synthesis and Editing
Ji Lin, Richard Y. Zhang, F. Ganz, Song Han, Jun-Yan Zhu
04 Mar 2021

Mind Mappings: Enabling Efficient Algorithm-Accelerator Mapping Space Search
Kartik Hegde, Po-An Tsai, Sitao Huang, Vikas Chandra, A. Parashar, Christopher W. Fletcher
02 Mar 2021

Improved Techniques for Quantizing Deep Networks with Adaptive Bit-Widths
Ximeng Sun, Rameswar Panda, Chun-Fu Chen, Naigang Wang, Bowen Pan, Kailash Gopalakrishnan, A. Oliva, Rogerio Feris, Kate Saenko
MQ · 02 Mar 2021

FjORD: Fair and Accurate Federated Learning under heterogeneous targets with Ordered Dropout
Samuel Horváth, Stefanos Laskaridis, Mario Almeida, Ilias Leondiadis, Stylianos I. Venieris, Nicholas D. Lane
26 Feb 2021

Ps and Qs: Quantization-aware pruning for efficient low latency neural network inference
B. Hawks, Javier Mauricio Duarte, Nicholas J. Fraser, Alessandro Pappalardo, N. Tran, Yaman Umuroglu
MQ · 22 Feb 2021

BSQ: Exploring Bit-Level Sparsity for Mixed-Precision Neural Network Quantization
Huanrui Yang, Lin Duan, Yiran Chen, Hai Helen Li
MQ · 20 Feb 2021

GradFreeBits: Gradient Free Bit Allocation for Dynamic Low Precision Neural Networks
Ben Bodner, G. B. Shalom, Eran Treister
MQ · 18 Feb 2021