HAQ: Hardware-Aware Automated Quantization with Mixed Precision
Kuan Wang, Zhijian Liu, Yujun Lin, Ji Lin, Song Han
MQ · 21 November 2018 · arXiv:1811.08886

Papers citing "HAQ: Hardware-Aware Automated Quantization with Mixed Precision"

50 of 435 citing papers shown

EQ-Net: Elastic Quantization Neural Networks
Ke Xu, Lei Han, Ye Tian, Shangshang Yang, Xingyi Zhang
MQ · 15 Aug 2023

Sensitivity-Aware Mixed-Precision Quantization and Width Optimization of Deep Neural Networks Through Cluster-Based Tree-Structured Parzen Estimation
Seyedarmin Azizi, M. Nazemi, A. Fayyazi, Massoud Pedram
MQ · 12 Aug 2023

SAfER: Layer-Level Sensitivity Assessment for Efficient and Robust Neural Network Inference
Edouard Yvinec, Arnaud Dapogny, Kévin Bailly, Xavier Fischer
AAML · 09 Aug 2023

FLIQS: One-Shot Mixed-Precision Floating-Point and Integer Quantization Search
Jordan Dotzel, Gang Wu, Andrew Li, M. Umar, Yun Ni, ..., Liqun Cheng, Martin G. Dixon, N. Jouppi, Quoc V. Le, Sheng R. Li
MQ · 07 Aug 2023

Dynamic Token-Pass Transformers for Semantic Segmentation
Yuang Liu, Qiang Zhou, Jing Wang, Fan Wang, J. Wang, Wei Zhang
ViT · 03 Aug 2023

To Adapt or Not to Adapt? Real-Time Adaptation for Semantic Segmentation
Marc Botet Colomer, Pier Luigi Dovesi, Theodoros Panagiotakopoulos, J. Carvalho, Linus Harenstam-Nielsen, Hossein Azizpour, Hedvig Kjellström, Daniel Cremers, Matteo Poggi
TTA · 27 Jul 2023

Overcoming Distribution Mismatch in Quantizing Image Super-Resolution Networks
Chee Hong, Kyoung Mu Lee
SupR, MQ · 25 Jul 2023

EMQ: Evolving Training-free Proxies for Automated Mixed Precision Quantization
Peijie Dong, Lujun Li, Zimian Wei, Xin-Yi Niu, Zhiliang Tian, H. Pan
MQ · 20 Jul 2023

Approximate Computing Survey, Part II: Application-Specific & Architectural Approximation Techniques and Applications
Vasileios Leon, Muhammad Abdullah Hanif, Giorgos Armeniakos, Xun Jiao, Muhammad Shafique, K. Pekmestzi, Dimitrios Soudris
20 Jul 2023

PLiNIO: A User-Friendly Library of Gradient-based Methods for Complexity-aware DNN Optimization
Daniele Jahier Pagliari, Matteo Risso, Beatrice Alessandra Motetti, Alessio Burrello
18 Jul 2023

A Survey of Techniques for Optimizing Transformer Inference
Krishna Teja Chitty-Venkata, Sparsh Mittal, M. Emani, V. Vishwanath, Arun Somani
16 Jul 2023

QBitOpt: Fast and Accurate Bitwidth Reallocation during Training
Jorn W. T. Peters, Marios Fournarakis, Markus Nagel, M. V. Baalen, Tijmen Blankevoort
MQ · 10 Jul 2023

DNA-TEQ: An Adaptive Exponential Quantization of Tensors for DNN Inference
Bahareh Khabbazan, Marc Riera, Antonio González
MQ · 28 Jun 2023

Precision-aware Latency and Energy Balancing on Multi-Accelerator Platforms for DNN Inference
Matteo Risso, Alessio Burrello, G. M. Sarda, Luca Benini, Enrico Macii, M. Poncino, Marian Verhelst, Daniele Jahier Pagliari
08 Jun 2023

Augmenting Hessians with Inter-Layer Dependencies for Mixed-Precision Post-Training Quantization
Clemens J. S. Schaefer, Navid Lambert-Shirzad, Xiaofan Zhang, Chia-Wei Chou, T. Jablin, Jian Li, Elfie Guo, Caitlin Stanton, S. Joshi, Yu Emma Wang
MQ · 08 Jun 2023

AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Ji Lin, Jiaming Tang, Haotian Tang, Shang Yang, Wei-Ming Chen, Wei-Chen Wang, Guangxuan Xiao, Xingyu Dang, Chuang Gan, Song Han
EDL, MQ · 01 Jun 2023

DynaShare: Task and Instance Conditioned Parameter Sharing for Multi-Task Learning
E. Rahimian, Golara Javadi, Frederick Tung, Gabriel L. Oliveira
MoE · 26 May 2023

MixFormerV2: Efficient Fully Transformer Tracking
Yutao Cui, Tian-Shu Song, Gangshan Wu, Liming Wang
25 May 2023

PQA: Exploring the Potential of Product Quantization in DNN Hardware Acceleration
Ahmed F. AbouElhamayed, Angela Cui, Javier Fernandez-Marques, Nicholas D. Lane, Mohamed S. Abdelfattah
MQ · 25 May 2023

PDP: Parameter-free Differentiable Pruning is All You Need
Minsik Cho, Saurabh N. Adya, Devang Naik
VLM · 18 May 2023

Patch-wise Mixed-Precision Quantization of Vision Transformer
Junrui Xiao, Zhikai Li, Lianwei Yang, Qingyi Gu
MQ · 11 May 2023

LayerNAS: Neural Architecture Search in Polynomial Complexity
Yicheng Fan, Dana Alon, Jingyue Shen, Daiyi Peng, Keshav Kumar, Yun Long, Xin Wang, Fotis Iliopoulos, Da-Cheng Juan, Erik Vee
23 Apr 2023

QuMoS: A Framework for Preserving Security of Quantum Machine Learning Model
Zhepeng Wang, Jinyang Li, Zhirui Hu, Blake Gage, Elizabeth Iwasawa, Weiwen Jiang
23 Apr 2023

Evil from Within: Machine Learning Backdoors through Hardware Trojans
Alexander Warnecke, Julian Speith, Janka Möller, Konrad Rieck, C. Paar
AAML · 17 Apr 2023

Canvas: End-to-End Kernel Architecture Search in Neural Networks
Chenggang Zhao, Genghan Zhang, Mingyu Gao
16 Apr 2023

End-to-end codesign of Hessian-aware quantized neural networks for FPGAs and ASICs
Javier Campos, Zhen Dong, Javier Mauricio Duarte, A. Gholami, Michael W. Mahoney, Jovan Mitrevski, Nhan Tran
MQ · 13 Apr 2023

Learning Accurate Performance Predictors for Ultrafast Automated Model Compression
Ziwei Wang, Jiwen Lu, Han Xiao, Shengyu Liu, Jie Zhou
OffRL · 13 Apr 2023

AutoQNN: An End-to-End Framework for Automatically Quantizing Neural Networks
Cheng Gong, Ye Lu, Surong Dai, Deng Qian, Chenkun Du, Tao Li
MQ · 07 Apr 2023

On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen, Yan Sun, Zhiyuan Yu, Liang Ding, Xinmei Tian, Dacheng Tao
VLM · 07 Apr 2023

SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer
Xuanyao Chen, Zhijian Liu, Haotian Tang, Li Yi, Hang Zhao, Song Han
ViT · 30 Mar 2023

Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective
Yuexiao Ma, Huixia Li, Xiawu Zheng, Xuefeng Xiao, Rui Wang, Shilei Wen, Xin Pan, Fei Chao, Rongrong Ji
MQ · 21 Mar 2023

Gated Compression Layers for Efficient Always-On Models
Haiguang Li, T. Thormundsson, I. Poupyrev, N. Gillian
15 Mar 2023

SpaceEvo: Hardware-Friendly Search Space Design for Efficient INT8 Inference
Li Lyna Zhang, Xudong Wang, Jiahang Xu, Quanlu Zhang, Yujing Wang, Yuqing Yang, Ningxin Zheng, Ting Cao, Mao Yang
MQ · 15 Mar 2023

R2 Loss: Range Restriction Loss for Model Compression and Quantization
Arnav Kundu, Chungkuk Yoo, Srijan Mishra, Minsik Cho, Saurabh N. Adya
MQ · 14 Mar 2023

MetaMixer: A Regularization Strategy for Online Knowledge Distillation
Maorong Wang, L. Xiao, T. Yamasaki
KELM, MoE · 14 Mar 2023

AdaptiveNet: Post-deployment Neural Architecture Adaptation for Diverse Edge Environments
Hao Wen, Yuanchun Li, Zunshuai Zhang, Shiqi Jiang, Xiaozhou Ye, Ouyang Ye, Yaqin Zhang, Yunxin Liu
13 Mar 2023

Bag of Tricks with Quantized Convolutional Neural Networks for image classification
Jie Hu, Mengze Zeng, Enhua Wu
MQ · 13 Mar 2023

TinyAD: Memory-efficient anomaly detection for time series data in Industrial IoT
Yuting Sun, Tong Chen, Quoc Viet Hung Nguyen, Hongzhi Yin
07 Mar 2023

Rotation Invariant Quantization for Model Compression
Dor-Joseph Kampeas, Yury Nahshan, Hanoch Kremer, Gil Lederman, Shira Zaloshinski, Zheng Li, E. Haleva
MQ · 03 Mar 2023

Structured Pruning for Deep Convolutional Neural Networks: A survey
Yang He, Lingao Xiao
3DPC · 01 Mar 2023

DyBit: Dynamic Bit-Precision Numbers for Efficient Quantized Neural Network Inference
Jiajun Zhou, Jiajun Wu, Yizhao Gao, Yuhao Ding, Chaofan Tao, Bo-wen Li, Fengbin Tu, Kwang-Ting Cheng, Hayden Kwok-Hay So, Ngai Wong
MQ · 24 Feb 2023

Towards Optimal Compression: Joint Pruning and Quantization
Ben Zandonati, Glenn Bucagu, Adrian Alan Pol, M. Pierini, Olya Sirkin, Tal Kopetz
MQ · 15 Feb 2023

SEAM: Searching Transferable Mixed-Precision Quantization Policy through Large Margin Regularization
Chen Tang, Kai Ouyang, Zenghao Chai, Yunpeng Bai, Yuan Meng, Zhi Wang, Wenwu Zhu
MQ · 14 Feb 2023

A Practical Mixed Precision Algorithm for Post-Training Quantization
N. Pandey, Markus Nagel, M. V. Baalen, Yin-Ruey Huang, Chirag I. Patel, Tijmen Blankevoort
MQ · 10 Feb 2023

Data Quality-aware Mixed-precision Quantization via Hybrid Reinforcement Learning
Yingchun Wang, Jingcai Guo, Song Guo, Weizhan Zhang
MQ · 09 Feb 2023

DynaMIX: Resource Optimization for DNN-Based Real-Time Applications on a Multi-Tasking System
Minkyoung Cho, K. Shin
03 Feb 2023

Mixed Precision Post Training Quantization of Neural Networks with Sensitivity Guided Search
Clemens J. S. Schaefer, Elfie Guo, Caitlin Stanton, Xiaofan Zhang, T. Jablin, Navid Lambert-Shirzad, Jian Li, Chia-Wei Chou, Siddharth Joshi, Yu Wang
MQ · 02 Feb 2023

A²Q: Aggregation-Aware Quantization for Graph Neural Networks
Zeyu Zhu, Fanrong Li, Zitao Mo, Qinghao Hu, Gang Li, Zejian Liu, Xiaoyao Liang, Jian Cheng
GNN, MQ · 01 Feb 2023

Efficient and Effective Methods for Mixed Precision Neural Network Quantization for Faster, Energy-efficient Inference
Deepika Bablani, J. McKinstry, S. K. Esser, R. Appuswamy, D. Modha
MQ · 30 Jan 2023

Does Federated Learning Really Need Backpropagation?
H. Feng, Tianyu Pang, Chao Du, Wei-Neng Chen, Shuicheng Yan, Min-Bin Lin
FedML · 28 Jan 2023