ResearchTrend.AI

arXiv:1811.08886 · Cited By
HAQ: Hardware-Aware Automated Quantization with Mixed Precision

Computer Vision and Pattern Recognition (CVPR), 2019
21 November 2018
Kuan-Chieh Wang
Zhijian Liu
Chengyue Wu
Ji Lin
Song Han
    MQ
ArXiv (abs) · PDF · HTML

Papers citing "HAQ: Hardware-Aware Automated Quantization with Mixed Precision"

50 / 464 papers shown
A General Error-Theoretical Analysis Framework for Constructing Compression Strategies
Yunquan Zhang
Daning Cheng
Meiqi Tu
Fangmin Liu
Jiake Tian
186
2
0
24 Dec 2025
SignRoundV2: Closing the Performance Gap in Extremely Low-Bit Post-Training Quantization for LLMs
Wenhua Cheng
Weiwei Zhang
Heng Guo
Haihao Shen
MQ
70
0
0
04 Dec 2025
TWEO: Transformers Without Extreme Outliers Enables FP8 Training And Quantization For Dummies
Guang Liang
Jie Shao
Ningyuan Tang
Xinyao Liu
Jianxin Wu
MQ
165
0
0
28 Nov 2025
Adaptive Mesh-Quantization for Neural PDE Solvers
Winfried van den Dool
Maksim Zhdanov
Yuki M. Asano
Max Welling
MQ, AI4CE
367
0
0
23 Nov 2025
Dynamic Expert Quantization for Scalable Mixture-of-Experts Inference
Kexin Chu
Dawei Xiang
Zixu Shen
Yiwei Yang
Zecheng Liu
Wei Zhang
MoE, MQ
407
1
0
19 Nov 2025
MCAQ-YOLO: Morphological Complexity-Aware Quantization for Efficient Object Detection with Curriculum Learning
Yoonjae Seo
Ermal Elbasani
Jaehong Lee
MQ
240
0
0
17 Nov 2025
DynaQuant: Dynamic Mixed-Precision Quantization for Learned Image Compression
Youneng Bao
Yulong Cheng
Y. Liu
Yichen Yang
Peng Qin
Mu Li
Yongsheng Liang
MQ
90
0
0
11 Nov 2025
Pruning and Quantization Impact on Graph Neural Networks
Khatoon Khedri
Reza Rawassizadeh
Qifu Wen
M. Hosseinzadeh
GNN
190
0
0
24 Oct 2025
Quantization Range Estimation for Convolutional Neural Networks
Bingtao Yang
Yujia Wang
Mengzhi Jiao
Hongwei Huo
MQ
131
0
0
05 Oct 2025
CHORD: Customizing Hybrid-precision On-device Model for Sequential Recommendation with Device-cloud Collaboration
Tianqi Liu
Kairui Fu
Shengyu Zhang
W. Fan
Zhaocheng Du
Jieming Zhu
Fan Wu
Fei Wu
MQ
108
0
0
03 Oct 2025
FlexiQ: Adaptive Mixed-Precision Quantization for Latency/Accuracy Trade-Offs in Deep Neural Networks
Jaemin Kim
Hongjun Um
Sungkyun Kim
Yongjun Park
Jiwon Seo
MQ
121
0
0
03 Oct 2025
Assessing the Potential for Catastrophic Failure in Dynamic Post-Training Quantization
Logan Frank
Paul Ardis
AAML
100
0
0
02 Oct 2025
Cat: Post-Training Quantization Error Reduction via Cluster-based Affine Transformation
Ali Zoljodi
Radu Timofte
Masoud Daneshtalab
MQ
143
0
0
30 Sep 2025
Embodied AI: From LLMs to World Models
Tongtong Feng
Xin Wang
Yu Jiang
Wenwu Zhu
LM&Ro
325
9
0
24 Sep 2025
DPQuant: Efficient and Differentially-Private Model Training via Dynamic Quantization Scheduling
Yubo Gao
Renbo Tu
Gennady Pekhimenko
Nandita Vijaykumar
MQ
144
0
0
03 Sep 2025
QuantV2X: A Fully Quantized Multi-Agent System for Cooperative Perception
Seth Z. Zhao
Huizhi Zhang
Zhaowei Li
Juntong Peng
Anthony Chui
...
Fujia Wang
Ran Tian
Chenfeng Xu
Bolei Zhou
Jiaqi Ma
116
1
0
03 Sep 2025
APT-LLM: Exploiting Arbitrary-Precision Tensor Core Computing for LLM Acceleration
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2025
Shaobo Ma
Chao Fang
Haikuo Shao
Zhongfeng Wang
108
0
0
26 Aug 2025
Exploiting Information Redundancy in Attention Maps for Extreme Quantization of Vision Transformers
Lucas Maisonnave
Karim Haroun
Tom Pegeot
MQ
95
0
0
22 Aug 2025
Neural Network Quantization for Microcontrollers: A Comprehensive Survey of Methods, Platforms, and Applications
Hamza A. Abushahla
Dara Varam
Ariel J. N. Panopio
Mohamed I. AlHajri
MQ
352
1
0
20 Aug 2025
eMamba: Efficient Acceleration Framework for Mamba Models in Edge Computing
ACM Transactions on Embedded Computing Systems (ACM TECS), 2025
Jiyong Kim
J. Lee
Jiahao Lin
Alish Kanani
Miao Sun
Ümit Y. Ogras
Jaehyun Park
Mamba
166
1
0
14 Aug 2025
MiCo: End-to-End Mixed Precision Neural Network Co-Exploration Framework for Edge AI
Zijun Jiang
Yangdi Lyu
MQ
87
0
0
13 Aug 2025
DQT: Dynamic Quantization Training via Dequantization-Free Nested Integer Arithmetic
Hazem Hesham Yousef Shalby
Fabrizio Pittorino
Francesca Palermo
Diana Trojaniello
Manuel Roveri
MQ
88
0
0
07 Aug 2025
InfoQ: Mixed-Precision Quantization via Global Information Flow
Mehmet Emre Akbulut
Hazem Hesham Yousef Shalby
Fabrizio Pittorino
Manuel Roveri
MQ
64
0
0
06 Aug 2025
Where and How to Enhance: Discovering Bit-Width Contribution for Mixed Precision Quantization
International Joint Conference on Artificial Intelligence (IJCAI), 2025
Haidong Kang
Lianbo Ma
Guo-Ding Yu
Shangce Gao
MQ
228
1
0
05 Aug 2025
Beyond Manually Designed Pruning Policies with Second-Level Performance Prediction: A Pruning Framework for LLMs
Zuxin Ma
Yunhe Cui
Yongbin Qin
116
0
0
04 Aug 2025
MSQ: Memory-Efficient Bit Sparsification Quantization
Seokho Han
Seoyeon Yoon
Jinhee Kim
Dongwei Wang
Kang Eun Jeon
Huanrui Yang
Jong Hwan Ko
MQ
132
0
0
30 Jul 2025
Text Embedding Knows How to Quantize Text-Guided Diffusion Models
H. Lee
Myungjun Son
Dongjea Kang
Seung-Won Jung
DiffM, MQ
243
1
0
14 Jul 2025
DANCE: Resource-Efficient Neural Architecture Search with Data-Aware and Continuous Adaptation
Xinjian Zhao
Tianshuo Wei
Sheng Zhang
Ruocheng Guo
Wanyu Wang
Shanshan Ye
Lixin Zou
Xuetao Wei
Xiangyu Zhao
TTA
273
2
0
07 Jul 2025
Towards Efficient and Accurate Spiking Neural Networks via Adaptive Bit Allocation
Neural Networks (NN), 2025
Xingting Yao
Qinghao Hu
Fei Zhou
Tielong Liu
Gang Li
Peisong Wang
Jian Cheng
MQ
174
0
0
30 Jun 2025
Compression Aware Certified Training
Changming Xu
Gagandeep Singh
142
0
0
13 Jun 2025
Unifying Block-wise PTQ and Distillation-based QAT for Progressive Quantization toward 2-bit Instruction-Tuned LLMs
Jung Hyun Lee
Seungjae Shin
Vinnam Kim
Jaeseong You
An Chen
MQ
189
2
0
10 Jun 2025
Towards a Small Language Model Lifecycle Framework
Parsa Miraghaei
Sergio Moreschini
Antti Kolehmainen
David Hästbacka
151
0
0
09 Jun 2025
MxMoE: Mixed-precision Quantization for MoE with Accuracy and Performance Co-Design
Haojie Duanmu
Xiuhong Li
Zhihang Yuan
Size Zheng
Jiangfei Duan
Xingcheng Zhang
Dahua Lin
MQ, MoE
918
8
0
09 May 2025
Mix-QSAM: Mixed-Precision Quantization of the Segment Anything Model
Navin Ranjan
Andreas E. Savakis
MQ, VLM
338
0
0
08 May 2025
Learning from Loss Landscape: Generalizable Mixed-Precision Quantization via Adaptive Sharpness-Aware Gradient Aligning
Lianbo Ma
Jianlun Ma
Yuee Zhou
Guoyang Xie
Qiang He
Zhichao Lu
MQ
294
2
0
08 May 2025
Radio: Rate-Distortion Optimization for Large Language Model Compression
Sean I. Young
MQ
280
1
0
05 May 2025
Optimizing LLMs for Resource-Constrained Environments: A Survey of Model Compression Techniques
Annual International Computer Software and Applications Conference (COMPSAC), 2025
Sanjay Surendranath Girija
Shashank Kapoor
Lakshit Arora
Dipen Pradhan
Aman Raj
Ankit Shetgaonkar
338
8
0
05 May 2025
BackSlash: Rate Constrained Optimized Training of Large Language Models
Jun Wu
Jiangtao Wen
Yuxing Han
394
1
0
23 Apr 2025
FGMP: Fine-Grained Mixed-Precision Weight and Activation Quantization for Hardware-Accelerated LLM Inference
Coleman Hooper
Charbel Sakr
Ben Keller
Rangharajan Venkatesan
Kurt Keutzer
Siyang Song
Brucek Khailany
MQ
265
1
0
19 Apr 2025
Collaborative Learning of On-Device Small Model and Cloud-Based Large Model: Advances and Future Directions
Chaoyue Niu
Yucheng Ding
Junhui Lu
Zhengxiang Huang
Hang Zeng
Yutong Dai
Xuezhen Tu
Chengfei Lv
Fan Wu
Guihai Chen
327
2
0
17 Apr 2025
Tin-Tin: Towards Tiny Learning on Tiny Devices with Integer-based Neural Network Training
Yi Hu
Jinhang Zuo
Eddie Zhang
Bob Iannucci
Carlee Joe-Wong
255
1
0
13 Apr 2025
Generative Artificial Intelligence for Internet of Things Computing: A Systematic Survey
Fabrizio Mangione
Claudio Savaglio
Giancarlo Fortino
243
5
0
10 Apr 2025
The Neural Pruning Law Hypothesis
Eugen Barbulescu
Antonio Alexoaie
Lucian Busoniu
351
0
0
06 Apr 2025
Model Hemorrhage and the Robustness Limits of Large Language Models
Ziyang Ma
Hui Yuan
Guang Dai
Gui-Song Xia
Bo Du
Liangpei Zhang
Dacheng Tao
317
1
0
31 Mar 2025
Quamba2: A Robust and Scalable Post-training Quantization Framework for Selective State Space Models
Hung-Yueh Chiang
Chi-chih Chang
N. Frumkin
Kai-Chiang Wu
Mohamed S. Abdelfattah
Diana Marculescu
MQ
1.1K
2
0
28 Mar 2025
MoQa: Rethinking MoE Quantization with Multi-stage Data-model Distribution Awareness
Zihao Zheng
Xiuping Cui
Size Zheng
Maoliang Li
Jiayu Chen
Yun Liang
Xiang Chen
MQ, MoE
276
1
0
27 Mar 2025
Mixed precision accumulation for neural network inference guided by componentwise forward error analysis
El-Mehdi El Arar
Silviu-Ioan Filip
Theo Mary
Elisa Riccietti
205
0
0
19 Mar 2025
ViM-VQ: Efficient Post-Training Vector Quantization for Visual Mamba
Juncan Deng
Shuaiting Li
Zeyu Wang
Kedong Xu
Hong Gu
Kejie Huang
MQ
370
0
0
12 Mar 2025
Empowering Edge Intelligence: A Comprehensive Survey on On-Device AI Models
ACM Computing Surveys (ACM Comput. Surv.), 2025
Xubin Wang
Zhiqing Tang
Jianxiong Guo
Tianhui Meng
Chenhao Wang
Tian-sheng Wang
Weijia Jia
366
62
0
08 Mar 2025
MergeQuant: Accurate 4-bit Static Quantization of Large Language Models by Channel-wise Calibration
Jinguang Wang
Jiangming Wang
Haifeng Sun
Tingting Yang
Zirui Zhuang
Wanyi Ning
Yuexi Yin
Q. Qi
Jianxin Liao
MQ, MoMe
195
3
0
07 Mar 2025