HAQ: Hardware-Aware Automated Quantization with Mixed Precision

Computer Vision and Pattern Recognition (CVPR), 2019
21 November 2018
Kuan Wang
Zhijian Liu
Yujun Lin
Ji Lin
Song Han
    MQ
arXiv (abs) · PDF · HTML

Papers citing "HAQ: Hardware-Aware Automated Quantization with Mixed Precision"

50 / 462 papers shown
Adaptive Mesh-Quantization for Neural PDE Solvers
Winfried van den Dool
Maksim Zhdanov
Yuki M. Asano
Max Welling
MQ, AI4CE
269
0
0
23 Nov 2025
Dynamic Expert Quantization for Scalable Mixture-of-Experts Inference
Kexin Chu
Dawei Xiang
Zixu Shen
Yiwei Yang
Zecheng Liu
Wei Zhang
MoE, MQ
267
0
0
19 Nov 2025
MCAQ-YOLO: Morphological Complexity-Aware Quantization for Efficient Object Detection with Curriculum Learning
Yoonjae Seo
Ermal Elbasani
Jaehong Lee
MQ
100
0
0
17 Nov 2025
DynaQuant: Dynamic Mixed-Precision Quantization for Learned Image Compression
Youneng Bao
Yulong Cheng
Y. Liu
Yichen Yang
Peng Qin
Mu Li
Yongsheng Liang
MQ
54
0
0
11 Nov 2025
Pruning and Quantization Impact on Graph Neural Networks
Khatoon Khedri
Reza Rawassizadeh
Qifu Wen
M. Hosseinzadeh
GNN
142
0
0
24 Oct 2025
Quantization Range Estimation for Convolutional Neural Networks
Bingtao Yang
Yujia Wang
Mengzhi Jiao
Hongwei Huo
MQ
111
0
0
05 Oct 2025
CHORD: Customizing Hybrid-precision On-device Model for Sequential Recommendation with Device-cloud Collaboration
Tianqi Liu
Kairui Fu
Shengyu Zhang
W. Fan
Zhaocheng Du
Jieming Zhu
Fan Wu
Fei Wu
MQ
64
0
0
03 Oct 2025
FlexiQ: Adaptive Mixed-Precision Quantization for Latency/Accuracy Trade-Offs in Deep Neural Networks
Jaemin Kim
Hongjun Um
Sungkyun Kim
Yongjun Park
Jiwon Seo
MQ
105
0
0
03 Oct 2025
Assessing the Potential for Catastrophic Failure in Dynamic Post-Training Quantization
Logan Frank
Paul Ardis
AAML
60
0
0
02 Oct 2025
Cat: Post-Training Quantization Error Reduction via Cluster-based Affine Transformation
Ali Zoljodi
Radu Timofte
Masoud Daneshtalab
MQ
103
0
0
30 Sep 2025
Embodied AI: From LLMs to World Models
Tongtong Feng
Xin Wang
Yu Jiang
Wenwu Zhu
LM&Ro
273
7
0
24 Sep 2025
QuantV2X: A Fully Quantized Multi-Agent System for Cooperative Perception
Seth Z. Zhao
Huizhi Zhang
Zhaowei Li
Juntong Peng
Anthony Chui
...
Fujia Wang
Ran Tian
Chenfeng Xu
Bolei Zhou
Jiaqi Ma
72
1
0
03 Sep 2025
DPQuant: Efficient and Differentially-Private Model Training via Dynamic Quantization Scheduling
Yubo Gao
Renbo Tu
Gennady Pekhimenko
Nandita Vijaykumar
MQ
104
0
0
03 Sep 2025
APT-LLM: Exploiting Arbitrary-Precision Tensor Core Computing for LLM Acceleration
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2025
Shaobo Ma
Chao Fang
Haikuo Shao
Zhongfeng Wang
56
0
0
26 Aug 2025
Exploiting Information Redundancy in Attention Maps for Extreme Quantization of Vision Transformers
Lucas Maisonnave
Karim Haroun
Tom Pegeot
MQ
75
0
0
22 Aug 2025
Quantized Neural Networks for Microcontrollers: A Comprehensive Review of Methods, Platforms, and Applications
Hamza A. Abushahla
Dara Varam
Ariel J. N. Panopio
Mohamed I. AlHajri
MQ
271
0
0
20 Aug 2025
eMamba: Efficient Acceleration Framework for Mamba Models in Edge Computing
ACM Transactions on Embedded Computing Systems (ACM TECS), 2025
Jiyong Kim
J. Lee
Jiahao Lin
Alish Kanani
Miao Sun
Ümit Y. Ogras
Jaehyun Park
Mamba
142
1
0
14 Aug 2025
MiCo: End-to-End Mixed Precision Neural Network Co-Exploration Framework for Edge AI
Zijun Jiang
Yangdi Lyu
MQ
67
0
0
13 Aug 2025
DQT: Dynamic Quantization Training via Dequantization-Free Nested Integer Arithmetic
Hazem Hesham Yousef Shalby
Fabrizio Pittorino
Francesca Palermo
Diana Trojaniello
Manuel Roveri
MQ
80
0
0
07 Aug 2025
InfoQ: Mixed-Precision Quantization via Global Information Flow
Mehmet Emre Akbulut
Hazem Hesham Yousef Shalby
Fabrizio Pittorino
Manuel Roveri
MQ
40
0
0
06 Aug 2025
Where and How to Enhance: Discovering Bit-Width Contribution for Mixed Precision Quantization
International Joint Conference on Artificial Intelligence (IJCAI), 2025
Haidong Kang
Lianbo Ma
Guo-Ding Yu
Shangce Gao
MQ
152
1
0
05 Aug 2025
Beyond Manually Designed Pruning Policies with Second-Level Performance Prediction: A Pruning Framework for LLMs
Zuxin Ma
Yunhe Cui
Yongbin Qin
72
0
0
04 Aug 2025
MSQ: Memory-Efficient Bit Sparsification Quantization
Seokho Han
Seoyeon Yoon
Jinhee Kim
Dongwei Wang
Kang Eun Jeon
Huanrui Yang
Jong Hwan Ko
MQ
108
0
0
30 Jul 2025
Text Embedding Knows How to Quantize Text-Guided Diffusion Models
H. Lee
Myungjun Son
Dongjea Kang
Seung-Won Jung
DiffM, MQ
163
1
0
14 Jul 2025
DANCE: Resource-Efficient Neural Architecture Search with Data-Aware and Continuous Adaptation
Xinjian Zhao
Tianshuo Wei
Sheng Zhang
Ruocheng Guo
Wanyu Wang
Shanshan Ye
Lixin Zou
Xuetao Wei
Xiangyu Zhao
TTA
209
1
0
07 Jul 2025
Towards Efficient and Accurate Spiking Neural Networks via Adaptive Bit Allocation
Xingting Yao
Qinghao Hu
Fei Zhou
Tielong Liu
Gang Li
Peisong Wang
Jian Cheng
MQ
146
0
0
30 Jun 2025
Compression Aware Certified Training
Changming Xu
Gagandeep Singh
122
0
0
13 Jun 2025
Unifying Block-wise PTQ and Distillation-based QAT for Progressive Quantization toward 2-bit Instruction-Tuned LLMs
Jung Hyun Lee
Seungjae Shin
Vinnam Kim
Jaeseong You
An Chen
MQ
133
2
0
10 Jun 2025
Towards a Small Language Model Lifecycle Framework
Parsa Miraghaei
Sergio Moreschini
Antti Kolehmainen
David Hästbacka
119
0
0
09 Jun 2025
MxMoE: Mixed-precision Quantization for MoE with Accuracy and Performance Co-Design
Haojie Duanmu
Xiuhong Li
Zhihang Yuan
Size Zheng
Jiangfei Duan
Xingcheng Zhang
Dahua Lin
MQ, MoE
850
7
0
09 May 2025
Learning from Loss Landscape: Generalizable Mixed-Precision Quantization via Adaptive Sharpness-Aware Gradient Aligning
Lianbo Ma
Jianlun Ma
Yuee Zhou
Guoyang Xie
Qiang He
Zhichao Lu
MQ
262
1
0
08 May 2025
Mix-QSAM: Mixed-Precision Quantization of the Segment Anything Model
Navin Ranjan
Andreas E. Savakis
MQ, VLM
302
0
0
08 May 2025
Radio: Rate-Distortion Optimization for Large Language Model Compression
Sean I. Young
MQ
256
1
0
05 May 2025
Optimizing LLMs for Resource-Constrained Environments: A Survey of Model Compression Techniques
Annual International Computer Software and Applications Conference (COMPSAC), 2025
Sanjay Surendranath Girija
Shashank Kapoor
Lakshit Arora
Dipen Pradhan
Aman Raj
Ankit Shetgaonkar
282
7
0
05 May 2025
BackSlash: Rate Constrained Optimized Training of Large Language Models
Jun Wu
Jiangtao Wen
Yuxing Han
354
1
0
23 Apr 2025
FGMP: Fine-Grained Mixed-Precision Weight and Activation Quantization for Hardware-Accelerated LLM Inference
Coleman Hooper
Charbel Sakr
Ben Keller
Rangharajan Venkatesan
Kurt Keutzer
Siyang Song
Brucek Khailany
MQ
237
1
0
19 Apr 2025
Collaborative Learning of On-Device Small Model and Cloud-Based Large Model: Advances and Future Directions
Chaoyue Niu
Yucheng Ding
Junhui Lu
Zhengxiang Huang
Hang Zeng
Yutong Dai
Xuezhen Tu
Chengfei Lv
Fan Wu
Guihai Chen
295
2
0
17 Apr 2025
Tin-Tin: Towards Tiny Learning on Tiny Devices with Integer-based Neural Network Training
Yi Hu
Jinhang Zuo
Eddie Zhang
Bob Iannucci
Carlee Joe-Wong
223
1
0
13 Apr 2025
Generative Artificial Intelligence for Internet of Things Computing: A Systematic Survey
Fabrizio Mangione
Claudio Savaglio
Giancarlo Fortino
183
5
0
10 Apr 2025
The Neural Pruning Law Hypothesis
Eugen Barbulescu
Antonio Alexoaie
Lucian Busoniu
275
0
0
06 Apr 2025
Model Hemorrhage and the Robustness Limits of Large Language Models
Ziyang Ma
Hui Yuan
Guang Dai
Gui-Song Xia
Bo Du
Liangpei Zhang
Dacheng Tao
261
1
0
31 Mar 2025
Quamba2: A Robust and Scalable Post-training Quantization Framework for Selective State Space Models
Hung-Yueh Chiang
Chi-chih Chang
N. Frumkin
Kai-Chiang Wu
Mohamed S. Abdelfattah
Diana Marculescu
MQ
972
2
0
28 Mar 2025
MoQa: Rethinking MoE Quantization with Multi-stage Data-model Distribution Awareness
Zihao Zheng
Xiuping Cui
Size Zheng
Maoliang Li
Jiayu Chen
Yun Liang
Xiang Chen
MQ, MoE
232
1
0
27 Mar 2025
Mixed precision accumulation for neural network inference guided by componentwise forward error analysis
El-Mehdi El Arar
Silviu-Ioan Filip
Theo Mary
Elisa Riccietti
169
0
0
19 Mar 2025
ViM-VQ: Efficient Post-Training Vector Quantization for Visual Mamba
Juncan Deng
Shuaiting Li
Zeyu Wang
Kedong Xu
Hong Gu
Kejie Huang
MQ
294
0
0
12 Mar 2025
Empowering Edge Intelligence: A Comprehensive Survey on On-Device AI Models
ACM Computing Surveys (ACM Comput. Surv.), 2025
Xubin Wang
Zhiqing Tang
Jianxiong Guo
Tianhui Meng
Chenhao Wang
Tian-sheng Wang
Weijia Jia
334
49
0
08 Mar 2025
MergeQuant: Accurate 4-bit Static Quantization of Large Language Models by Channel-wise Calibration
Jinguang Wang
Jiangming Wang
Haifeng Sun
Tingting Yang
Zirui Zhuang
Wanyi Ning
Yuexi Yin
Q. Qi
Jianxin Liao
MQ, MoMe
163
3
0
07 Mar 2025
KVCrush: Key value cache size-reduction using similarity in head-behaviour
Gopi Krishna Jha
Sameh Gobriel
Liubov Talamanova
Alexander Kozlov
Nilesh Jain
MQ
166
0
0
24 Feb 2025
Optimizing DNN Inference on Multi-Accelerator SoCs at Training-time
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2024
Matteo Risso
Luca Bompani
Daniele Jahier Pagliari
278
2
0
24 Feb 2025
A General Error-Theoretical Analysis Framework for Constructing Compression Strategies
Yunquan Zhang
Daning Cheng
Meiqi Tu
Fangmin Liu
Jiake Tian
170
2
0
19 Feb 2025