2001.00281
Cited By
ZeroQ: A Novel Zero Shot Quantization Framework
1 January 2020
Yaohui Cai
Z. Yao
Zhen Dong
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
Papers citing
"ZeroQ: A Novel Zero Shot Quantization Framework"
50 / 69 papers shown
Title
StableQuant: Layer Adaptive Post-Training Quantization for Speech Foundation Models
Yeona Hong
Hyewon Han
Woo-Jin Chung
Hong-Goo Kang
MQ
28
0
0
21 Apr 2025
GranQ: Granular Zero-Shot Quantization with Unified Layer-Channel Awareness
Inpyo Hong
Youngwan Jo
Hyojeong Lee
Sunghyun Ahn
Sanghyun Park
MQ
60
0
0
24 Mar 2025
Semantics Prompting Data-Free Quantization for Low-Bit Vision Transformers
Yunshan Zhong
Yuyao Zhou
Yuxin Zhang
Shen Li
Yong Li
Fei Chao
Zhanpeng Zeng
Rongrong Ji
MQ
94
0
0
31 Dec 2024
On the Impact of White-box Deployment Strategies for Edge AI on Latency and Model Performance
Jaskirat Singh
Bram Adams
Ahmed E. Hassan
VLM
34
0
0
01 Nov 2024
Data Generation for Hardware-Friendly Post-Training Quantization
Lior Dikstein
Ariel Lapid
Arnon Netzer
H. Habi
MQ
110
0
0
29 Oct 2024
Self-calibration for Language Model Quantization and Pruning
Miles Williams
G. Chrysostomou
Nikolaos Aletras
MQ
99
0
0
22 Oct 2024
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers
Enze Xie
Junsong Chen
Junyu Chen
Han Cai
Haotian Tang
...
Zhekai Zhang
Muyang Li
Ligeng Zhu
Y. Lu
Song Han
VLM
31
49
0
14 Oct 2024
Q-VLM: Post-training Quantization for Large Vision-Language Models
Changyuan Wang
Ziwei Wang
Xiuwei Xu
Yansong Tang
Jie Zhou
Jiwen Lu
MQ
32
1
0
10 Oct 2024
QT-DoG: Quantization-aware Training for Domain Generalization
Saqib Javed
Hieu Le
Mathieu Salzmann
OOD
MQ
26
1
0
08 Oct 2024
MimiQ: Low-Bit Data-Free Quantization of Vision Transformers with Encouraging Inter-Head Attention Similarity
Kanghyun Choi
Hyeyoon Lee
Dain Kwon
Sunjong Park
Kyuyeun Kim
Noseong Park
Jinho Lee
MQ
40
1
0
29 Jul 2024
MetaAug: Meta-Data Augmentation for Post-Training Quantization
Cuong Pham
Hoang Anh Dung
Cuong C. Nguyen
Trung Le
Dinh Q. Phung
Gustavo Carneiro
Thanh-Toan Do
MQ
38
0
0
20 Jul 2024
ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
Tianchen Zhao
Tongcheng Fang
Haofeng Huang
Enshu Liu
Widyadewi Soedarmadji
...
Shengen Yan
Huazhong Yang
Xuefei Ning
Yu Wang
MQ
VGen
97
23
0
04 Jun 2024
Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs
Jordan Dotzel
Yuzong Chen
Bahaa Kotb
Sushma Prasad
Gang Wu
Sheng R. Li
Mohamed S. Abdelfattah
Zhiru Zhang
24
7
0
06 May 2024
On the Impact of Black-box Deployment Strategies for Edge AI on Latency and Model Performance
Jaskirat Singh
Emad Fallahzadeh
Bram Adams
Ahmed E. Hassan
MQ
32
3
0
25 Mar 2024
CBQ: Cross-Block Quantization for Large Language Models
Xin Ding
Xiaoyu Liu
Zhijun Tu
Yun-feng Zhang
Wei Li
...
Hanting Chen
Yehui Tang
Zhiwei Xiong
Baoqun Yin
Yunhe Wang
MQ
27
12
0
13 Dec 2023
FLIQS: One-Shot Mixed-Precision Floating-Point and Integer Quantization Search
Jordan Dotzel
Gang Wu
Andrew Li
M. Umar
Yun Ni
...
Liqun Cheng
Martin G. Dixon
N. Jouppi
Quoc V. Le
Sheng R. Li
MQ
25
3
0
07 Aug 2023
MRQ: Support Multiple Quantization Schemes through Model Re-Quantization
Manasa Manohara
Sankalp Dayal
Tarqi Afzal
Rahul Bakshi
Kahkuen Fu
MQ
15
0
0
01 Aug 2023
Data-Free Quantization via Mixed-Precision Compensation without Fine-Tuning
Jun Chen
Shipeng Bai
Tianxin Huang
Mengmeng Wang
Guanzhong Tian
Y. Liu
MQ
34
18
0
02 Jul 2023
Q-YOLO: Efficient Inference for Real-time Object Detection
Mingze Wang
H. Sun
Jun Shi
Xuhui Liu
Baochang Zhang
Xianbin Cao
ObjD
28
8
0
01 Jul 2023
Quantizable Transformers: Removing Outliers by Helping Attention Heads Do Nothing
Yelysei Bondarenko
Markus Nagel
Tijmen Blankevoort
MQ
13
88
0
22 Jun 2023
LLM-QAT: Data-Free Quantization Aware Training for Large Language Models
Zechun Liu
Barlas Oğuz
Changsheng Zhao
Ernie Chang
Pierre Stock
Yashar Mehdad
Yangyang Shi
Raghuraman Krishnamoorthi
Vikas Chandra
MQ
42
187
0
29 May 2023
PTQD: Accurate Post-Training Quantization for Diffusion Models
Yefei He
Luping Liu
Jing Liu
Weijia Wu
Hong Zhou
Bohan Zhuang
DiffM
MQ
22
100
0
18 May 2023
Diversifying the High-level Features for better Adversarial Transferability
Zhiyuan Wang
Zeliang Zhang
Siyuan Liang
Xiaosen Wang
AAML
37
18
0
20 Apr 2023
End-to-end codesign of Hessian-aware quantized neural networks for FPGAs and ASICs
Javier Campos
Zhen Dong
Javier Mauricio Duarte
A. Gholami
Michael W. Mahoney
Jovan Mitrevski
Nhan Tran
MQ
24
3
0
13 Apr 2023
A Unified Compression Framework for Efficient Speech-Driven Talking-Face Generation
Bo-Kyeong Kim
Jaemin Kang
Daeun Seo
Hancheol Park
Shinkook Choi
Hyoung-Kyu Song
Hyungshin Kim
Sungsu Lim
19
0
0
02 Apr 2023
Hard Sample Matters a Lot in Zero-Shot Quantization
Huantong Li
Xiangmiao Wu
Fanbing Lv
Daihai Liao
Thomas H. Li
Yonggang Zhang
Bo Han
Mingkui Tan
MQ
24
19
0
24 Mar 2023
Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective
Yuexiao Ma
Huixia Li
Xiawu Zheng
Xuefeng Xiao
Rui Wang
Shilei Wen
Xin Pan
Fei Chao
Rongrong Ji
MQ
10
11
0
21 Mar 2023
Rotation Invariant Quantization for Model Compression
Dor-Joseph Kampeas
Yury Nahshan
Hanoch Kremer
Gil Lederman
Shira Zaloshinski
Zheng Li
E. Haleva
MQ
10
0
0
03 Mar 2023
Rethinking Data-Free Quantization as a Zero-Sum Game
Biao Qian
Yang Wang
Richang Hong
Meng Wang
MQ
11
17
0
19 Feb 2023
Q-Diffusion: Quantizing Diffusion Models
Xiuyu Li
Yijia Liu
Long Lian
Hua Yang
Zhen Dong
Daniel Kang
Shanghang Zhang
Kurt Keutzer
DiffM
MQ
34
152
0
08 Feb 2023
ACQ: Improving Generative Data-free Quantization Via Attention Correction
Jixing Li
Xiaozhou Guo
Benzhe Dai
Guoliang Gong
Min Jin
Gang Chen
Wenyu Mao
Huaxiang Lu
MQ
30
4
0
18 Jan 2023
Hyperspherical Quantization: Toward Smaller and More Accurate Models
Dan Liu
X. Chen
Chen-li Ma
Xue Liu
MQ
18
3
0
24 Dec 2022
CSMPQ: Class Separability Based Mixed-Precision Quantization
Ming-Yu Wang
Taisong Jin
Miaohui Zhang
Zhengtao Yu
MQ
15
0
0
20 Dec 2022
Towards Hardware-Specific Automatic Compression of Neural Networks
Torben Krieger
Bernhard Klein
Holger Fröning
MQ
19
2
0
15 Dec 2022
PD-Quant: Post-Training Quantization based on Prediction Difference Metric
Jiawei Liu
Lin Niu
Zhihang Yuan
Dawei Yang
Xinggang Wang
Wenyu Liu
MQ
96
67
0
14 Dec 2022
CSQ: Growing Mixed-Precision Quantization Scheme with Bi-level Continuous Sparsification
Lirui Xiao
Huanrui Yang
Zhen Dong
Kurt Keutzer
Li Du
Shanghang Zhang
MQ
19
10
0
06 Dec 2022
Post-training Quantization on Diffusion Models
Yuzhang Shang
Zhihang Yuan
Bin Xie
Bingzhe Wu
Yan Yan
DiffM
MQ
15
156
0
28 Nov 2022
Long-Range Zero-Shot Generative Deep Network Quantization
Yan Luo
Yangcheng Gao
Zhao Zhang
Haijun Zhang
Mingliang Xu
Meng Wang
MQ
23
9
0
13 Nov 2022
Efficient and Accurate Quantized Image Super-Resolution on Mobile NPUs, Mobile AI & AIM 2022 challenge: Report
Andrey D. Ignatov
Radu Timofte
Maurizio Denna
Abdelbadie Younes
Ganzorig Gankhuyag
...
Jing Liu
Garas Gendy
Nabil Sabor
J. Hou
Guanghui He
SupR
MQ
18
31
0
07 Nov 2022
Outlier Suppression: Pushing the Limit of Low-bit Transformer Language Models
Xiuying Wei
Yunchen Zhang
Xiangguo Zhang
Ruihao Gong
Shanghang Zhang
Qi Zhang
F. Yu
Xianglong Liu
MQ
22
145
0
27 Sep 2022
PSAQ-ViT V2: Towards Accurate and General Data-Free Quantization for Vision Transformers
Zhikai Li
Mengjuan Chen
Junrui Xiao
Qingyi Gu
ViT
MQ
43
32
0
13 Sep 2022
Efficient Adaptive Activation Rounding for Post-Training Quantization
Zhengyi Li
Cong Guo
Zhanda Zhu
Yangjie Zhou
Yuxian Qiu
Xiaotian Gao
Jingwen Leng
Minyi Guo
MQ
25
3
0
25 Aug 2022
SVD-NAS: Coupling Low-Rank Approximation and Neural Architecture Search
Zhewen Yu
C. Bouganis
8
4
0
22 Aug 2022
FP8 Quantization: The Power of the Exponent
Andrey Kuzmin
M. V. Baalen
Yuwei Ren
Markus Nagel
Jorn W. T. Peters
Tijmen Blankevoort
MQ
8
78
0
19 Aug 2022
Mixed-Precision Neural Networks: A Survey
M. Rakka
M. Fouda
Pramod P. Khargonekar
Fadi J. Kurdahi
MQ
18
11
0
11 Aug 2022
Symmetry Regularization and Saturating Nonlinearity for Robust Quantization
Sein Park
Yeongsang Jang
Eunhyeok Park
MQ
14
1
0
31 Jul 2022
ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers
Z. Yao
Reza Yazdani Aminabadi
Minjia Zhang
Xiaoxia Wu
Conglong Li
Yuxiong He
VLM
MQ
39
438
0
04 Jun 2022
OPQ: Compressing Deep Neural Networks with One-shot Pruning-Quantization
Peng Hu
Xi Peng
Hongyuan Zhu
M. Aly
Jie Lin
MQ
31
59
0
23 May 2022
RAPQ: Rescuing Accuracy for Power-of-Two Low-bit Post-training Quantization
Hongyi Yao
Pu Li
Jian Cao
Xiangcheng Liu
Chenying Xie
Bin Wang
MQ
11
12
0
26 Apr 2022
Intelligence at the Extreme Edge: A Survey on Reformable TinyML
Visal Rajapakse
Ishan Karunanayake
Nadeem Ahmed
SyDa
23
53
0
02 Apr 2022