ZeroQ: A Novel Zero Shot Quantization Framework


Computer Vision and Pattern Recognition (CVPR), 2020
1 January 2020
Yaohui Cai
Z. Yao
Zhen Dong
A. Gholami
Michael W. Mahoney
Kurt Keutzer
    MQ
ArXiv (abs) · PDF · HTML · Github (279★)

Papers citing "ZeroQ: A Novel Zero Shot Quantization Framework"

50 / 249 papers shown
Title
An Analysis on Quantizing Diffusion Transformers
Yuewei Yang
Jialiang Wang
Xiaoliang Dai
Peizhao Zhang
Hongbo Zhang
MQ
230
2
0
16 Jun 2024
Low-Rank Quantization-Aware Training for LLMs
Yelysei Bondarenko
Riccardo Del Chiaro
Markus Nagel
MQ
264
36
0
10 Jun 2024
ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
Tianchen Zhao
Tongcheng Fang
Haofeng Huang
Enshu Liu
Widyadewi Soedarmadji
...
Shengen Yan
Huazhong Yang
Xuefei Ning
Yu Wang
MQ
VGen
400
60
0
04 Jun 2024
Robust Knowledge Distillation Based on Feature Variance Against Backdoored Teacher Model
Jinyin Chen
Xiaoming Zhao
Haibin Zheng
Xiao Li
Sheng Xiang
Haifeng Guo
AAML
124
7
0
01 Jun 2024
LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Yazhe Niu
Yang Yong
Shiqiao Gu
Yushi Huang
Chentao Lv
Yunchen Zhang
Xianglong Liu
Dacheng Tao
MQ
305
22
0
09 May 2024
Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs
International Conference on Machine Learning (ICML), 2024
Jordan Dotzel
Yuzong Chen
Bahaa Kotb
Sushma Prasad
Gang Wu
Sheng Li
Mohamed S. Abdelfattah
Zhiru Zhang
255
16
0
06 May 2024
Lightweight Deep Learning for Resource-Constrained Environments: A Survey
Hou-I Liu
Marco Galindo
Hongxia Xie
Lai-Kuan Wong
Hong-Han Shuai
Yung-Hui Li
Wen-Huang Cheng
279
138
0
08 Apr 2024
DNN Memory Footprint Reduction via Post-Training Intra-Layer Multi-Precision Quantization
IEEE International Symposium on Quality Electronic Design (ISQED), 2024
B. Ghavami
Amin Kamjoo
Lesley Shannon
S. Wilton
MQ
138
0
0
03 Apr 2024
QNCD: Quantization Noise Correction for Diffusion Models
Huanpeng Chu
Wei Wu
Chengjie Zang
Kun Yuan
DiffM
MQ
321
13
0
28 Mar 2024
On the Impact of Black-box Deployment Strategies for Edge AI on Latency and Model Performance
Jaskirat Singh
Emad Fallahzadeh
Bram Adams
Ahmed E. Hassan
MQ
339
5
0
25 Mar 2024
AffineQuant: Affine Transformation Quantization for Large Language Models
Yuexiao Ma
Huixia Li
Xiawu Zheng
Feng Ling
Xuefeng Xiao
Rui Wang
Shilei Wen
Jiayi Ji
Rongrong Ji
MQ
212
42
0
19 Mar 2024
PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation
European Conference on Computer Vision (ECCV), 2024
Yizhe Xiong
Hui Chen
Tianxiang Hao
Zijia Lin
Jungong Han
Yuesong Zhang
Guoxin Wang
Yongjun Bao
Guiguang Ding
234
24
0
14 Mar 2024
COMQ: A Backpropagation-Free Algorithm for Post-Training Quantization
IEEE Access (IEEE Access), 2024
Aozhong Zhang
Zi Yang
Naigang Wang
Yingyong Qin
Jack Xin
Xin Li
Penghang Yin
VLM
MQ
142
11
0
11 Mar 2024
Self-Adapting Large Visual-Language Models to Edge Devices across Visual Modalities
European Conference on Computer Vision (ECCV), 2024
Kaiwen Cai
Zhekai Duan
Gaowen Liu
Charles Fleming
Chris Xiaoxuan Lu
VLM
193
8
0
07 Mar 2024
Ef-QuantFace: Streamlined Face Recognition with Small Data and Low-Bit Precision
William Gazali
Jocelyn Michelle Kho
Joshua Santoso
Williem
CVBM
MQ
177
0
0
28 Feb 2024
Understanding Neural Network Binarization with Forward and Backward Proximal Quantizers
Yiwei Lu
Yaoliang Yu
Xinlin Li
Vahid Partovi Nia
MQ
177
5
0
27 Feb 2024
Outlier-Aware Training for Low-Bit Quantization of Structural Re-Parameterized Networks
Muqun Niu
Yuan Ren
Boyu Li
Chenchen Ding
BDL
MQ
172
0
0
11 Feb 2024
ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks
Xiaoxia Wu
Haojun Xia
Stephen Youn
Zhen Zheng
Shiyang Chen
...
Reza Yazdani Aminabadi
Yuxiong He
Olatunji Ruwase
Leon Song
Zhewei Yao
234
15
0
14 Dec 2023
CBQ: Cross-Block Quantization for Large Language Models
International Conference on Learning Representations (ICLR), 2023
Xin Ding
Xiaoyu Liu
Zhijun Tu
Yun-feng Zhang
Wei Li
...
Hanting Chen
Yehui Tang
Zhiwei Xiong
Baoqun Yin
Yunhe Wang
MQ
689
27
0
13 Dec 2023
Efficient Quantization Strategies for Latent Diffusion Models
Yuewei Yang
Xiaoliang Dai
Jialiang Wang
Peizhao Zhang
Hongbo Zhang
DiffM
MQ
238
15
0
09 Dec 2023
GenQ: Quantization in Low Data Regimes with Generative Synthetic Data
European Conference on Computer Vision (ECCV), 2023
Yuhang Li
Youngeun Kim
Donghyun Lee
Souvik Kundu
Priyadarshini Panda
MQ
281
6
0
07 Dec 2023
The Efficiency Spectrum of Large Language Models: An Algorithmic Survey
Tianyu Ding
Tianyi Chen
Haidong Zhu
Jiachen Jiang
Yiqi Zhong
Jinxin Zhou
Guangzhi Wang
Zhihui Zhu
Ilya Zharkov
Luming Liang
328
32
0
01 Dec 2023
PIPE : Parallelized Inference Through Post-Training Quantization Ensembling of Residual Expansions
Edouard Yvinec
Arnaud Dapogny
Kévin Bailly
MQ
244
0
0
27 Nov 2023
Efficient Neural Networks for Tiny Machine Learning: A Comprehensive Review
M. Lê
Pierre Wolinski
Julyan Arbel
223
16
0
20 Nov 2023
RepQ: Generalizing Quantization-Aware Training for Re-Parametrized Architectures
Anastasiia Prutianova
Alexey Zaytsev
Chung-Kuei Lee
Fengyu Sun
Ivan Koryakovskiy
MQ
128
0
0
09 Nov 2023
LLM-FP4: 4-Bit Floating-Point Quantized Transformers
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Shih-yang Liu
Zechun Liu
Xijie Huang
Pingcheng Dong
Kwang-Ting Cheng
MQ
176
85
0
25 Oct 2023
Zero-Shot Sharpness-Aware Quantization for Pre-trained Language Models
Miaoxi Zhu
Qihuang Zhong
Li Shen
Liang Ding
Juhua Liu
Bo Du
Dacheng Tao
MQ
VLM
132
2
0
20 Oct 2023
Robustness-Guided Image Synthesis for Data-Free Quantization
AAAI Conference on Artificial Intelligence (AAAI), 2023
Jianhong Bai
Yuchen Yang
Huanpeng Chu
Hualiang Wang
Zuo-Qiang Liu
Ruizhe Chen
Xiaoxuan He
Lianrui Mu
Chengfei Cai
Haoji Hu
DiffM
MQ
417
6
0
05 Oct 2023
SINF: Semantic Neural Network Inference with Semantic Subgraphs
Sazzad Sayyed
Jonathan D. Ashdown
211
0
0
02 Oct 2023
NAYER: Noisy Layer Data Generation for Efficient and Effective Data-free Knowledge Distillation
Computer Vision and Pattern Recognition (CVPR), 2023
Minh-Tuan Tran
Trung Le
Xuan-May Le
Mehrtash Harandi
Quan Hung Tran
Dinh Q. Phung
230
23
0
30 Sep 2023
MixQuant: Mixed Precision Quantization with a Bit-width Optimization Search
Yichen Xie
Wei Le
MQ
130
5
0
29 Sep 2023
Causal-DFQ: Causality Guided Data-free Network Quantization
IEEE International Conference on Computer Vision (ICCV), 2023
Yuzhang Shang
Bingxin Xu
Gaowen Liu
Ramana Rao Kompella
Yan Yan
MQ
CML
266
8
0
24 Sep 2023
EPTQ: Enhanced Post-Training Quantization via Label-Free Hessian
Ofir Gordon
H. Habi
Arnon Netzer
MQ
176
1
0
20 Sep 2023
SPFQ: A Stochastic Algorithm and Its Error Analysis for Neural Network Quantization
Jinjie Zhang
Rayan Saab
135
0
0
20 Sep 2023
On-Chip Hardware-Aware Quantization for Mixed Precision Neural Networks
Wei Huang
Haotong Qin
Yangdong Liu
Jingzhuo Liang
Yifu Ding
Ying Li
Xianglong Liu
MQ
335
2
0
05 Sep 2023
QD-BEV : Quantization-aware View-guided Distillation for Multi-view 3D Object Detection
IEEE International Conference on Computer Vision (ICCV), 2023
Yifan Zhang
Zhen Dong
Huanrui Yang
Ming Lu
Cheng-Ching Tseng
Yuan Du
Kurt Keutzer
Li Du
Shanghang Zhang
MQ
148
13
0
21 Aug 2023
Unified Data-Free Compression: Pruning and Quantization without Fine-Tuning
IEEE International Conference on Computer Vision (ICCV), 2023
Shipeng Bai
Jun Chen
Xintian Shen
Yixuan Qian
Yong Liu
MQ
218
20
0
14 Aug 2023
FLIQS: One-Shot Mixed-Precision Floating-Point and Integer Quantization Search
Jordan Dotzel
Gang Wu
Andrew Li
M. Umar
Yun Ni
...
Liqun Cheng
Martin G. Dixon
N. Jouppi
Quoc V. Le
Sheng Li
MQ
281
5
0
07 Aug 2023
MRQ:Support Multiple Quantization Schemes through Model Re-Quantization
Manasa Manohara
Sankalp Dayal
Tarqi Afzal
Rahul Bakshi
Kahkuen Fu
MQ
188
0
0
01 Aug 2023
A Model for Every User and Budget: Label-Free and Personalized Mixed-Precision Quantization
Interspeech (Interspeech), 2023
Edward Fish
Umberto Michieli
Mete Ozay
MQ
173
6
0
24 Jul 2023
EMQ: Evolving Training-free Proxies for Automated Mixed Precision Quantization
IEEE International Conference on Computer Vision (ICCV), 2023
Peijie Dong
Lujun Li
Zimian Wei
Xin-Yi Niu
Zhiliang Tian
H. Pan
MQ
199
47
0
20 Jul 2023
Pruning vs Quantization: Which is Better?
Neural Information Processing Systems (NeurIPS), 2023
Andrey Kuzmin
Markus Nagel
M. V. Baalen
Arash Behboodi
Tijmen Blankevoort
MQ
272
96
0
06 Jul 2023
Data-Free Quantization via Mixed-Precision Compensation without Fine-Tuning
Pattern Recognition (Pattern Recogn.), 2023
Jun Chen
Shipeng Bai
Tianxin Huang
Mengmeng Wang
Guanzhong Tian
Y. Liu
MQ
265
24
0
02 Jul 2023
Q-YOLO: Efficient Inference for Real-time Object Detection
Asian Conference on Pattern Recognition (ACPR), 2023
Mingze Wang
H. Sun
Jun Shi
Xuhui Liu
Baochang Zhang
Xianbin Cao
ObjD
135
15
0
01 Jul 2023
Adaptive Sharpness-Aware Pruning for Robust Sparse Networks
International Conference on Learning Representations (ICLR), 2023
Anna Bair
Hongxu Yin
Maying Shen
Pavlo Molchanov
J. Álvarez
240
17
0
25 Jun 2023
Quantizable Transformers: Removing Outliers by Helping Attention Heads Do Nothing
Neural Information Processing Systems (NeurIPS), 2023
Yelysei Bondarenko
Markus Nagel
Tijmen Blankevoort
MQ
274
120
0
22 Jun 2023
SqueezeLLM: Dense-and-Sparse Quantization
International Conference on Machine Learning (ICML), 2023
Sehoon Kim
Coleman Hooper
A. Gholami
Zhen Dong
Xiuyu Li
Sheng Shen
Michael W. Mahoney
Kurt Keutzer
MQ
338
253
0
13 Jun 2023
Augmenting Hessians with Inter-Layer Dependencies for Mixed-Precision Post-Training Quantization
Clemens J. S. Schaefer
Navid Lambert-Shirzad
Xiaofan Zhang
Chia-Wei Chou
T. Jablin
Jian Li
Elfie Guo
Caitlin Stanton
S. Joshi
Yu Emma Wang
MQ
207
4
0
08 Jun 2023
Towards Accurate Post-training Quantization for Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2023
Changyuan Wang
Ziwei Wang
Xiuwei Xu
Yansong Tang
Jie Zhou
Jiwen Lu
MQ
246
34
0
30 May 2023
LLM-QAT: Data-Free Quantization Aware Training for Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Zechun Liu
Barlas Oğuz
Changsheng Zhao
Ernie Chang
Pierre Stock
Yashar Mehdad
Yangyang Shi
Raghuraman Krishnamoorthi
Vikas Chandra
MQ
218
281
0
29 May 2023