1810.05723
Cited By
Post-training 4-bit quantization of convolution networks for rapid-deployment
2 October 2018
Ron Banner
Yury Nahshan
Elad Hoffer
Daniel Soudry
MQ
Papers citing
"Post-training 4-bit quantization of convolution networks for rapid-deployment"
42 / 42 papers shown
Art and Science of Quantizing Large-Scale Models: A Comprehensive Overview
Yanshu Wang
Tong Yang
Xiyan Liang
Guoan Wang
Hanning Lu
Xu Zhe
Yaoming Li
Li Weitao
MQ
34
3
0
18 Sep 2024
ISQuant: apply squant to the real deployment
Dezan Zhao
MQ
19
0
0
05 Jul 2024
Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs
Jordan Dotzel
Yuzong Chen
Bahaa Kotb
Sushma Prasad
Gang Wu
Sheng R. Li
Mohamed S. Abdelfattah
Zhiru Zhang
26
8
0
06 May 2024
HEANA: A Hybrid Time-Amplitude Analog Optical Accelerator with Flexible Dataflows for Energy-Efficient CNN Inference
Sairam Sri Vatsavai
Venkata Sai Praneeth Karempudi
Ishan G. Thakkar
19
0
0
05 Feb 2024
Linear Log-Normal Attention with Unbiased Concentration
Yury Nahshan
Dor-Joseph Kampeas
E. Haleva
22
7
0
22 Nov 2023
Robustness-Guided Image Synthesis for Data-Free Quantization
Jianhong Bai
Yuchen Yang
Huanpeng Chu
Hualiang Wang
Zuo-Qiang Liu
Ruizhe Chen
Xiaoxuan He
Lianrui Mu
Chengfei Cai
Haoji Hu
DiffM
MQ
26
5
0
05 Oct 2023
Causal-DFQ: Causality Guided Data-free Network Quantization
Yuzhang Shang
Bingxin Xu
Gaowen Liu
Ramana Rao Kompella
Yan Yan
MQ
CML
16
8
0
24 Sep 2023
EPTQ: Enhanced Post-Training Quantization via Label-Free Hessian
Ofir Gordon
H. Habi
Arnon Netzer
MQ
33
1
0
20 Sep 2023
Augmenting Hessians with Inter-Layer Dependencies for Mixed-Precision Post-Training Quantization
Clemens J. S. Schaefer
Navid Lambert-Shirzad
Xiaofan Zhang
Chia-Wei Chou
T. Jablin
Jian Li
Elfie Guo
Caitlin Stanton
S. Joshi
Yu Emma Wang
MQ
28
2
0
08 Jun 2023
Outlier Suppression+: Accurate quantization of large language models by equivalent and optimal shifting and scaling
Xiuying Wei
Yunchen Zhang
Yuhang Li
Xiangguo Zhang
Ruihao Gong
Jian Ren
Zhengang Li
MQ
19
31
0
18 Apr 2023
ACQ: Improving Generative Data-free Quantization Via Attention Correction
Jixing Li
Xiaozhou Guo
Benzhe Dai
Guoliang Gong
Min Jin
Gang Chen
Wenyu Mao
Huaxiang Lu
MQ
30
4
0
18 Jan 2023
QFT: Post-training quantization via fast joint finetuning of all degrees of freedom
Alexander Finkelstein
Ella Fuchs
Idan Tal
Mark Grobman
Niv Vosco
Eldad Meller
MQ
21
6
0
05 Dec 2022
Partial Binarization of Neural Networks for Budget-Aware Efficient Learning
Udbhav Bamba
Neeraj Anand
Saksham Aggarwal
Dilip K Prasad
D. K. Gupta
MQ
8
0
0
12 Nov 2022
Empirical Evaluation of Post-Training Quantization Methods for Language Tasks
Ting Hu
Christoph Meinel
Haojin Yang
MQ
28
3
0
29 Oct 2022
Efficient Adaptive Activation Rounding for Post-Training Quantization
Zhengyi Li
Cong Guo
Zhanda Zhu
Yangjie Zhou
Yuxian Qiu
Xiaotian Gao
Jingwen Leng
Minyi Guo
MQ
25
3
0
25 Aug 2022
Symmetry Regularization and Saturating Nonlinearity for Robust Quantization
Sein Park
Yeongsang Jang
Eunhyeok Park
MQ
14
1
0
31 Jul 2022
Accelerating Deep Learning Model Inference on Arm CPUs with Ultra-Low Bit Quantization and Runtime
Saad Ashfaq
Mohammadhossein Askarihemmat
Sudhakar Sah
Ehsan Saboori
Olivier Mastropietro
Alexander Hoffman
BDL
MQ
13
4
0
18 Jul 2022
Wavelet Feature Maps Compression for Image-to-Image CNNs
Shahaf E. Finder
Yair Zohav
Maor Ashkenazi
Eran Treister
14
17
0
24 May 2022
FxP-QNet: A Post-Training Quantizer for the Design of Mixed Low-Precision DNNs with Dynamic Fixed-Point Representation
Ahmad Shawahna
S. M. Sait
A. El-Maleh
Irfan Ahmad
MQ
18
6
0
22 Mar 2022
UWC: Unit-wise Calibration Towards Rapid Network Compression
Chen Lin
Zheyang Li
Bo Peng
Haoji Hu
Wenming Tan
Ye Ren
Shiliang Pu
MQ
21
1
0
17 Jan 2022
Arch-Net: Model Distillation for Architecture Agnostic Model Deployment
Weixin Xu
Zipeng Feng
Shuangkang Fang
Song Yuan
Yi Yang
Shuchang Zhou
MQ
18
1
0
01 Nov 2021
Auto-Split: A General Framework of Collaborative Edge-Cloud AI
Amin Banitalebi-Dehkordi
Naveen Vedula
J. Pei
Fei Xia
Lanjun Wang
Yong Zhang
22
89
0
30 Aug 2021
AutoReCon: Neural Architecture Search-based Reconstruction for Data-free Compression
Baozhou Zhu
P. Hofstee
J. Peltenburg
Jinho Lee
Zaid Al-Ars
14
22
0
25 May 2021
AirNet: Neural Network Transmission over the Air
Mikolaj Jankowski
Deniz Gunduz
K. Mikolajczyk
60
1
0
24 May 2021
Lightweight Compression of Intermediate Neural Network Features for Collaborative Intelligence
R. Cohen
Hyomin Choi
Ivan V. Bajić
14
23
0
15 May 2021
Lightweight compression of neural network feature tensors for collaborative intelligence
R. Cohen
Hyomin Choi
Ivan V. Bajić
13
42
0
12 May 2021
Zero-shot Adversarial Quantization
Yuang Liu
Wei Zhang
Jun Wang
MQ
11
77
0
29 Mar 2021
Generative Zero-shot Network Quantization
Xiangyu He
Qinghao Hu
Peisong Wang
Jian Cheng
GAN
MQ
23
23
0
21 Jan 2021
Exploring Neural Networks Quantization via Layer-Wise Quantization Analysis
Shachar Gluska
Mark Grobman
MQ
14
5
0
15 Dec 2020
Filter Pre-Pruning for Improved Fine-tuning of Quantized Deep Neural Networks
Jun Nishikawa
Ryoji Ikegaya
MQ
18
1
0
13 Nov 2020
Dual Precision Deep Neural Network
J. Park
J. Choi
J. Ko
6
1
0
02 Sep 2020
EasyQuant: Post-training Quantization via Scale Optimization
Di Wu
Qingming Tang
Yongle Zhao
Ming Zhang
Ying Fu
Debing Zhang
MQ
19
75
0
30 Jun 2020
Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming
Itay Hubara
Yury Nahshan
Y. Hanani
Ron Banner
Daniel Soudry
MQ
21
122
0
14 Jun 2020
An Overview of Neural Network Compression
James O'Neill
AI4CE
45
98
0
05 Jun 2020
Generative Low-bitwidth Data Free Quantization
Shoukai Xu
Haokun Li
Bohan Zhuang
Jing Liu
Jiezhang Cao
Chuangrun Liang
Mingkui Tan
MQ
13
126
0
07 Mar 2020
Towards Unified INT8 Training for Convolutional Neural Network
Feng Zhu
Ruihao Gong
F. Yu
Xianglong Liu
Yanfei Wang
Zhelong Li
Xiuqi Yang
Junjie Yan
MQ
27
151
0
29 Dec 2019
Loss Aware Post-training Quantization
Yury Nahshan
Brian Chmiel
Chaim Baskin
Evgenii Zheltonozhskii
Ron Banner
A. Bronstein
A. Mendelson
MQ
17
163
0
17 Nov 2019
Post-Training 4-bit Quantization on Embedding Tables
Hui Guan
Andrey Malevich
Jiyan Yang
Jongsoo Park
Hector Yuen
MQ
11
31
0
05 Nov 2019
Improving Noise Tolerance of Mixed-Signal Neural Networks
M. Klachko
M. Mahmoodi
D. Strukov
6
29
0
02 Apr 2019
Low-bit Quantization of Neural Networks for Efficient Inference
Yoni Choukroun
Eli Kravchik
Fan Yang
P. Kisilev
MQ
16
355
0
18 Feb 2019
Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorization
Eldad Meller
Alexander Finkelstein
Uri Almog
Mark Grobman
MQ
11
85
0
05 Feb 2019
Improving Neural Network Quantization without Retraining using Outlier Channel Splitting
Ritchie Zhao
Yuwei Hu
Jordan Dotzel
Christopher De Sa
Zhiru Zhang
OODD
MQ
33
304
0
28 Jan 2019