ResearchTrend.AI
Post-training 4-bit quantization of convolution networks for rapid-deployment

2 October 2018
Ron Banner
Yury Nahshan
Elad Hoffer
Daniel Soudry
    MQ

Papers citing "Post-training 4-bit quantization of convolution networks for rapid-deployment"

42 / 42 papers shown
Title
Art and Science of Quantizing Large-Scale Models: A Comprehensive Overview
Yanshu Wang
Tong Yang
Xiyan Liang
Guoan Wang
Hanning Lu
Xu Zhe
Yaoming Li
Li Weitao
MQ
34
3
0
18 Sep 2024
ISQuant: apply squant to the real deployment
Dezan Zhao
MQ
19
0
0
05 Jul 2024
Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs
Jordan Dotzel
Yuzong Chen
Bahaa Kotb
Sushma Prasad
Gang Wu
Sheng R. Li
Mohamed S. Abdelfattah
Zhiru Zhang
26
8
0
06 May 2024
HEANA: A Hybrid Time-Amplitude Analog Optical Accelerator with Flexible Dataflows for Energy-Efficient CNN Inference
Sairam Sri Vatsavai
Venkata Sai Praneeth Karempudi
Ishan G. Thakkar
19
0
0
05 Feb 2024
Linear Log-Normal Attention with Unbiased Concentration
Yury Nahshan
Dor-Joseph Kampeas
E. Haleva
22
7
0
22 Nov 2023
Robustness-Guided Image Synthesis for Data-Free Quantization
Jianhong Bai
Yuchen Yang
Huanpeng Chu
Hualiang Wang
Zuo-Qiang Liu
Ruizhe Chen
Xiaoxuan He
Lianrui Mu
Chengfei Cai
Haoji Hu
DiffM
MQ
26
5
0
05 Oct 2023
Causal-DFQ: Causality Guided Data-free Network Quantization
Yuzhang Shang
Bingxin Xu
Gaowen Liu
Ramana Rao Kompella
Yan Yan
MQ
CML
16
8
0
24 Sep 2023
EPTQ: Enhanced Post-Training Quantization via Label-Free Hessian
Ofir Gordon
H. Habi
Arnon Netzer
MQ
33
1
0
20 Sep 2023
Augmenting Hessians with Inter-Layer Dependencies for Mixed-Precision Post-Training Quantization
Clemens J. S. Schaefer
Navid Lambert-Shirzad
Xiaofan Zhang
Chia-Wei Chou
T. Jablin
Jian Li
Elfie Guo
Caitlin Stanton
S. Joshi
Yu Emma Wang
MQ
28
2
0
08 Jun 2023
Outlier Suppression+: Accurate quantization of large language models by equivalent and optimal shifting and scaling
Xiuying Wei
Yunchen Zhang
Yuhang Li
Xiangguo Zhang
Ruihao Gong
Jian Ren
Zhengang Li
MQ
19
31
0
18 Apr 2023
ACQ: Improving Generative Data-free Quantization Via Attention Correction
Jixing Li
Xiaozhou Guo
Benzhe Dai
Guoliang Gong
Min Jin
Gang Chen
Wenyu Mao
Huaxiang Lu
MQ
30
4
0
18 Jan 2023
QFT: Post-training quantization via fast joint finetuning of all degrees of freedom
Alexander Finkelstein
Ella Fuchs
Idan Tal
Mark Grobman
Niv Vosco
Eldad Meller
MQ
21
6
0
05 Dec 2022
Partial Binarization of Neural Networks for Budget-Aware Efficient Learning
Udbhav Bamba
Neeraj Anand
Saksham Aggarwal
Dilip K Prasad
D. K. Gupta
MQ
8
0
0
12 Nov 2022
Empirical Evaluation of Post-Training Quantization Methods for Language Tasks
Ting Hu
Christoph Meinel
Haojin Yang
MQ
28
3
0
29 Oct 2022
Efficient Adaptive Activation Rounding for Post-Training Quantization
Zhengyi Li
Cong Guo
Zhanda Zhu
Yangjie Zhou
Yuxian Qiu
Xiaotian Gao
Jingwen Leng
Minyi Guo
MQ
25
3
0
25 Aug 2022
Symmetry Regularization and Saturating Nonlinearity for Robust Quantization
Sein Park
Yeongsang Jang
Eunhyeok Park
MQ
14
1
0
31 Jul 2022
Accelerating Deep Learning Model Inference on Arm CPUs with Ultra-Low Bit Quantization and Runtime
Saad Ashfaq
Mohammadhossein Askarihemmat
Sudhakar Sah
Ehsan Saboori
Olivier Mastropietro
Alexander Hoffman
BDL
MQ
13
4
0
18 Jul 2022
Wavelet Feature Maps Compression for Image-to-Image CNNs
Shahaf E. Finder
Yair Zohav
Maor Ashkenazi
Eran Treister
14
17
0
24 May 2022
FxP-QNet: A Post-Training Quantizer for the Design of Mixed Low-Precision DNNs with Dynamic Fixed-Point Representation
Ahmad Shawahna
S. M. Sait
A. El-Maleh
Irfan Ahmad
MQ
18
6
0
22 Mar 2022
UWC: Unit-wise Calibration Towards Rapid Network Compression
Chen Lin
Zheyang Li
Bo Peng
Haoji Hu
Wenming Tan
Ye Ren
Shiliang Pu
MQ
21
1
0
17 Jan 2022
Arch-Net: Model Distillation for Architecture Agnostic Model Deployment
Weixin Xu
Zipeng Feng
Shuangkang Fang
Song Yuan
Yi Yang
Shuchang Zhou
MQ
18
1
0
01 Nov 2021
Auto-Split: A General Framework of Collaborative Edge-Cloud AI
Amin Banitalebi-Dehkordi
Naveen Vedula
J. Pei
Fei Xia
Lanjun Wang
Yong Zhang
22
89
0
30 Aug 2021
AutoReCon: Neural Architecture Search-based Reconstruction for Data-free Compression
Baozhou Zhu
P. Hofstee
J. Peltenburg
Jinho Lee
Zaid Al-Ars
14
22
0
25 May 2021
AirNet: Neural Network Transmission over the Air
Mikolaj Jankowski
Deniz Gunduz
K. Mikolajczyk
60
1
0
24 May 2021
Lightweight Compression of Intermediate Neural Network Features for Collaborative Intelligence
R. Cohen
Hyomin Choi
Ivan V. Bajić
14
23
0
15 May 2021
Lightweight compression of neural network feature tensors for collaborative intelligence
R. Cohen
Hyomin Choi
Ivan V. Bajić
13
42
0
12 May 2021
Zero-shot Adversarial Quantization
Yuang Liu
Wei Zhang
Jun Wang
MQ
11
77
0
29 Mar 2021
Generative Zero-shot Network Quantization
Xiangyu He
Qinghao Hu
Peisong Wang
Jian Cheng
GAN
MQ
23
23
0
21 Jan 2021
Exploring Neural Networks Quantization via Layer-Wise Quantization Analysis
Shachar Gluska
Mark Grobman
MQ
14
5
0
15 Dec 2020
Filter Pre-Pruning for Improved Fine-tuning of Quantized Deep Neural Networks
Jun Nishikawa
Ryoji Ikegaya
MQ
18
1
0
13 Nov 2020
Dual Precision Deep Neural Network
J. Park
J. Choi
J. Ko
6
1
0
02 Sep 2020
EasyQuant: Post-training Quantization via Scale Optimization
Di Wu
Qingming Tang
Yongle Zhao
Ming Zhang
Ying Fu
Debing Zhang
MQ
19
75
0
30 Jun 2020
Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming
Itay Hubara
Yury Nahshan
Y. Hanani
Ron Banner
Daniel Soudry
MQ
21
122
0
14 Jun 2020
An Overview of Neural Network Compression
James O'Neill
AI4CE
45
98
0
05 Jun 2020
Generative Low-bitwidth Data Free Quantization
Shoukai Xu
Haokun Li
Bohan Zhuang
Jing Liu
Jiezhang Cao
Chuangrun Liang
Mingkui Tan
MQ
13
126
0
07 Mar 2020
Towards Unified INT8 Training for Convolutional Neural Network
Feng Zhu
Ruihao Gong
F. Yu
Xianglong Liu
Yanfei Wang
Zhelong Li
Xiuqi Yang
Junjie Yan
MQ
27
151
0
29 Dec 2019
Loss Aware Post-training Quantization
Yury Nahshan
Brian Chmiel
Chaim Baskin
Evgenii Zheltonozhskii
Ron Banner
A. Bronstein
A. Mendelson
MQ
17
163
0
17 Nov 2019
Post-Training 4-bit Quantization on Embedding Tables
Hui Guan
Andrey Malevich
Jiyan Yang
Jongsoo Park
Hector Yuen
MQ
11
31
0
05 Nov 2019
Improving Noise Tolerance of Mixed-Signal Neural Networks
M. Klachko
M. Mahmoodi
D. Strukov
6
29
0
02 Apr 2019
Low-bit Quantization of Neural Networks for Efficient Inference
Yoni Choukroun
Eli Kravchik
Fan Yang
P. Kisilev
MQ
16
355
0
18 Feb 2019
Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorization
Eldad Meller
Alexander Finkelstein
Uri Almog
Mark Grobman
MQ
11
85
0
05 Feb 2019
Improving Neural Network Quantization without Retraining using Outlier Channel Splitting
Ritchie Zhao
Yuwei Hu
Jordan Dotzel
Christopher De Sa
Zhiru Zhang
OODD
MQ
33
304
0
28 Jan 2019