arXiv: 1901.09504
Improving Neural Network Quantization without Retraining using Outlier Channel Splitting

28 January 2019
Ritchie Zhao, Yuwei Hu, Jordan Dotzel, Christopher De Sa, Zhiru Zhang
Tags: OODD, MQ
ArXiv · PDF · HTML
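
For readers skimming the list below, a rough sketch of the headline technique may be useful. Outlier channel splitting (OCS) rests on the identity w·x = (w/2)·x + (w/2)·x: a channel containing outlier values is duplicated and both copies are halved, so the layer's output is unchanged while the value range a quantizer must cover shrinks. The Python snippet below is only a minimal illustration of that identity on a toy weight matrix, not the authors' code; the function name `split_outlier_channels`, the peak-magnitude channel ranking, and the choice to split weight (rather than activation) channels are assumptions made for the example.

```python
import numpy as np

def split_outlier_channels(W: np.ndarray, k: int = 1) -> np.ndarray:
    """Duplicate-and-halve the k input channels of W with the largest
    absolute weights (illustrative sketch only, not the paper's code).

    W has shape (out_features, in_features). The result has in_features + k
    columns; paired with an input whose outlier entries are duplicated, it
    reproduces W @ x exactly while shrinking the largest weight magnitudes
    a quantizer has to represent.
    """
    W = W.copy()
    channel_peak = np.abs(W).max(axis=0)        # per-input-channel max |w|
    outliers = np.argsort(channel_peak)[-k:]    # indices of the k outlier channels
    halves = W[:, outliers] / 2.0
    W[:, outliers] = halves                     # first copy, halved in place
    return np.concatenate([W, halves], axis=1)  # second halved copy appended

# Toy check of the functional-equivalence identity.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))
x = rng.normal(size=8)
k = 2
outliers = np.argsort(np.abs(W).max(axis=0))[-k:]
W_split = split_outlier_channels(W, k)
x_split = np.concatenate([x, x[outliers]])      # duplicate the outlier inputs
assert np.allclose(W @ x, W_split @ x_split)
```

The obvious cost is that splitting adds channels (and therefore parameters and compute), so in practice only a small fraction of the most extreme channels would be split.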

Papers citing "Improving Neural Network Quantization without Retraining using Outlier Channel Splitting"

Showing 50 of 174 citing papers.

Outlier Suppression+: Accurate quantization of large language models by equivalent and optimal shifting and scaling
Xiuying Wei, Yunchen Zhang, Yuhang Li, Xiangguo Zhang, Ruihao Gong, Jian Ren, Zhengang Li
MQ · 19 · 31 · 0 · 18 Apr 2023

RPTQ: Reorder-based Post-training Quantization for Large Language Models
Zhihang Yuan, Lin Niu, Jia-Wen Liu, Wenyu Liu, Xinggang Wang, Yuzhang Shang, Guangyu Sun, Qiang Wu, Jiaxiang Wu, Bingzhe Wu
MQ · 29 · 78 · 0 · 03 Apr 2023

Q-HyViT: Post-Training Quantization of Hybrid Vision Transformers with Bridge Block Reconstruction for IoT Systems
Jemin Lee, Yongin Kwon, Sihyeong Park, Misun Yu, Jeman Park, Hwanjun Song
ViT, MQ · 14 · 5 · 0 · 22 Mar 2023

Efficient Transformer-based 3D Object Detection with Dynamic Token Halting
Mao Ye, Gregory P. Meyer, Yuning Chai, Qiang Liu
32 · 8 · 0 · 09 Mar 2023

Rotation Invariant Quantization for Model Compression
Dor-Joseph Kampeas, Yury Nahshan, Hanoch Kremer, Gil Lederman, Shira Zaloshinski, Zheng Li, E. Haleva
MQ · 16 · 0 · 0 · 03 Mar 2023

Q-Diffusion: Quantizing Diffusion Models
Xiuyu Li, Yijia Liu, Long Lian, Hua Yang, Zhen Dong, Daniel Kang, Shanghang Zhang, Kurt Keutzer
DiffM, MQ · 34 · 152 · 0 · 08 Feb 2023

PowerQuant: Automorphism Search for Non-Uniform Quantization
Edouard Yvinec, Arnaud Dapogny, Matthieu Cord, Kévin Bailly
MQ · 15 · 15 · 0 · 24 Jan 2023

ACQ: Improving Generative Data-free Quantization Via Attention Correction
Jixing Li, Xiaozhou Guo, Benzhe Dai, Guoliang Gong, Min Jin, Gang Chen, Wenyu Mao, Huaxiang Lu
MQ · 30 · 4 · 0 · 18 Jan 2023

Automatic Network Adaptation for Ultra-Low Uniform-Precision Quantization
Seongmin Park, Beomseok Kwon, Jieun Lim, Kyuyoung Sim, Taeho Kim, Jungwook Choi
MQ · 6 · 1 · 0 · 21 Dec 2022

PD-Quant: Post-Training Quantization based on Prediction Difference Metric
Jiawei Liu, Lin Niu, Zhihang Yuan, Dawei Yang, Xinggang Wang, Wenyu Liu
MQ · 96 · 68 · 0 · 14 Dec 2022

QFT: Post-training quantization via fast joint finetuning of all degrees of freedom
Alexander Finkelstein, Ella Fuchs, Idan Tal, Mark Grobman, Niv Vosco, Eldad Meller
MQ · 21 · 6 · 0 · 05 Dec 2022

Quadapter: Adapter for GPT-2 Quantization
Minseop Park, J. You, Markus Nagel, Simyung Chang
MQ · 21 · 9 · 0 · 30 Nov 2022

SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
Guangxuan Xiao, Ji Lin, Mickael Seznec, Hao Wu, Julien Demouth, Song Han
MQ · 61 · 731 · 0 · 18 Nov 2022

QuantPipe: Applying Adaptive Post-Training Quantization for Distributed Transformer Pipelines in Dynamic Edge Environments
Hong Wang, Connor Imes, Souvik Kundu, P. Beerel, S. Crago, J. Walters
MQ · 10 · 7 · 0 · 08 Nov 2022

Empirical Evaluation of Post-Training Quantization Methods for Language Tasks
Ting Hu, Christoph Meinel, Haojin Yang
MQ · 28 · 3 · 0 · 29 Oct 2022

AlphaTuning: Quantization-Aware Parameter-Efficient Adaptation of Large-Scale Pre-Trained Language Models
S. Kwon, Jeonghoon Kim, Jeongin Bae, Kang Min Yoo, Jin-Hwa Kim, Baeseong Park, Byeongwook Kim, Jung-Woo Ha, Nako Sung, Dongsoo Lee
MQ · 21 · 30 · 0 · 08 Oct 2022

Approximate Computing and the Efficient Machine Learning Expedition
J. Henkel, Hai Helen Li, A. Raghunathan, M. Tahoori, Swagath Venkataramani, Xiaoxuan Yang, Georgios Zervakis
11 · 17 · 0 · 02 Oct 2022

A simple approach for quantizing neural networks
J. Maly, Rayan Saab
MQ · 22 · 11 · 0 · 07 Sep 2022

Efficient Adaptive Activation Rounding for Post-Training Quantization
Zhengyi Li, Cong Guo, Zhanda Zhu, Yangjie Zhou, Yuxian Qiu, Xiaotian Gao, Jingwen Leng, Minyi Guo
MQ · 25 · 3 · 0 · 25 Aug 2022

Context-Aware Streaming Perception in Dynamic Environments
Gur-Eyal Sela, Ionel Gog, J. Wong, Kumar Krishna Agrawal, Xiangxi Mo, ..., Eric Leong, Xin Wang, Bharathan Balaji, Joseph E. Gonzalez, Ion Stoica
10 · 9 · 0 · 16 Aug 2022

Combining Gradients and Probabilities for Heterogeneous Approximation of Neural Networks
E. Trommer, Bernd Waschneck, Akash Kumar
13 · 6 · 0 · 15 Aug 2022

Mixed-Precision Neural Networks: A Survey
M. Rakka, M. Fouda, Pramod P. Khargonekar, Fadi J. Kurdahi
MQ · 18 · 11 · 0 · 11 Aug 2022

Symmetry Regularization and Saturating Nonlinearity for Robust Quantization
Sein Park, Yeongsang Jang, Eunhyeok Park
MQ · 14 · 1 · 0 · 31 Jul 2022

LUT-GEMM: Quantized Matrix Multiplication based on LUTs for Efficient Inference in Large-Scale Generative Language Models
Gunho Park, Baeseong Park, Minsub Kim, Sungjae Lee, Jeonghoon Kim, Beomseok Kwon, S. Kwon, Byeongwook Kim, Youngjoo Lee, Dongsoo Lee
MQ · 13 · 73 · 0 · 20 Jun 2022

LilNetX: Lightweight Networks with EXtreme Model Compression and Structured Sparsification
Sharath Girish, Kamal Gupta, Saurabh Singh, Abhinav Shrivastava
28 · 11 · 0 · 06 Apr 2022

It's All In the Teacher: Zero-Shot Quantization Brought Closer to the Teacher
Kanghyun Choi, Hye Yoon Lee, Deokki Hong, Joonsang Yu, Noseong Park, Youngsok Kim, Jinho Lee
MQ · 17 · 31 · 0 · 31 Mar 2022

A Fast Post-Training Pruning Framework for Transformers
Woosuk Kwon, Sehoon Kim, Michael W. Mahoney, Joseph Hassoun, Kurt Keutzer, A. Gholami
18 · 143 · 0 · 29 Mar 2022

To Fold or Not to Fold: a Necessary and Sufficient Condition on Batch-Normalization Layers Folding
Edouard Yvinec, Arnaud Dapogny, Kévin Bailly
27 · 7 · 0 · 28 Mar 2022

REx: Data-Free Residual Quantization Error Expansion
Edouard Yvinec, Arnaud Dapogny, Matthieu Cord, Kévin Bailly
MQ · 26 · 8 · 0 · 28 Mar 2022

SPIQ: Data-Free Per-Channel Static Input Quantization
Edouard Yvinec, Arnaud Dapogny, Matthieu Cord, Kévin Bailly
MQ · 11 · 18 · 0 · 28 Mar 2022

An Empirical Study of Low Precision Quantization for TinyML
Shaojie Zhuo, Hongyu Chen, R. Ramakrishnan, Tommy Chen, Chen Feng, Yi-Rung Lin, Parker Zhang, Liang Shen
MQ · 29 · 13 · 0 · 10 Mar 2022

SQuant: On-the-Fly Data-Free Quantization via Diagonal Hessian Approximation
Cong Guo, Yuxian Qiu, Jingwen Leng, Xiaotian Gao, Chen Zhang, Yunxin Liu, Fan Yang, Yuhao Zhu, Minyi Guo
MQ · 63 · 70 · 0 · 14 Feb 2022

Quantune: Post-training Quantization of Convolutional Neural Networks using Extreme Gradient Boosting for Fast Deployment
Jemin Lee, Misun Yu, Yongin Kwon, Teaho Kim
MQ · 17 · 17 · 0 · 10 Feb 2022

Post-training Quantization for Neural Networks with Provable Guarantees
Jinjie Zhang, Yixuan Zhou, Rayan Saab
MQ · 18 · 31 · 0 · 26 Jan 2022

Q-ViT: Fully Differentiable Quantization for Vision Transformer
Zhexin Li, Tong Yang, Peisong Wang, Jian Cheng
ViT, MQ · 23 · 41 · 0 · 19 Jan 2022

UWC: Unit-wise Calibration Towards Rapid Network Compression
Chen Lin, Zheyang Li, Bo Peng, Haoji Hu, Wenming Tan, Ye Ren, Shiliang Pu
MQ · 19 · 1 · 0 · 17 Jan 2022

Training Quantized Deep Neural Networks via Cooperative Coevolution
Fu Peng, Shengcai Liu, Ning Lu, Ke Tang
MQ · 16 · 1 · 0 · 23 Dec 2021

A Generalized Zero-Shot Quantization of Deep Convolutional Neural Networks via Learned Weights Statistics
Prasen Kumar Sharma, Arun Abraham, V. N. Rajendiran
MQ · 25 · 7 · 0 · 06 Dec 2021

HERO: Hessian-Enhanced Robust Optimization for Unifying and Improving Generalization and Quantization Performance
Huanrui Yang, Xiaoxuan Yang, Neil Zhenqiang Gong, Yiran Chen
MQ · 6 · 10 · 0 · 23 Nov 2021

Edge-Cloud Polarization and Collaboration: A Comprehensive Survey for AI
Jiangchao Yao, Shengyu Zhang, Yang Yao, Feng Wang, Jianxin Ma, ..., Kun Kuang, Chao-Xiang Wu, Fei Wu, Jingren Zhou, Hongxia Yang
16 · 91 · 0 · 11 Nov 2021

MQBench: Towards Reproducible and Deployable Model Quantization Benchmark
Yuhang Li, Mingzhu Shen, Jian Ma, Yan Ren, Mingxin Zhao, Qi Zhang, Ruihao Gong, F. Yu, Junjie Yan
MQ · 35 · 49 · 0 · 05 Nov 2021

Qimera: Data-free Quantization with Synthetic Boundary Supporting Samples
Kanghyun Choi, Deokki Hong, Noseong Park, Youngsok Kim, Jinho Lee
MQ · 14 · 64 · 0 · 04 Nov 2021

Arch-Net: Model Distillation for Architecture Agnostic Model Deployment
Weixin Xu, Zipeng Feng, Shuangkang Fang, Song Yuan, Yi Yang, Shuchang Zhou
MQ · 16 · 1 · 0 · 01 Nov 2021

Qu-ANTI-zation: Exploiting Quantization Artifacts for Achieving Adversarial Outcomes
Sanghyun Hong, Michael-Andrei Panaitescu-Liess, Yigitcan Kaya, Tudor Dumitras
MQ · 52 · 13 · 0 · 26 Oct 2021

Applications and Techniques for Fast Machine Learning in Science
A. Deiana, Nhan Tran, Joshua C. Agar, Michaela Blott, G. D. Guglielmo, ..., Ashish Sharma, S. Summers, Pietro Vischia, J. Vlimant, Olivia Weng
8 · 71 · 0 · 25 Oct 2021

Towards Efficient Post-training Quantization of Pre-trained Language Models
Haoli Bai, Lu Hou, Lifeng Shang, Xin Jiang, Irwin King, M. Lyu
MQ · 71 · 47 · 0 · 30 Sep 2021

RED++ : Data-Free Pruning of Deep Neural Networks via Input Splitting and Output Merging
Edouard Yvinec, Arnaud Dapogny, Matthieu Cord, Kévin Bailly
12 · 15 · 0 · 30 Sep 2021

HPTQ: Hardware-Friendly Post Training Quantization
H. Habi, Reuven Peretz, Elad Cohen, Lior Dikstein, Oranit Dror, I. Diamant, Roy H. Jennings, Arnon Netzer
MQ · 26 · 8 · 0 · 19 Sep 2021

Diverse Sample Generation: Pushing the Limit of Generative Data-free Quantization
Haotong Qin, Yifu Ding, Xiangguo Zhang, Jiakai Wang, Xianglong Liu, Jiwen Lu
DiffM, MQ · 16 · 49 · 0 · 01 Sep 2021

Quantized Convolutional Neural Networks Through the Lens of Partial Differential Equations
Ido Ben-Yair, Gil Ben Shalom, Moshe Eliasof, Eran Treister
MQ · 16 · 5 · 0 · 31 Aug 2021