2009.07453
Cited By
Extremely Low Bit Transformer Quantization for On-Device Neural Machine Translation
16 September 2020
Insoo Chung
Byeongwook Kim
Yoonjung Choi
S. Kwon
Yongkweon Jeon
Baeseong Park
Sangha Kim
Dongsoo Lee
MQ
Papers citing
"Extremely Low Bit Transformer Quantization for On-Device Neural Machine Translation"
3 / 3 papers shown
Title
AlphaTuning: Quantization-Aware Parameter-Efficient Adaptation of Large-Scale Pre-Trained Language Models
S. Kwon
Jeonghoon Kim
Jeongin Bae
Kang Min Yoo
Jin-Hwa Kim
Baeseong Park
Byeongwook Kim
Jung-Woo Ha
Nako Sung
Dongsoo Lee
MQ
23
30
0
08 Oct 2022
Bag of Tricks for Optimizing Transformer Efficiency
Ye Lin
Yanyang Li
Tong Xiao
Jingbo Zhu
21
6
0
09 Sep 2021
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen
Zhen Dong
Jiayu Ye
Linjian Ma
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
227
575
0
12 Sep 2019