arXiv: 2206.15014
Compressing Pre-trained Transformers via Low-Bit NxM Sparsity for Natural Language Understanding
30 June 2022
Connor Holmes, Minjia Zhang, Yuxiong He, Bo Wu
Papers citing "Compressing Pre-trained Transformers via Low-Bit NxM Sparsity for Natural Language Understanding" (6 / 6 papers shown):
Speeding Up Question Answering Task of Language Models via Inverted Index. Xiang Ji, Yeşim Sungu-Eryilmaz, Elaheh Momeni, Reza Rawassizadeh. 24 Oct 2022.
I-BERT: Integer-only BERT Quantization. Sehoon Kim, A. Gholami, Z. Yao, Michael W. Mahoney, Kurt Keutzer. 05 Jan 2021.
BinaryBERT: Pushing the Limit of BERT Quantization. Haoli Bai, Wei Zhang, Lu Hou, Lifeng Shang, Jing Jin, Xin Jiang, Qun Liu, Michael Lyu, Irwin King. 31 Dec 2020.
The Lottery Ticket Hypothesis for Pre-trained BERT Networks. Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Zhangyang Wang, Michael Carbin. 23 Jul 2020.
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT. Sheng Shen, Zhen Dong, Jiayu Ye, Linjian Ma, Z. Yao, A. Gholami, Michael W. Mahoney, Kurt Keutzer. 12 Sep 2019.
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding. Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman. 20 Apr 2018.