2208.07339
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
15 August 2022
Tim Dettmers
M. Lewis
Younes Belkada
Luke Zettlemoyer
MQ
Papers citing
"LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale"
4 / 104 papers shown
Title
All Bark and No Bite: Rogue Dimensions in Transformer Language Models Obscure Representational Quality
William Timkey
Marten van Schijndel
213
110
0
09 Sep 2021
FBGEMM: Enabling High-Performance Low-Precision Deep Learning Inference
D. Khudia
Jianyu Huang
Protonu Basu
Summer Deng
Haixin Liu
Jongsoo Park
M. Smelyanskiy
FedML
MQ
35
44
0
13 Jan 2021
BinaryBERT: Pushing the Limit of BERT Quantization
Haoli Bai
Wei Zhang
Lu Hou
Lifeng Shang
Jing Jin
Xin Jiang
Qun Liu
Michael Lyu
Irwin King
MQ
138
221
0
31 Dec 2020
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen
Zhen Dong
Jiayu Ye
Linjian Ma
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
225
574
0
12 Sep 2019