arXiv:2309.17224
Training and inference of large language models using 8-bit floating point
29 September 2023
Sergio P. Perez
Yan Zhang
James Briggs
Charlie Blake
P. Krishnamurthy
Paul Balanca
Carlo Luschi
Stephen Barlow
Andrew William Fitzgibbon
MQ
Papers citing "Training and inference of large language models using 8-bit floating point"
13 / 13 papers shown
Title
Pychop: Emulating Low-Precision Arithmetic in Numerical Methods and Neural Networks
Erin Carson
Xinye Chen
49
0
0
10 Apr 2025
Accurate INT8 Training Through Dynamic Block-Level Fallback
Pengle Zhang
Jia Wei
Jintao Zhang
Jun-Jie Zhu
Jianfei Chen
MQ
74
3
0
13 Mar 2025
Ascend HiFloat8 Format for Deep Learning
Yuanyong Luo
Zhongxing Zhang
Richard Wu
Hu Liu
Ying Jin
...
Korviakov Vladimir
Bobrin Maxim
Yuhao Hu
Guanfu Chen
Zeyi Huang
MQ
25
1
0
25 Sep 2024
S-STE: Continuous Pruning Function for Efficient 2:4 Sparse Pre-training
Yuezhou Hu
Jun-Jie Zhu
Jianfei Chen
36
0
0
13 Sep 2024
u-μP: The Unit-Scaled Maximal Update Parametrization
Charlie Blake
C. Eichenberg
Josef Dean
Lukas Balles
Luke Y. Prince
Bjorn Deiseroth
Andres Felipe Cruz Salinas
Carlo Luschi
Samuel Weinbach
Douglas Orr
53
9
0
24 Jul 2024
LoQT: Low Rank Adapters for Quantized Training
Sebastian Loeschcke
M. Toftrup
M. Kastoryano
Serge J. Belongie
Vésteinn Snæbjarnarson
MQ
34
3
0
26 May 2024
OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning
Dan Qiao
Yi Su
Pinzheng Wang
Jing Ye
Wen Xie
...
Wenliang Chen
Guohong Fu
Guodong Zhou
Qiaoming Zhu
Min Zhang
MQ
35
0
0
09 May 2024
Collage: Light-Weight Low-Precision Strategy for LLM Training
Tao Yu
Gaurav Gupta
Karthick Gopalswamy
Amith R. Mamidala
Hao Zhou
Jeffrey Huynh
Youngsuk Park
Ron Diamant
Anoop Deoras
Jun Huan
MQ
49
3
0
06 May 2024
Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization
Haocheng Xi
Yuxiang Chen
Kang Zhao
Kaijun Zheng
Jianfei Chen
Jun Zhu
MQ
37
20
0
19 Mar 2024
Shedding the Bits: Pushing the Boundaries of Quantization with Minifloats on FPGAs
Shivam Aggarwal
Hans Jakob Damsgaard
Alessandro Pappalardo
Giuseppe Franco
Thomas B. Preußer
Michaela Blott
Tulika Mitra
MQ
19
5
0
21 Nov 2023
How to Capture Higher-order Correlations? Generalizing Matrix Softmax Attention to Kronecker Computation
Josh Alman
Zhao-quan Song
24
31
0
06 Oct 2023
FP8 Formats for Deep Learning
Paulius Micikevicius
Dusan Stosic
N. Burgess
Marius Cornea
Pradeep Dubey
...
Naveen Mellempudi
S. Oberman
M. Shoeybi
Michael Siu
Hao Wu
BDL
VLM
MQ
67
121
0
12 Sep 2022
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,950
0
20 Apr 2018