FP8 versus INT8 for efficient deep learning inference
arXiv:2303.17951, 31 March 2023
M. V. Baalen, Andrey Kuzmin, Suparna S. Nair, Yuwei Ren, E. Mahurin, Chirag I. Patel, Sundar Subramanian, Sanghyuk Lee, Markus Nagel, Joseph B. Soriaga, Tijmen Blankevoort
Tags: MQ
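Since this index carries no abstract, a quick illustration of the contrast in the title may help: INT8 places a uniform grid over a tensor's full range, while an FP8 format such as E4M3 keeps only about 4 significant bits but covers a wide dynamic range, so a few outliers hurt it far less. The following is a minimal numpy sketch under stated assumptions (the helper names and the simplified E4M3 model: 4-bit significand, clamping at the format's ±448 maximum, subnormal underflow ignored); it is not the paper's implementation.

```python
import numpy as np

def quantize_int8(x: np.ndarray) -> np.ndarray:
    """Fake-quantize with a symmetric per-tensor INT8 grid (max|x| maps to 127)."""
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127)
    return (q * scale).astype(x.dtype)

def quantize_fp8_e4m3(x: np.ndarray) -> np.ndarray:
    """Crude E4M3 model: round to a 4-bit significand per binade, clamp at +/-448.
    Subnormals and exponent underflow are ignored; illustration only."""
    xc = np.clip(x, -448.0, 448.0)
    mant, exp = np.frexp(xc)             # xc = mant * 2**exp with 0.5 <= |mant| < 1
    mant = np.round(mant * 16.0) / 16.0  # keep 4 significant bits
    return np.ldexp(mant, exp).astype(x.dtype)

rng = np.random.default_rng(0)
x = rng.standard_normal(100_000).astype(np.float32)
x_outliers = x.copy()
x_outliers[:10] *= 20.0                  # a handful of outliers stretch the INT8 scale

for name, t in [("gaussian", x), ("gaussian+outliers", x_outliers)]:
    mse_int8 = np.mean((t - quantize_int8(t)) ** 2)
    mse_fp8 = np.mean((t - quantize_fp8_e4m3(t)) ** 2)
    print(f"{name:18s}  INT8 MSE {mse_int8:.2e}   FP8 MSE {mse_fp8:.2e}")
```

On the plain Gaussian input the uniform INT8 grid is competitive; with outliers its scale grows roughly 20x and its error inflates accordingly, while the floating-point grid barely moves. This shows only the representation-error side of the comparison; the accuracy-versus-hardware-cost analysis in the paper itself is a separate question.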
Papers citing "FP8 versus INT8 for efficient deep learning inference" (27 of 27 shown):
1. Improving Quantization with Post-Training Model Expansion. Giuseppe Franco, Pablo Monteagudo-Lago, Ian Colbert, Nicholas J. Fraser, Michaela Blott. [MQ] 21 Mar 2025.
2. FP4DiT: Towards Effective Floating Point Quantization for Diffusion Transformers. Ruichen Chen, Keith G. Mills, Di Niu. [MQ] 19 Mar 2025.
3. GSQ-Tuning: Group-Shared Exponents Integer in Fully Quantized Training for LLMs On-Device Fine-tuning. Sifan Zhou, Shuo Wang, Zhihang Yuan, Mingjia Shi, Yuzhang Shang, Dawei Yang. [ALM, MQ] 18 Feb 2025.
4. INT-FlashAttention: Enabling Flash Attention for INT8 Quantization. Shimao Chen, Zirui Liu, Zhiying Wu, Ce Zheng, Peizhuang Cong, Zihan Jiang, Yuhan Wu, Lei Su, Tong Yang. [MQ, VLM] 25 Sep 2024.
5. Small Language Models: Survey, Measurements, and Insights. Zhenyan Lu, Xiang Li, Dongqi Cai, Rongjie Yi, Fangming Liu, Xiwen Zhang, Nicholas D. Lane, Mengwei Xu. [ObjD, LRM] 24 Sep 2024.
6. Adaptive Resolution Inference (ARI): Energy-Efficient Machine Learning for Internet of Things. Ziheng Wang, Pedro Reviriego, Farzad Niknia, Javier Conde, Shanshan Liu, Fabrizio Lombardi. [MQ] 26 Aug 2024.
7. Scalify: scale propagation for efficient low-precision LLM training. Paul Balança, Sam Hosegood, Carlo Luschi, Andrew Fitzgibbon. 24 Jul 2024.
8. Procrastination Is All You Need: Exponent Indexed Accumulators for Floating Point, Posits and Logarithmic Numbers. Vincenzo Liguori. 09 Jun 2024.
9. Effective Interplay between Sparsity and Quantization: From Theory to Practice. Simla Burcu Harma, Ayan Chakraborty, Elizaveta Kostenok, Danila Mishin, Dongho Ha, ..., Martin Jaggi, Ming Liu, Yunho Oh, Suvinay Subramanian, Amir Yazdanbakhsh. [MQ] 31 May 2024.
10. PlanNetX: Learning an Efficient Neural Network Planner from MPC for Longitudinal Control. Jasper Hoffmann, Diego Fernandez Clausen, Julien Brosseit, Julian Bernhard, Klemens Esterle, M. Werling, Michael Karg, Joschka Boedecker. 29 Apr 2024.
11. Mitigating the Impact of Outlier Channels for Language Model Quantization with Activation Regularization. Aniruddha Nrusimha, Mayank Mishra, Naigang Wang, Dan Alistarh, Rameswar Panda, Yoon Kim. [MQ] 04 Apr 2024.
12. Towards Cheaper Inference in Deep Networks with Lower Bit-Width Accumulators. Yaniv Blumenfeld, Itay Hubara, Daniel Soudry. 25 Jan 2024.
13. ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks. Xiaoxia Wu, Haojun Xia, Stephen Youn, Zhen Zheng, Shiyang Chen, ..., Reza Yazdani Aminabadi, Yuxiong He, Olatunji Ruwase, Leon Song, Zhewei Yao. 14 Dec 2023.
14. Look-Up mAI GeMM: Increasing AI GeMMs Performance by Nearly 2.5x via msGeMM. Saeed Maleki. [VLM] 09 Oct 2023.
15. Dual Grained Quantization: Efficient Fine-Grained Quantization for LLM. Luoming Zhang, Wen Fei, Weijia Wu, Yefei He, Zhenyu Lou, Hong Zhou. [MQ] 07 Oct 2023.
16. Hadamard Domain Training with Integers for Class Incremental Quantized Learning. Martin Schiemer, Clemens J. S. Schaefer, Jayden Parker Vap, Mark Horeni, Yu Emma Wang, Juan Ye, Siddharth Joshi. 05 Oct 2023.
17. Training and inference of large language models using 8-bit floating point. Sergio P. Perez, Yan Zhang, James Briggs, Charlie Blake, P. Krishnamurthy, Paul Balanca, Carlo Luschi, Stephen Barlow, Andrew William Fitzgibbon. [MQ] 29 Sep 2023.
18. ZeroQuant-FP: A Leap Forward in LLMs Post-Training W4A8 Quantization Using Floating-Point Formats. Xiaoxia Wu, Z. Yao, Yuxiong He. [MQ] 19 Jul 2023.
19. A Survey of Techniques for Optimizing Transformer Inference. Krishna Teja Chitty-Venkata, Sparsh Mittal, M. Emani, V. Vishwanath, Arun Somani. 16 Jul 2023.
20. Quantizable Transformers: Removing Outliers by Helping Attention Heads Do Nothing. Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort. [MQ] 22 Jun 2023.
21. Accuracy Booster: Enabling 4-bit Fixed-point Arithmetic for DNN Training. Simla Burcu Harma, Canberk Sonmez, Nicholas Sperry, Babak Falsafi, Martin Jaggi, Yunho Oh. [MQ] 19 Nov 2022.
22. FP8 Formats for Deep Learning. Paulius Micikevicius, Dusan Stosic, N. Burgess, Marius Cornea, Pradeep Dubey, ..., Naveen Mellempudi, S. Oberman, M. Shoeybi, Michael Siu, Hao Wu. [BDL, VLM, MQ] 12 Sep 2022.
23. Overcoming Oscillations in Quantization-Aware Training. Markus Nagel, Marios Fournarakis, Yelysei Bondarenko, Tijmen Blankevoort. [MQ] 21 Mar 2022.
24. Training High-Performance and Large-Scale Deep Neural Networks with Full 8-bit Integers. Yukuan Yang, Shuang Wu, Lei Deng, Tianyi Yan, Yuan Xie, Guoqi Li. [MQ] 05 Sep 2019.
25. Deep High-Resolution Representation Learning for Visual Recognition. Jingdong Wang, Ke Sun, Tianheng Cheng, Borui Jiang, Chaorui Deng, ..., Yadong Mu, Mingkui Tan, Xinggang Wang, Wenyu Liu, Bin Xiao. 20 Aug 2019.
26. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding. Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman. [ELM] 20 Apr 2018.
27. ImageNet Large Scale Visual Recognition Challenge. Olga Russakovsky, Jia Deng, Hao Su, J. Krause, S. Satheesh, ..., A. Karpathy, A. Khosla, Michael S. Bernstein, Alexander C. Berg, Li Fei-Fei. [VLM, ObjD] 01 Sep 2014.