ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2309.14592
  4. Cited By
Efficient Post-training Quantization with FP8 Formats

Efficient Post-training Quantization with FP8 Formats

26 September 2023
Haihao Shen
Naveen Mellempudi
Xin He
Q. Gao
Chang‐Bao Wang
Mengni Wang
    MQ
ArXivPDFHTML

Papers citing "Efficient Post-training Quantization with FP8 Formats"

6 / 6 papers shown
Title
Precision Where It Matters: A Novel Spike Aware Mixed-Precision Quantization Strategy for LLaMA-based Language Models
Precision Where It Matters: A Novel Spike Aware Mixed-Precision Quantization Strategy for LLaMA-based Language Models
Lucas Maisonnave
Cyril Moineau
Olivier Bichler
Fabrice Rastello
MQ
69
1
0
30 Apr 2025
On the Impact of White-box Deployment Strategies for Edge AI on Latency and Model Performance
On the Impact of White-box Deployment Strategies for Edge AI on Latency and Model Performance
Jaskirat Singh
Bram Adams
Ahmed E. Hassan
VLM
29
0
0
01 Nov 2024
Learning from Students: Applying t-Distributions to Explore Accurate and
  Efficient Formats for LLMs
Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs
Jordan Dotzel
Yuzong Chen
Bahaa Kotb
Sushma Prasad
Gang Wu
Sheng R. Li
Mohamed S. Abdelfattah
Zhiru Zhang
21
7
0
06 May 2024
FP8 Formats for Deep Learning
FP8 Formats for Deep Learning
Paulius Micikevicius
Dusan Stosic
N. Burgess
Marius Cornea
Pradeep Dubey
...
Naveen Mellempudi
S. Oberman
M. Shoeybi
Michael Siu
Hao Wu
BDL
VLM
MQ
67
119
0
12 Sep 2022
I-BERT: Integer-only BERT Quantization
I-BERT: Integer-only BERT Quantization
Sehoon Kim
A. Gholami
Z. Yao
Michael W. Mahoney
Kurt Keutzer
MQ
86
332
0
05 Jan 2021
Incremental Network Quantization: Towards Lossless CNNs with
  Low-Precision Weights
Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights
Aojun Zhou
Anbang Yao
Yiwen Guo
Lin Xu
Yurong Chen
MQ
300
1,046
0
10 Feb 2017
1