To FP8 and Back Again: Quantifying Reduced Precision Effects on LLM Training Stability

arXiv:2405.18710 · 29 May 2024 · MQ
Joonhyung Lee, Jeongin Bae, Byeongwook Kim, S. Kwon, Dongsoo Lee

Papers citing "To FP8 and Back Again: Quantifying Reduced Precision Effects on LLM Training Stability"

2 of 2 papers shown
Stable-SPAM: How to Train in 4-Bit More Stably than 16-Bit Adam
Tianjin Huang, Haotian Hu, Zhenyu (Allen) Zhang, Gaojie Jin, X. Li, ..., Tianlong Chen, Lu Liu, Qingsong Wen, Zhangyang Wang, Shiwei Liu
MQ · 24 Feb 2025

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang
ODL · 15 Sep 2016