HLAT: High-quality Large Language Model Pre-trained on AWS Trainium


16 April 2024
Haozheng Fan, Hao Zhou, Guangtai Huang, Parameswaran Raman, Xinwei Fu, Gaurav Gupta, Dhananjay Ram, Yida Wang, Jun Huan

Papers citing "HLAT: High-quality Large Language Model Pre-trained on AWS Trainium"

3 papers shown
Modes of Sequence Models and Learning Coefficients
Zhongtian Chen, Daniel Murfet
25 Apr 2025
Nonuniform-Tensor-Parallelism: Mitigating GPU failure impact for Scaled-up LLM Training
Daiyaan Arfeen, Dheevatsa Mudigere, Ankit More, Bhargava Gopireddy, Ahmet Inci, G. R. Ganger
08 Apr 2025
Stochastic Rounding for LLM Training: Theory and Practice
Kaan Ozkara, Tao Yu, Youngsuk Park
27 Feb 2025