
Title |
|---|
Root Mean Square Layer Normalization. Neural Information Processing Systems (NeurIPS), 2019 |
HellaSwag: Can a Machine Really Finish Your Sentence? Annual Meeting of the Association for Computational Linguistics (ACL), 2019 |
Mixed Precision Training. Paulius Micikevicius, Sharan Narang, Jonah Alben, G. Diamos, Erich Elsen, ... Boris Ginsburg, Michael Houston, Oleksii Kuchaiev, Ganesh Venkatesh, Hao Wu |
Adam: A Method for Stochastic Optimization. International Conference on Learning Representations (ICLR), 2014. Diederik P. Kingma, Jimmy Ba |