Mixed-Precision Training for NLP and Speech Recognition with OpenSeq2Seq
Oleksii Kuchaiev
Boris Ginsburg
Igor Gitman
Vitaly Lavrukhin
Jason Chun Lok Li
Huyen Nguyen
Carl Case
Paulius Micikevicius

Abstract
We present OpenSeq2Seq, a TensorFlow-based toolkit for training sequence-to-sequence models that features distributed and mixed-precision training. Benchmarks on machine translation and speech recognition tasks show that models built with OpenSeq2Seq achieve state-of-the-art performance while requiring 1.5-3x less training time. OpenSeq2Seq currently provides building blocks for models covering a wide range of tasks, including neural machine translation, automatic speech recognition, and speech synthesis.
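For readers unfamiliar with the technique, the following is a minimal sketch of mixed-precision training with loss scaling using the generic TensorFlow Keras API. The model, layer sizes, and optimizer are illustrative assumptions only and do not reflect OpenSeq2Seq's own configuration system.

```python
import tensorflow as tf

# Run layer computations in float16 while keeping variables in float32.
tf.keras.mixed_precision.set_global_policy("mixed_float16")

# Toy classifier standing in for a real seq2seq model (assumed shapes).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(512, activation="relu"),
    # Keep the final softmax in float32 for numerical stability.
    tf.keras.layers.Dense(10, activation="softmax", dtype="float32"),
])

# Dynamic loss scaling keeps small float16 gradients from underflowing.
optimizer = tf.keras.mixed_precision.LossScaleOptimizer(
    tf.keras.optimizers.Adam())

model.compile(optimizer=optimizer,
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

The two ingredients shown here, float16 arithmetic for throughput and loss scaling to preserve small gradient magnitudes, are the core of any mixed-precision training scheme.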