Hierarchical Softmax for End-to-End Low-resource Multilingual Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Abstract

Low-resource speech recognition has long suffered from insufficient training data. While neighbouring languages are often used as auxiliary training data, it is difficult for a model to align similar units (characters, subwords, etc.) across languages. In this paper, we assume that similar units in neighbouring languages share similar term frequencies, and we build a Huffman tree over these frequencies to perform multilingual hierarchical Softmax decoding. During decoding, this shared hierarchical structure benefits the training of low-resource languages. Experimental results show the effectiveness of our method.
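A minimal sketch of the core idea in the abstract: build a Huffman tree from unit frequencies, then score a unit as a product of binary decisions along its root-to-leaf path. All names, the toy frequencies, and the per-node scoring are illustrative assumptions, not the paper's implementation:

```python
import heapq
import itertools
import math

def build_huffman(freqs):
    """Build a Huffman tree from {unit: frequency}; return {unit: bit path}.

    Frequent units end up near the root, so they get short decision paths.
    """
    counter = itertools.count()  # tie-breaker so heapq never compares subtrees
    heap = [(f, next(counter), unit) for unit, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(counter), (left, right)))
    codes = {}
    def walk(node, path):
        if isinstance(node, tuple):          # internal node: recurse both ways
            walk(node[0], path + [0])
            walk(node[1], path + [1])
        else:                                # leaf: record its bit path
            codes[node] = path
    walk(heap[0][2], [])
    return codes

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def hsm_prob(codes, node_scores, unit):
    """P(unit) as a product of sigmoid decisions over the unit's Huffman path.

    `node_scores` maps an internal node (identified by its bit prefix) to a
    scalar logit; in a real model this would come from the acoustic encoder.
    """
    p, prefix = 1.0, ()
    for bit in codes[unit]:
        s = sigmoid(node_scores.get(prefix, 0.0))
        p *= s if bit == 1 else (1.0 - s)
        prefix = prefix + (bit,)
    return p
```

Because every internal node splits its probability mass between two children, the leaf probabilities sum to one for any choice of node scores; only the path lengths (and hence the cost of frequent units) depend on the Huffman construction.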
