Hierarchical Softmax for End-to-End Low-resource Multilingual Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Abstract

Low-resource speech recognition has long suffered from insufficient training data. While neighbouring languages are often used as auxiliary training data, it is difficult for a model to align similar units (characters, subwords, etc.) across languages. In this paper, we assume that similar units in neighbouring languages share similar term frequencies, and we build a Huffman tree over these frequencies to perform multilingual hierarchical Softmax decoding. During decoding, this shared hierarchical structure benefits the training of low-resource languages. Experimental results show the effectiveness of our method.
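A minimal sketch of the core idea in the abstract: build a Huffman tree from unit frequencies, then score a unit as a product of binary decisions along its root-to-leaf path. All names, the toy frequencies, and the per-node scoring are illustrative assumptions, not the paper's implementation:

```python
import heapq
import itertools
import math

def build_huffman(freqs):
    """Build a Huffman tree from {unit: frequency}; return {unit: bit path}.

    Frequent units end up near the root, so they get short decision paths.
    """
    counter = itertools.count()  # tie-breaker so heapq never compares subtrees
    heap = [(f, next(counter), unit) for unit, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(counter), (left, right)))
    codes = {}
    def walk(node, path):
        if isinstance(node, tuple):          # internal node: recurse both ways
            walk(node[0], path + [0])
            walk(node[1], path + [1])
        else:                                # leaf: record its bit path
            codes[node] = path
    walk(heap[0][2], [])
    return codes

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def hsm_prob(codes, node_scores, unit):
    """P(unit) as a product of sigmoid decisions over the unit's Huffman path.

    `node_scores` maps an internal node (identified by its bit prefix) to a
    scalar logit; in a real model this would come from the acoustic encoder.
    """
    p, prefix = 1.0, ()
    for bit in codes[unit]:
        s = sigmoid(node_scores.get(prefix, 0.0))
        p *= s if bit == 1 else (1.0 - s)
        prefix = prefix + (bit,)
    return p
```

Because every internal node splits its probability mass between two children, the leaf probabilities sum to one for any choice of node scores; only the path lengths (and hence the cost of frequent units) depend on the Huffman construction.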
