
Neural Networks Based EEG-Speech Models

Abstract

In this paper, we describe three neural network (NN) based EEG-Speech (NES) models that map unspoken EEG signals to the corresponding phonemes. Instead of using conventional feature extraction techniques, the proposed NES models rely on graphic learning to project both EEG and speech signals into deep representation feature spaces. This NN based linear projection facilitates multimodal data fusion (i.e., of EEG and acoustic signals) and makes it convenient to construct the mapping between unspoken EEG signals and phonemes. Specifically, two of the three NES models are augmented models (IANES-B and IANES-G) that incorporate spoken EEG signals as either bias or gate information to strengthen the feature learning and translation of unspoken EEG signals. A combined unsupervised and supervised training procedure is implemented stepwise to learn the mapping for all three NES models. To enhance computational performance, a three-way factored NN training technique is applied to the IANES-G model. Unlike many existing methods, our augmented NES models incorporate spoken-EEG signals, which can efficiently suppress artifacts in unspoken-EEG signals. Experimental results reveal that all three proposed NES models outperform the baseline SVM method, with IANES-G demonstrating the best performance on the speech recovery and classification tasks.
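To make the gate-style augmentation concrete, the sketch below illustrates one plausible reading of the idea: spoken-EEG features multiplicatively gate unspoken-EEG features before phoneme classification. This is a minimal PyTorch-style illustration under assumed layer sizes, module names, and fusion form; it is not the authors' implementation of IANES-G.

```python
# Hypothetical sketch of gating unspoken-EEG features with spoken-EEG features,
# loosely inspired by the IANES-G description above. All dimensions, names, and
# the classification head are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn

class GatedEEGPhonemeModel(nn.Module):
    def __init__(self, eeg_dim=64, hidden_dim=128, num_phonemes=39):
        super().__init__()
        # Separate encoders project unspoken- and spoken-EEG inputs into a
        # shared deep feature space.
        self.unspoken_enc = nn.Sequential(nn.Linear(eeg_dim, hidden_dim), nn.Tanh())
        self.spoken_enc = nn.Sequential(nn.Linear(eeg_dim, hidden_dim), nn.Tanh())
        # Spoken-EEG features produce a multiplicative gate in [0, 1].
        self.gate = nn.Sequential(nn.Linear(hidden_dim, hidden_dim), nn.Sigmoid())
        self.classifier = nn.Linear(hidden_dim, num_phonemes)

    def forward(self, unspoken_eeg, spoken_eeg):
        h_u = self.unspoken_enc(unspoken_eeg)
        h_s = self.spoken_enc(spoken_eeg)
        gated = h_u * self.gate(h_s)   # gate can suppress artifact-prone dimensions
        return self.classifier(gated)  # phoneme logits

# Toy usage: a batch of 8 feature vectors per modality.
model = GatedEEGPhonemeModel()
logits = model(torch.randn(8, 64), torch.randn(8, 64))
print(logits.shape)  # torch.Size([8, 39])
```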
