Inherently Interpretable Sparse Word Embeddings through Sparse Coding

Abstract

Word embeddings are a powerful natural language processing technique, but they are extremely difficult to interpret. To create more interpretable word embeddings, we transform pretrained dense word embeddings into sparse embeddings. These new embeddings are inherently interpretable: each of their dimensions is created from, and represents, a natural language word or a specific syntactic concept. We construct these embeddings through sparse coding, where each vector in the basis set is itself a word embedding. We show that models trained using these sparse embeddings can achieve good performance and are highly interpretable.
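To make the setup concrete, here is a minimal sketch of sparse coding over embeddings, assuming a scikit-learn pipeline with LASSO-based sparse coding; the paper does not specify an implementation, and the array names, dimensions, and regularization value below are illustrative assumptions. The dictionary atoms are themselves word embeddings, so each nonzero coordinate of a resulting sparse code corresponds to a specific natural-language word.

```python
import numpy as np
from sklearn.decomposition import SparseCoder

# Hypothetical data: dense pretrained embeddings (one row per word)
# and a basis set whose atoms are embeddings of chosen basis words.
rng = np.random.default_rng(0)
embedding_dim = 50
num_words = 1000       # words to re-encode
num_basis_words = 200  # basis words, one interpretable dimension each

dense_embeddings = rng.standard_normal((num_words, embedding_dim))
basis_embeddings = rng.standard_normal((num_basis_words, embedding_dim))

# SparseCoder expects dictionary atoms with unit L2 norm.
dictionary = basis_embeddings / np.linalg.norm(
    basis_embeddings, axis=1, keepdims=True
)

# Solve an L1-regularized least-squares problem: approximate each
# dense embedding as a sparse combination of basis-word embeddings.
coder = SparseCoder(
    dictionary=dictionary,
    transform_algorithm="lasso_lars",
    transform_alpha=0.1,  # sparsity strength; an assumed value
)
sparse_embeddings = coder.transform(dense_embeddings)

print(sparse_embeddings.shape)  # (1000, 200)
print("avg nonzeros per word:", (sparse_embeddings != 0).sum(1).mean())
```

Each of the 200 output dimensions is tied to one basis word, so a word's sparse code can be read directly as a weighted list of related words.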
