Constructing a Word Similarity Graph from Vector based Word Representation for Named Entity Recognition

International Conference on Web Information Systems and Technologies (WEBIST), 2018

9 July 2018

Abstract

In this paper, we discuss a method for identifying the seed word that best represents a class of named entities in a graph representation of words and their similarities. Word networks, or word graphs, are representations of vectorized text in which the nodes are the words encountered in a corpus and the weighted edges between nodes represent how similar the corresponding words are to each other. We intend to build a bilingual word graph and, through community analysis, identify the seed words best suited to segmenting the graph according to its named entities, thereby providing an unsupervised way of tagging named entities for a bilingual language base.
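
As a rough illustration of the word-graph idea described in the abstract, the following minimal sketch builds a weighted similarity graph from word vectors and picks one candidate seed word per group. Everything specific here is an assumption rather than the paper's method: the toy vectors, the cosine similarity measure, the 0.8 edge threshold, the use of connected components as a simple stand-in for community analysis, and the choice of the node with the highest weighted degree as the seed.

```python
import math
from itertools import combinations

# Hypothetical toy embeddings; a real system would load vectors trained on a corpus.
VECTORS = {
    "manila": [0.90, 0.10, 0.00],
    "cebu":   [0.80, 0.20, 0.10],
    "davao":  [0.85, 0.15, 0.05],
    "maria":  [0.10, 0.90, 0.00],
    "juan":   [0.20, 0.80, 0.10],
    "pedro":  [0.15, 0.85, 0.05],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_graph(vectors, threshold=0.8):
    """Nodes are words; an edge weighted by similarity joins sufficiently similar pairs."""
    graph = {word: {} for word in vectors}
    for a, b in combinations(vectors, 2):
        sim = cosine(vectors[a], vectors[b])
        if sim >= threshold:  # assumed cutoff, chosen only for this toy data
            graph[a][b] = sim
            graph[b][a] = sim
    return graph


def communities(graph):
    """Connected components, used here as a simple stand-in for community analysis."""
    seen, groups = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(n for n in graph[node] if n not in seen)
        groups.append(group)
    return groups


def seed_word(graph, group):
    """Assumed heuristic: the word with the largest total edge weight in its group."""
    return max(sorted(group), key=lambda w: sum(graph[w].values()))


if __name__ == "__main__":
    graph = build_graph(VECTORS)
    for group in communities(graph):
        print(sorted(group), "seed:", seed_word(graph, group))
```

On real data, a dedicated community detection algorithm would likely replace the connected-components step, since a dense similarity graph rarely falls apart into clean components at any single threshold.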
