Global Entity Disambiguation with Pretrained Contextualized Embeddings
of Words and Entities
North American Chapter of the Association for Computational Linguistics (NAACL), 2022
Abstract
We propose a new global entity disambiguation (ED) model based on contextualized embeddings of words and entities. Our model is based on BERT and trained with a new task that predicts randomly masked entities in an entity-annotated corpus obtained from Wikipedia, which enables the model to capture both local contextual information based on words and global contextual information based on entities. The model solves ED as a sequential decision task and effectively uses both types of contextual information. We achieve new state-of-the-art results on five standard ED datasets: AIDA-CoNLL, MSNBC, AQUAINT, ACE2004, and WNED-WIKI. Our source code and trained model checkpoint are available at https://github.com/studio-ousia/luke.
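The sequential decision formulation means mentions are resolved one at a time, and each new decision can condition on the entities already committed, giving the model global document-level context. Below is a minimal sketch of such a greedy loop; it is not the authors' code, and all names (`disambiguate`, `toy_score`, the `Span` alias) are hypothetical. The toy scorer stands in for the paper's BERT-based model, which scores candidates from both the document words and the previously resolved entities.

```python
from typing import Callable, Dict, List, Tuple

Span = Tuple[int, int]  # (start, end) token offsets of a mention

def disambiguate(
    words: List[str],
    mentions: Dict[Span, List[str]],  # mention span -> candidate entity titles
    score: Callable[[List[str], Span, List[str], Dict[Span, str]], List[float]],
) -> Dict[Span, str]:
    """Greedy sequential decision loop: at each step, score every unresolved
    mention conditioned on the words and the entities resolved so far, then
    commit the single most confident prediction."""
    resolved: Dict[Span, str] = {}
    unresolved = dict(mentions)
    while unresolved:
        best_conf, best_span, best_ent = float("-inf"), None, None
        for span, cands in unresolved.items():
            for conf, ent in zip(score(words, span, cands, resolved), cands):
                if conf > best_conf:
                    best_conf, best_span, best_ent = conf, span, ent
        # Committing the most confident decision first lets harder mentions
        # benefit from it as global context in later iterations.
        resolved[best_span] = best_ent
        del unresolved[best_span]
    return resolved

def toy_score(words, span, cands, resolved):
    """Toy stand-in for the BERT-based scorer: rewards candidates that share
    a token with an already-resolved entity, plus a weak candidate prior."""
    context = " ".join(resolved.values()).lower().replace("_", " ").split()
    return [
        sum(tok in context for tok in ent.lower().split("_")) + 1.0 / (i + 1)
        for i, ent in enumerate(cands)
    ]

doc = "The Liverpool striker scored against Manchester United".split()
mentions = {
    (1, 2): ["Liverpool_F.C.", "Liverpool"],
    (5, 7): ["Manchester_United_F.C.", "Manchester"],
}
print(disambiguate(doc, mentions, toy_score))
```

In this toy run, resolving the first mention to `Liverpool_F.C.` shifts the second mention toward `Manchester_United_F.C.`, illustrating how entity-based global context disambiguates mentions that local word context alone might leave ambiguous.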
