BERT for Long Documents: A Case Study of Automated ICD Coding
Arash Afkanpour
Shabir Adeel
H. Bassani
Arkady Epshteyn
Hongbo Fan
Isaac Jones
Mahan Malihi
Adrian Nauth
Raj Sinha
Sanjana Woonna
S. Zamani
Elli Kanal
M. Fomitchev
Donny Cheung

Abstract
Transformer models have achieved great success across many NLP tasks. However, previous studies of automated ICD coding concluded that these models fail to outperform some earlier solutions, such as CNN-based models. In this paper, we challenge this conclusion. We present a simple and scalable method for processing long text with existing transformer models such as BERT. We show that this method significantly improves on the results previously reported for transformer models in ICD coding and outperforms one of the prominent CNN-based methods.
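
The abstract does not spell out the method, so the following is only an illustrative sketch of one common way to feed long text to a model with a fixed input length such as BERT: split the token sequence into overlapping windows, score each window separately, and pool the per-label scores. The function names, the 512-token window, the 256-token stride, and the max-pooling step are all assumptions made for illustration and are not taken from the paper.

```python
from typing import List


def split_into_chunks(
    token_ids: List[int], max_len: int = 512, stride: int = 256
) -> List[List[int]]:
    """Split a token sequence into overlapping windows of at most max_len tokens.

    max_len and stride are illustrative defaults (assumptions), not values
    reported by the paper.
    """
    if max_len <= 0 or stride <= 0:
        raise ValueError("max_len and stride must be positive")
    if not token_ids:
        return []
    chunks = []
    start = 0
    while True:
        chunks.append(token_ids[start:start + max_len])
        # Stop once the current window reaches the end of the document.
        if start + max_len >= len(token_ids):
            break
        start += stride
    return chunks


def max_pool_scores(chunk_scores: List[List[float]]) -> List[float]:
    """Combine per-chunk label scores by taking the maximum for each label.

    Max-pooling is one possible aggregation choice (an assumption here);
    each inner list holds one score per label for a single chunk.
    """
    return [max(label_scores) for label_scores in zip(*chunk_scores)]
```

In a setup like this, each chunk would be encoded independently by the transformer, so the cost grows linearly with document length rather than quadratically.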