Cohort of LSTM and lexicon verification for handwriting recognition with gigantic lexicon
State-of-the-art handwriting recognition methods are based on Long Short-Term Memory (LSTM) recurrent neural networks (RNNs) coupled with linguistic knowledge. LSTM RNNs offer high raw performance and interesting training properties that allow us to depart from the standard state-of-the-art approach. We present a simple and efficient way to extract, from a single training run, a large number of complementary LSTM RNNs, called a cohort, which are combined in a cascade architecture with lexical verification. This process requires no fine-tuning, making it easy to use. Our verification handles gigantic lexicons (over 3 million words) quickly and efficiently. We achieve state-of-the-art results for isolated word recognition with very large lexicons and present novel results for an unprecedented gigantic lexicon.
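The cascade with lexical verification can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: each cohort member is modeled as a plain function returning one lexicon-free transcription, verification is an exact membership test against a hash set, and the first verified hypothesis is accepted. The names `cascade_recognize`, `net_a`, and `net_b` are hypothetical stand-ins, not the authors' implementation.

```python
from typing import Callable, Optional, Sequence, Set

# Hypothetical type: a recognizer maps a word image to a raw transcription.
Recognizer = Callable[[object], str]


def cascade_recognize(
    image: object,
    cohort: Sequence[Recognizer],
    lexicon: Set[str],
) -> Optional[str]:
    """Try each network in turn; accept the first output found in the lexicon.

    Assumption: verification is an exact-match lookup. A hash set keeps the
    lookup cost roughly constant even for millions of entries.
    """
    for recognize in cohort:
        hypothesis = recognize(image)
        if hypothesis in lexicon:
            return hypothesis  # verified: stop the cascade early
    return None  # no network produced a lexicon word: reject


# Toy stand-ins for trained networks (assumed, for illustration only).
def net_a(image: object) -> str:
    return "handwritting"  # misspelled, so verification fails


def net_b(image: object) -> str:
    return "handwriting"


lexicon = {"handwriting", "recognition", "lexicon"}
result = cascade_recognize(None, [net_a, net_b], lexicon)
print(result)  # handwriting
```

In this sketch the first network's output is rejected because it is not a lexicon word, and the cascade falls through to the second network. When no member produces a verified word, the function returns `None`, which a real system would have to handle with some fallback decision.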