Graph-based Deep Generative Modelling for Document Layout Generation

Sanket Biswas
Pau Riba
Josep Lladós
Umapada Pal

Abstract

One of the major prerequisites for any deep learning approach is the availability of large-scale training data. For scanned document images in real-world scenarios, the principal information about their content is stored in the layout itself. In this work, we propose an automated deep generative model based on Graph Neural Networks (GNNs) that generates synthetic data with highly variable and plausible document layouts, which can be used to train document interpretation systems, especially in digital mailroom applications. It is also the first graph-based approach to the document layout generation task tested on administrative document images, in this case invoices.
