Learning to Extract Semantic Structure from Documents Using Multimodal Fully Convolutional Neural Network

7 June 2017

Papers citing "Learning to Extract Semantic Structure from Documents Using Multimodal Fully Convolutional Neural Network"

47 / 47 papers shown

Title
DocMamba: Efficient Document Pre-training with State Space Model Pengfei Hu Zhenrong Zhang Jiefeng Ma Shuhang Liu Jun Du Jianshu Zhang Mamba 47 1 0 18 Sep 2024
Bridging the Performance Gap between DETR and R-CNN for Graphical Object Detection in Document Images Tahira Shehzadi K. Hashmi D. Stricker Marcus Liwicki Muhammad Zeshan Afzal 29 7 0 23 Jun 2023
EAML: Ensemble Self-Attention-based Mutual Learning Network for Document Image Classification Souhail Bakkali Zuheng Ming Mickael Coustaty Marçal Rusiñol 10 6 0 11 May 2023
Language Independent Neuro-Symbolic Semantic Parsing for Form Understanding Bhanu Prakash Voutharoja Lizhen Qu Fatemeh Shiri 30 1 0 08 May 2023
Entry Separation using a Mixed Visual and Textual Language Model: Application to 19th century French Trade Directories Bertrand Duménieu Edwin Carlinet N. Abadie Joseph Chazalon 29 0 0 17 Feb 2023
ERNIE-Layout: Layout Knowledge Enhanced Pre-training for Visually-rich Document Understanding Qiming Peng Yinxu Pan Wenjin Wang Bin Luo Zhenyu Zhang ... Shi Feng Yu Sun Hao Tian Hua Wu Haifeng Wang 13 83 0 12 Oct 2022
ERNIE-mmLayout: Multi-grained MultiModal Transformer for Document Understanding Wenjin Wang Zhengjie Huang Bin Luo Qianglong Chen Qiming Peng ... Weichong Yin Shi Feng Yu Sun Dianhai Yu Yin Zhang ViT 35 11 0 18 Sep 2022
One-Shot Doc Snippet Detection: Powering Search in Document Beyond Text Abhinav Java Shripad Deshmukh Milan Aggarwal Surgan Jandial Mausoom Sarkar Balaji Krishnamurthy 37 3 0 12 Sep 2022
Knowing Where and What: Unified Word Block Pretraining for Document Understanding Song Tao Zijian Wang Tiantian Fan Canjie Luo Can Huang SSL 42 2 0 28 Jul 2022
VLCDoC: Vision-Language Contrastive Pre-Training Model for Cross-Modal Document Classification Souhail Bakkali Zuheng Ming Mickael Coustaty Marçal Rusiñol O. R. Terrades VLM 56 30 0 24 May 2022
Robust Text Line Detection in Historical Documents: Learning and Evaluation Methods Mélodie Boillet Christopher Kermorvant Thierry Paquet AI4TS 21 15 0 23 Mar 2022
DAN: a Segmentation-free Document Attention Network for Handwritten Document Recognition Denis Coquenet Clément Chatelain Thierry Paquet 36 57 0 23 Mar 2022
Unified Line and Paragraph Detection by Graph Convolutional Networks Shuang Liu Renshen Wang Michalis Raptis Yasuhisa Fujii GNN 32 7 0 17 Mar 2022
GCNet: Graph Completion Network for Incomplete Multimodal Learning in Conversation Zheng Lian Lang Chen Guoying Zhao B. Liu J. Tao 30 85 0 04 Mar 2022
LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding Jiapeng Wang Lianwen Jin Kai Ding VLM 35 140 0 28 Feb 2022
WebFormer: The Web-page Transformer for Structure Information Extraction Qifan Wang Yi Fang Anirudh Ravula Fuli Feng Xiaojun Quan Dongfang Liu ViT 149 65 0 01 Feb 2022
Document Layout Analysis with Aesthetic-Guided Image Augmentation Tianlong Ma Xingjiao Wu Xin Li Xiangcheng Du Zhao Zhou Liang Xue Cheng Jin 29 2 0 27 Nov 2021
Document AI: Benchmarks, Models and Applications Lei Cui Yiheng Xu Tengchao Lv Furu Wei VLM 29 70 0 16 Nov 2021
Synthetic Document Generator for Annotation-free Layout Recognition Natraj Raman Sameena Shah Manuela Veloso 42 10 0 11 Nov 2021
Accurate Fine-grained Layout Analysis for the Historical Tibetan Document Based on the Instance Segmentation Penghai Zhao Weilan Wang Zhengqi Cai Guowei Zhang Yuqi Lu 25 7 0 15 Oct 2021
Using Neighborhood Context to Improve Information Extraction from Visual Documents Captured on Mobile Phones Kalpa Gunaratna Vijay Srinivasan Sandeep Nama Hongxia Jin 32 5 0 23 Aug 2021
Multi-Modal Association based Grouping for Form Structure Extraction Milan Aggarwal Mausoom Sarkar Hiresh Gupta Balaji Krishnamurthy 19 10 0 09 Jul 2021
SelfDoc: Self-Supervised Document Representation Learning Peizhao Li Jiuxiang Gu Jason Kuen Vlad I. Morariu Handong Zhao R. Jain Varun Manjunatha Hongfu Liu ViT SSL 28 160 0 07 Jun 2021
End-to-End Information Extraction by Character-Level Embedding and Multi-Stage Attentional U-Net Tuan-Anh Dang Nguyen Dat Nguyen Thanh 13 16 0 02 Jun 2021
ViBERTgrid: A Jointly Trained Multi-Modal 2D Document Representation for Key Information Extraction from Documents Weihong Lin Qifang Gao Lei-huan Sun Zhuoyao Zhong Kaiqin Hu Qin Ren Qiang Huo 31 39 0 25 May 2021
VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations Peng Zhang Can Li Liang Qiao Zhanzhan Cheng Shiliang Pu Yi Niu Fei Wu 31 57 0 13 May 2021
LAMPRET: Layout-Aware Multimodal PreTraining for Document Understanding Te-Lin Wu Cheng-rong Li Mingyang Zhang Tao Chen Spurthi Amba Hombaiah Michael Bendersky 21 14 0 16 Apr 2021
Page Layout Analysis System for Unconstrained Historic Documents O. Kodym Michal Hradiš 16 22 0 23 Feb 2021
Post-OCR Paragraph Recognition by Graph Convolutional Networks Renshen Wang Yasuhisa Fujii Ashok Popat GNN 39 20 0 29 Jan 2021
Multiple Document Datasets Pre-training Improves Text Line Detection With Deep Neural Networks Mélodie Boillet Christopher Kermorvant Thierry Paquet 11 22 0 28 Dec 2020
docExtractor: An off-the-shelf historical document element extraction Tom Monnier Mathieu Aubry VLM 27 28 0 15 Dec 2020
Vision-Based Layout Detection from Scientific Literature using Recurrent Convolutional Neural Networks Huichen Yang W. Hsu 19 12 0 18 Oct 2020
Table Structure Recognition using Top-Down and Bottom-Up Cues S. Raja Ajoy Mondal C. V. Jawahar LMTD 27 76 0 09 Oct 2020
VisualWordGrid: Information Extraction From Scanned Documents Using A Multimodal Approach Mohamed Kerroumi Othmane Sayem A. Shabou 27 21 0 05 Oct 2020
Towards a Multi-modal, Multi-task Learning based Pre-training Framework for Document Representation Learning Subhojeet Pramanik Shashank Mujumdar Hima Patel 21 31 0 30 Sep 2020
Abstractive Information Extraction from Scanned Invoices (AIESI) using End-to-end Sequential Approach Shreeshiv Patel Dvijesh N Bhatt 30 11 0 12 Sep 2020
LayoutTransformer: Layout Generation and Completion with Self-attention Kamal Gupta Justin Lazarow Alessandro Achille Larry S. Davis Vijay Mahadevan Abhinav Shrivastava ViT 39 136 0 25 Jun 2020
LayoutLM: Pre-training of Text and Layout for Document Image Understanding Yiheng Xu Minghao Li Lei Cui Shaohan Huang Furu Wei Ming Zhou 71 686 0 31 Dec 2019
Fine-Grained Object Detection over Scientific Document Images with Region Embeddings Ankur Goswami Joshua McGrath S. Peters Theodoros Rekatsinas ObjD 24 3 0 28 Oct 2019
Multimodal deep networks for text and image-based document classification Nicolas Audebert Catherine Herold K. Slimani Cédric Vidal 19 97 0 15 Jul 2019
Learning Representations from Imperfect Time Series Data via Tensor Rank Regularization Paul Pu Liang Zhun Liu Yao-Hung Hubert Tsai Qibin Zhao Ruslan Salakhutdinov Louis-Philippe Morency AI4TS 30 81 0 01 Jul 2019
FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents Guillaume Jaume H. K. Ekenel Jean-Philippe Thiran 143 357 0 27 May 2019
LayoutGAN: Generating Graphic Layouts with Wireframe Discriminators Jianan Li Jimei Yang Aaron Hertzmann Jianming Zhang Tingfa Xu GAN 21 226 0 21 Jan 2019
Attend, Copy, Parse -- End-to-end information extraction from documents Rasmus Berg Palm Florian Laws Ole Winther 19 58 0 18 Dec 2018
Chargrid: Towards Understanding 2D Documents Anoop R. Katti C. Reisswig Cordula Guder Sebastian Brarda S. Bickel Johannes Höhne Jean Baptiste Faddoul 26 194 0 24 Sep 2018
Extracting Scientific Figures with Distantly Supervised Neural Networks Noah Y. Siegel Nicholas Lourie Russell Power Bridger Waleed Ammar 13 114 0 06 Apr 2018
Fully Convolutional Neural Networks for Page Segmentation of Historical Document Images C. Wick F. Puppe SSeg 31 71 0 21 Nov 2017