1706.02337
Cited By
Learning to Extract Semantic Structure from Documents Using Multimodal Fully Convolutional Neural Network
7 June 2017
Xiao Yang
Ersin Yumer
P. Asente
Mike Kraley
Daniel Kifer
C. Lee Giles
Papers citing "Learning to Extract Semantic Structure from Documents Using Multimodal Fully Convolutional Neural Network"
47 / 47 papers shown
DocMamba: Efficient Document Pre-training with State Space Model
Pengfei Hu
Zhenrong Zhang
Jiefeng Ma
Shuhang Liu
Jun Du
Jianshu Zhang
Mamba
47
1
0
18 Sep 2024
Bridging the Performance Gap between DETR and R-CNN for Graphical Object Detection in Document Images
Tahira Shehzadi
K. Hashmi
D. Stricker
Marcus Liwicki
Muhammad Zeshan Afzal
29
7
0
23 Jun 2023
EAML: Ensemble Self-Attention-based Mutual Learning Network for Document Image Classification
Souhail Bakkali
Zuheng Ming
Mickael Coustaty
Marçal Rusiñol
10
6
0
11 May 2023
Language Independent Neuro-Symbolic Semantic Parsing for Form Understanding
Bhanu Prakash Voutharoja
Lizhen Qu
Fatemeh Shiri
30
1
0
08 May 2023
Entry Separation using a Mixed Visual and Textual Language Model: Application to 19th century French Trade Directories
Bertrand Duménieu
Edwin Carlinet
N. Abadie
Joseph Chazalon
29
0
0
17 Feb 2023
ERNIE-Layout: Layout Knowledge Enhanced Pre-training for Visually-rich Document Understanding
Qiming Peng
Yinxu Pan
Wenjin Wang
Bin Luo
Zhenyu Zhang
...
Shi Feng
Yu Sun
Hao Tian
Hua Wu
Haifeng Wang
13
83
0
12 Oct 2022
ERNIE-mmLayout: Multi-grained MultiModal Transformer for Document Understanding
Wenjin Wang
Zhengjie Huang
Bin Luo
Qianglong Chen
Qiming Peng
...
Weichong Yin
Shi Feng
Yu Sun
Dianhai Yu
Yin Zhang
ViT
35
11
0
18 Sep 2022
One-Shot Doc Snippet Detection: Powering Search in Document Beyond Text
Abhinav Java
Shripad Deshmukh
Milan Aggarwal
Surgan Jandial
Mausoom Sarkar
Balaji Krishnamurthy
37
3
0
12 Sep 2022
Knowing Where and What: Unified Word Block Pretraining for Document Understanding
Song Tao
Zijian Wang
Tiantian Fan
Canjie Luo
Can Huang
SSL
42
2
0
28 Jul 2022
VLCDoC: Vision-Language Contrastive Pre-Training Model for Cross-Modal Document Classification
Souhail Bakkali
Zuheng Ming
Mickael Coustaty
Marçal Rusiñol
O. R. Terrades
VLM
56
30
0
24 May 2022
Robust Text Line Detection in Historical Documents: Learning and Evaluation Methods
Mélodie Boillet
Christopher Kermorvant
Thierry Paquet
AI4TS
21
15
0
23 Mar 2022
DAN: a Segmentation-free Document Attention Network for Handwritten Document Recognition
Denis Coquenet
Clément Chatelain
Thierry Paquet
36
57
0
23 Mar 2022
Unified Line and Paragraph Detection by Graph Convolutional Networks
Shuang Liu
Renshen Wang
Michalis Raptis
Yasuhisa Fujii
GNN
32
7
0
17 Mar 2022
GCNet: Graph Completion Network for Incomplete Multimodal Learning in Conversation
Zheng Lian
Lang Chen
Guoying Zhao
B. Liu
J. Tao
30
85
0
04 Mar 2022
LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding
Jiapeng Wang
Lianwen Jin
Kai Ding
VLM
35
140
0
28 Feb 2022
WebFormer: The Web-page Transformer for Structure Information Extraction
Qifan Wang
Yi Fang
Anirudh Ravula
Fuli Feng
Xiaojun Quan
Dongfang Liu
ViT
149
65
0
01 Feb 2022
Document Layout Analysis with Aesthetic-Guided Image Augmentation
Tianlong Ma
Xingjiao Wu
Xin Li
Xiangcheng Du
Zhao Zhou
Liang Xue
Cheng Jin
29
2
0
27 Nov 2021
Document AI: Benchmarks, Models and Applications
Lei Cui
Yiheng Xu
Tengchao Lv
Furu Wei
VLM
29
70
0
16 Nov 2021
Synthetic Document Generator for Annotation-free Layout Recognition
Natraj Raman
Sameena Shah
Manuela Veloso
42
10
0
11 Nov 2021
Accurate Fine-grained Layout Analysis for the Historical Tibetan Document Based on the Instance Segmentation
Penghai Zhao
Weilan Wang
Zhengqi Cai
Guowei Zhang
Yuqi Lu
25
7
0
15 Oct 2021
Using Neighborhood Context to Improve Information Extraction from Visual Documents Captured on Mobile Phones
Kalpa Gunaratna
Vijay Srinivasan
Sandeep Nama
Hongxia Jin
32
5
0
23 Aug 2021
Multi-Modal Association based Grouping for Form Structure Extraction
Milan Aggarwal
Mausoom Sarkar
Hiresh Gupta
Balaji Krishnamurthy
19
10
0
09 Jul 2021
SelfDoc: Self-Supervised Document Representation Learning
Peizhao Li
Jiuxiang Gu
Jason Kuen
Vlad I. Morariu
Handong Zhao
R. Jain
Varun Manjunatha
Hongfu Liu
ViT
SSL
28
160
0
07 Jun 2021
End-to-End Information Extraction by Character-Level Embedding and Multi-Stage Attentional U-Net
Tuan-Anh Dang Nguyen
Dat Nguyen Thanh
13
16
0
02 Jun 2021
ViBERTgrid: A Jointly Trained Multi-Modal 2D Document Representation for Key Information Extraction from Documents
Weihong Lin
Qifang Gao
Lei-huan Sun
Zhuoyao Zhong
Kaiqin Hu
Qin Ren
Qiang Huo
31
39
0
25 May 2021
VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations
Peng Zhang
Can Li
Liang Qiao
Zhanzhan Cheng
Shiliang Pu
Yi Niu
Fei Wu
31
57
0
13 May 2021
LAMPRET: Layout-Aware Multimodal PreTraining for Document Understanding
Te-Lin Wu
Cheng-rong Li
Mingyang Zhang
Tao Chen
Spurthi Amba Hombaiah
Michael Bendersky
21
14
0
16 Apr 2021
Page Layout Analysis System for Unconstrained Historic Documents
O. Kodym
Michal Hradiš
16
22
0
23 Feb 2021
Post-OCR Paragraph Recognition by Graph Convolutional Networks
Renshen Wang
Yasuhisa Fujii
Ashok Popat
GNN
39
20
0
29 Jan 2021
Multiple Document Datasets Pre-training Improves Text Line Detection With Deep Neural Networks
Mélodie Boillet
Christopher Kermorvant
Thierry Paquet
11
22
0
28 Dec 2020
docExtractor: An off-the-shelf historical document element extraction
Tom Monnier
Mathieu Aubry
VLM
27
28
0
15 Dec 2020
Vision-Based Layout Detection from Scientific Literature using Recurrent Convolutional Neural Networks
Huichen Yang
W. Hsu
19
12
0
18 Oct 2020
Table Structure Recognition using Top-Down and Bottom-Up Cues
S. Raja
Ajoy Mondal
C. V. Jawahar
LMTD
27
76
0
09 Oct 2020
VisualWordGrid: Information Extraction From Scanned Documents Using A Multimodal Approach
Mohamed Kerroumi
Othmane Sayem
A. Shabou
27
21
0
05 Oct 2020
Towards a Multi-modal, Multi-task Learning based Pre-training Framework for Document Representation Learning
Subhojeet Pramanik
Shashank Mujumdar
Hima Patel
21
31
0
30 Sep 2020
Abstractive Information Extraction from Scanned Invoices (AIESI) using End-to-end Sequential Approach
Shreeshiv Patel
Dvijesh N Bhatt
30
11
0
12 Sep 2020
LayoutTransformer: Layout Generation and Completion with Self-attention
Kamal Gupta
Justin Lazarow
Alessandro Achille
Larry S. Davis
Vijay Mahadevan
Abhinav Shrivastava
ViT
39
136
0
25 Jun 2020
LayoutLM: Pre-training of Text and Layout for Document Image Understanding
Yiheng Xu
Minghao Li
Lei Cui
Shaohan Huang
Furu Wei
Ming Zhou
71
686
0
31 Dec 2019
Fine-Grained Object Detection over Scientific Document Images with Region Embeddings
Ankur Goswami
Joshua McGrath
S. Peters
Theodoros Rekatsinas
ObjD
24
3
0
28 Oct 2019
Multimodal deep networks for text and image-based document classification
Nicolas Audebert
Catherine Herold
K. Slimani
Cédric Vidal
19
97
0
15 Jul 2019
Learning Representations from Imperfect Time Series Data via Tensor Rank Regularization
Paul Pu Liang
Zhun Liu
Yao-Hung Hubert Tsai
Qibin Zhao
Ruslan Salakhutdinov
Louis-Philippe Morency
AI4TS
30
81
0
01 Jul 2019
FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents
Guillaume Jaume
H. K. Ekenel
Jean-Philippe Thiran
143
357
0
27 May 2019
LayoutGAN: Generating Graphic Layouts with Wireframe Discriminators
Jianan Li
Jimei Yang
Aaron Hertzmann
Jianming Zhang
Tingfa Xu
GAN
21
226
0
21 Jan 2019
Attend, Copy, Parse -- End-to-end information extraction from documents
Rasmus Berg Palm
Florian Laws
Ole Winther
19
58
0
18 Dec 2018
Chargrid: Towards Understanding 2D Documents
Anoop R. Katti
C. Reisswig
Cordula Guder
Sebastian Brarda
S. Bickel
Johannes Höhne
Jean Baptiste Faddoul
26
194
0
24 Sep 2018
Extracting Scientific Figures with Distantly Supervised Neural Networks
Noah Y. Siegel
Nicholas Lourie
Russell Power
Waleed Ammar
13
114
0
06 Apr 2018
Fully Convolutional Neural Networks for Page Segmentation of Historical Document Images
C. Wick
F. Puppe
SSeg
31
71
0
21 Nov 2017