LayoutLM: Pre-training of Text and Layout for Document Image Understanding

31 December 2019
Yiheng Xu
Minghao Li
Lei Cui
Shaohan Huang
Furu Wei
Ming Zhou

Papers citing "LayoutLM: Pre-training of Text and Layout for Document Image Understanding"

50 / 371 papers shown
Title
Long-Range Transformer Architectures for Document Understanding
Thibault Douzon
S. Duffner
Christophe Garcia
Jérémy Espinas
VLM
34
2
0
11 Sep 2023
Improving Information Extraction on Business Documents with Specific Pre-Training Tasks
Thibault Douzon
S. Duffner
Christophe Garcia
Jérémy Espinas
27
6
0
11 Sep 2023
Attention Where It Matters: Rethinking Visual Document Understanding with Selective Region Concentration
H. Cao
Changcun Bao
Chaohu Liu
Huang-wei Chen
Kun Yin
Hao Liu
Yinsong Liu
Deqiang Jiang
Xing Sun
28
13
0
03 Sep 2023
DTrOCR: Decoder-only Transformer for Optical Character Recognition
Masato Fujitake
64
35
0
30 Aug 2023
Document AI: A Comparative Study of Transformer-Based, Graph-Based Models, and Convolutional Neural Networks For Document Layout Analysis
Sotirios Kastanas
Shaomu Tan
Yijiang He
41
1
0
29 Aug 2023
Vision Grid Transformer for Document Layout Analysis
Cheng Da
Chuwei Luo
Qi Zheng
Cong Yao
ViT
45
29
0
29 Aug 2023
High-Resolution Document Shadow Removal via A Large-Scale Real-World Dataset and A Frequency-Aware Shadow Erasing Net
Zinuo Li
Xuhang Chen
Chi-Man Pun
Xiaodong Cun
37
35
0
27 Aug 2023
Nougat: Neural Optical Understanding for Academic Documents
Lukas Blecher
Guillem Cucurull
Thomas Scialom
Robert Stojnic
ViT
27
109
0
25 Aug 2023
DocPrompt: Large-scale continue pretrain for zero-shot and few-shot document question answering
Sijin Wu
Dan Zhang
Teng Hu
Shikun Feng
35
1
0
21 Aug 2023
Enhancing Visually-Rich Document Understanding via Layout Structure Modeling
Qiwei Li
Z. Li
Xiantao Cai
Bo Du
Hai Zhao
28
7
0
15 Aug 2023
SciGraphQA: A Large-Scale Synthetic Multi-Turn Question-Answering Dataset for Scientific Graphs
Sheng Li
Nima Tajbakhsh
MLLM
21
48
0
07 Aug 2023
A Graphical Approach to Document Layout Analysis
Jilin Wang
Michael Krumdick
Baojia Tong
Hamima Halim
M. Sokolov
Vadym Barda
Delphine Vendryes
Christy Tanner
24
8
0
03 Aug 2023
RealCQA: Scientific Chart Question Answering as a Test-bed for First-Order Logic
Saleem Ahmed
Bhavin Jawade
Shubham Pandey
S. Setlur
Venugopal Govindaraju
23
5
0
03 Aug 2023
SpaDen : Sparse and Dense Keypoint Estimation for Real-World Chart Understanding
Saleem Ahmed
Pengyu Yan
David Doermann
S. Setlur
Venugopal Govindaraju
26
2
0
03 Aug 2023
A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis
Izzeddin Gur
Hiroki Furuta
Austin Huang
Mustafa Safdari
Yutaka Matsuo
Douglas Eck
Aleksandra Faust
LM&Ro
LLMAG
39
201
0
24 Jul 2023
Multimodal Document Analytics for Banking Process Automation
C. Gerling
Stefan Lessmann
36
3
0
21 Jul 2023
DocTr: Document Transformer for Structured Information Extraction in Documents
Haofu Liao
Aruni RoyChowdhury
Weijian Li
Ankan Bansal
Yuting Zhang
Zhuowen Tu
R. Satzoda
R. Manmatha
Vijay Mahadevan
29
11
0
16 Jul 2023
A Survey on Change Detection Techniques in Document Images
Abhinandan Kumar Pun
M. Javed
David Doermann
16
0
0
15 Jul 2023
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
Jiabo Ye
Anwen Hu
Haiyang Xu
Qinghao Ye
Mingshi Yan
...
Chenliang Li
Junfeng Tian
Qiang Qi
Ji Zhang
Feiyan Huang
VLM
MLLM
27
118
0
04 Jul 2023
On Evaluation of Document Classification using RVL-CDIP
Stefan Larson
Gordon Lim
Kevin Leach
39
3
0
21 Jun 2023
ViTEraser: Harnessing the Power of Vision Transformers for Scene Text Removal with SegMIM Pretraining
Dezhi Peng
Chongyu Liu
Yuliang Liu
Lianwen Jin
DiffM
27
9
0
21 Jun 2023
GenPlot: Increasing the Scale and Diversity of Chart Derendering Data
Brendan Artley
23
1
0
20 Jun 2023
DocumentNet: Bridging the Data Gap in Document Pre-Training
Lijun Yu
Jin Miao
Xiaoyu Sun
Jiayi Chen
Alexander G. Hauptmann
H. Dai
Wei Wei
24
3
0
15 Jun 2023
ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images
Wenwen Yu
Chengquan Zhang
H. Cao
Wei Hua
Bohan Li
...
Hao Fei
Dimosthenis Karatzas
Xingchao Sun
Jingdong Wang
Xiang Bai
34
11
0
05 Jun 2023
Do-GOOD: Towards Distribution Shift Evaluation for Pre-Trained Visual Document Understanding Models
Jiabang He
Yilang Hu
Lei Wang
Xingdong Xu
Ning Liu
Hui-juan Liu
Hengtao Shen
VLM
OOD
24
2
0
05 Jun 2023
DocFormerv2: Local Features for Document Understanding
Srikar Appalaraju
Peng Tang
Qi Dong
Nishant Sankaran
Yichu Zhou
R. Manmatha
36
39
0
02 Jun 2023
Are Layout-Infused Language Models Robust to Layout Distribution Shifts? A Case Study with Scientific Documents
Catherine Chen
Zejiang Shen
Dan Klein
Gabriel Stanovsky
Doug Downey
Kyle Lo
32
2
0
01 Jun 2023
End-to-End Document Classification and Key Information Extraction using Assignment Optimization
Ciaran Cooney
Joana Cavadas
Liam Madigan
Bradley Savage
Rachel Heyburn
Mairead O'Cuinn
11
0
0
01 Jun 2023
Layout and Task Aware Instruction Prompt for Zero-shot Document Image Question Answering
Wenjin Wang
Yunhao Li
Yixin Ou
Yin Zhang
VLM
29
24
0
01 Jun 2023
LayoutMask: Enhance Text-Layout Interaction in Multi-modal Pre-training for Document Understanding
Yi Tu
Ya Guo
Huan Chen
Jinyang Tang
31
15
0
30 May 2023
Alfred: A System for Prompted Weak Supervision
Peilin Yu
Stephen H. Bach
19
7
0
29 May 2023
Visually-Situated Natural Language Understanding with Contrastive Reading Model and Frozen Large Language Models
Geewook Kim
Hodong Lee
D. Kim
Haeji Jung
S. Park
Yoon Kim
Sangdoo Yun
Taeho Kil
Bado Lee
Seunghyun Park
VLM
48
4
0
24 May 2023
Towards Few-shot Entity Recognition in Document Images: A Graph Neural Network Approach Robust to Image Manipulation
Prashant Krishnan
Zilong Wang
Yangkun Wang
Jingbo Shang
23
3
0
24 May 2023
UniChart: A Universal Vision-language Pretrained Model for Chart Comprehension and Reasoning
Ahmed Masry
P. Kavehzadeh
Do Xuan Long
Enamul Hoque
Chenyu You
LRM
27
100
0
24 May 2023
RE$^2$: Region-Aware Relation Extraction from Visually Rich Documents
Pritika Ramu
Sijia Wang
Lalla Mouatadid
Joy Rimchala
Lifu Huang
38
0
0
24 May 2023
DUBLIN -- Document Understanding By Language-Image Network
Kriti Aggarwal
Aditi Khandelwal
Kumar Tanmay
Owais Mohammed Khan
Qiang Liu
Monojit Choudhury
Hardik Hansrajbhai Chauhan
Subhojit Som
Vishrav Chaudhary
Saurabh Tiwary
ObjD
VLM
55
0
0
23 May 2023
Global Structure Knowledge-Guided Relation Extraction Method for Visually-Rich Document
Xiangnan Chen
Qianwen Xiao
Juncheng Li
Duo Dong
Jun Lin
Xiaozhong Liu
Siliang Tang
34
5
0
23 May 2023
Detecting automatically the layout of clinical documents to enhance the performances of downstream natural language processing
C. Gérardin
Perceval Wajsburt
Basile Dura
Alice Calliger
Alexandre Mouchet
X. Tannier
R. Bey
21
1
0
23 May 2023
Towards Zero-shot Relation Extraction in Web Mining: A Multimodal Approach with Relative XML Path
Zilong Wang
Jingbo Shang
49
0
0
23 May 2023
Multimodal Web Navigation with Instruction-Finetuned Foundation Models
Hiroki Furuta
Kuang-Huei Lee
Ofir Nachum
Yutaka Matsuo
Aleksandra Faust
S. Gu
Izzeddin Gur
LM&Ro
36
93
0
19 May 2023
Fast-StrucTexT: An Efficient Hourglass Transformer with Modality-guided Dynamic Token Merge for Document Understanding
Mingliang Zhai
Yulin Li
Xiameng Qin
Chen Yi
Qunyi Xie
Chengquan Zhang
Kun Yao
Yuwei Wu
Yunde Jia
35
8
0
19 May 2023
Sequence-to-Sequence Pre-training with Unified Modality Masking for Visual Document Understanding
ShuWei Feng
Tianyang Zhan
Zhanming Jie
Trung Quoc Luong
Xiaoran Jin
27
1
0
16 May 2023
Visual Information Extraction in the Wild: Practical Dataset and End-to-end Solution
Jianfeng Kuang
Wei Hua
Dingkang Liang
Mingkun Yang
Deqiang Jiang
Bo Ren
Xiang Bai
29
39
0
12 May 2023
EAML: Ensemble Self-Attention-based Mutual Learning Network for Document Image Classification
Souhail Bakkali
Zuheng Ming
Mickael Coustaty
Marçal Rusiñol
10
6
0
11 May 2023
Language Independent Neuro-Symbolic Semantic Parsing for Form Understanding
Bhanu Prakash Voutharoja
Lizhen Qu
Fatemeh Shiri
30
1
0
08 May 2023
Text Reading Order in Uncontrolled Conditions by Sparse Graph Segmentation
Renshen Wang
Yasuhisa Fujii
Alessandro Bissacco
GNN
28
6
0
04 May 2023
FormNetV2: Multimodal Graph Contrastive Learning for Form Document Information Extraction
Chen-Yu Lee
Chun-Liang Li
Hao Zhang
Timothy Dozat
Vincent Perot
...
Shangbang Long
Siyang Qin
Yasuhisa Fujii
Nan Hua
Tomas Pfister
SSL
48
19
0
04 May 2023
Doc2SoarGraph: Discrete Reasoning over Visually-Rich Table-Text Documents via Semantic-Oriented Hierarchical Graphs
Fengbin Zhu
Chao Wang
Fuli Feng
Zifeng Ren
Moxin Li
Tat-Seng Chua
47
3
0
03 May 2023
CCpdf: Building a High Quality Corpus for Visually Rich Documents from Web Crawl Data
M. Turski
Tomasz Stanislawek
Karol Kaczmarek
Pawel Dyda
Filip Graliński
33
12
0
28 Apr 2023
Information Redundancy and Biases in Public Document Information Extraction Benchmarks
S. Laatiri
Pirashanth Ratnamogan
Joel Tang
Laurent Lam
William Vanhuffel
Fabien Caspani
33
1
0
28 Apr 2023