LayoutLM: Pre-training of Text and Layout for Document Image Understanding

31 December 2019
Yiheng Xu
Minghao Li
Lei Cui
Shaohan Huang
Furu Wei
Ming Zhou

Papers citing "LayoutLM: Pre-training of Text and Layout for Document Image Understanding"

50 / 371 papers shown
ViRED: Prediction of Visual Relations in Engineering Drawings
Chao Gu
Ke Lin
Yiyang Luo
Jiahui Hou
Xiang-Yang Li
35
0
0
02 Sep 2024
The MERIT Dataset: Modelling and Efficiently Rendering Interpretable Transcripts
I. de Rodrigo
A. Sanchez-Cuadrado
J. Boal
A. J. Lopez-Lopez
VLM
28
1
0
31 Aug 2024
ChartEye: A Deep Learning Framework for Chart Information Extraction
Osama Mustafa
Muhammad Khizer Ali
Momina Moetesum
Imran Siddiqi
GNN
27
1
0
28 Aug 2024
μgat: Improving Single-Page Document Parsing by Providing Multi-Page Context
Fabio Quattrini
Carmine Zaccagnino
Silvia Cascianelli
Laura Righi
Rita Cucchiara
44
1
0
28 Aug 2024
SynthDoc: Bilingual Documents Synthesis for Visual Document Understanding
Chuanghao Ding
Xuejing Liu
Wei Tang
Juan Li
Xiaoliang Wang
Rui Zhao
Cam-Tu Nguyen
Fei Tan
33
0
0
27 Aug 2024
DocLayLLM: An Efficient Multi-modal Extension of Large Language Models for Text-rich Document Understanding
Wenhui Liao
Jiapeng Wang
Hongliang Li
Chengyu Wang
Jun Huang
Lianwen Jin
48
0
0
27 Aug 2024
Large Language Models for Page Stream Segmentation
H. Heidenreich
Ratish Dalvi
Rohith Mukku
Nikhil Verma
Neven Pičuljan
35
0
0
21 Aug 2024
Deep Learning based Visually Rich Document Content Understanding: A Survey
Muhammad Ali
Jean Lee
Salman Khan
47
6
0
02 Aug 2024
UNER: A Unified Prediction Head for Named Entity Recognition in Visually-rich Documents
Yi Tu
Chong Zhang
Ya Guo
Huan Chen
Jinyang Tang
Huijia Zhu
Qi Zhang
51
3
0
02 Aug 2024
LLaVA-Read: Enhancing Reading Ability of Multimodal Language Models
Ruiyi Zhang
Yufan Zhou
Jian Chen
Jiuxiang Gu
Changyou Chen
Tongfei Sun
VLM
41
6
0
27 Jul 2024
OfficeBench: Benchmarking Language Agents across Multiple Applications for Office Automation
Zilong Wang
Yuedong Cui
Li Zhong
Zimin Zhang
Da Yin
Bill Yuchen Lin
Jingbo Shang
64
4
0
26 Jul 2024
CRMSP: A Semi-supervised Approach for Key Information Extraction with Class-Rebalancing and Merged Semantic Pseudo-Labeling
Qi Zhang
Yonghong Song
Pengcheng Guo
Yangyang Hui
43
0
0
19 Jul 2024
VisFocus: Prompt-Guided Vision Encoders for OCR-Free Dense Document Understanding
Ofir Abramovich
Niv Nayman
Sharon Fogel
I. Lavi
Ron Litman
Shahar Tsiper
Royee Tichauer
Srikar Appalaraju
Shai Mazor
R. Manmatha
VLM
35
3
0
17 Jul 2024
ProcTag: Process Tagging for Assessing the Efficacy of Document Instruction Data
Yufan Shen
Chuwei Luo
Zhaoqing Zhu
Yang Chen
Qi Zheng
Zhi Yu
Jiajun Bu
Cong Yao
50
2
0
17 Jul 2024
DANIEL: A fast Document Attention Network for Information Extraction and Labelling of handwritten documents
Thomas Constum
Pierrick Tranouez
Thierry Paquet
32
5
0
12 Jul 2024
Fuse, Reason and Verify: Geometry Problem Solving with Parsed Clauses from Diagram
Ming-Liang Zhang
Zhong-Zhi Li
Fei Yin
Liang Lin
Cheng-Lin Liu
LRM
24
7
0
10 Jul 2024
VRDSynth: Synthesizing Programs for Multilingual Visually Rich Document Information Extraction
Thanh-Dat Nguyen
Tung Do-Viet
Hung Nguyen-Duy
Tuan-Hai Luu
Hung Le
Bach Le
Patanamon Thongtanunam
SyDa
39
1
0
09 Jul 2024
MindBench: A Comprehensive Benchmark for Mind Map Structure Recognition and Analysis
Lei Chen
Feng Yan
Yujie Zhong
Shaoxiang Chen
Zequn Jie
Lin Ma
46
3
0
03 Jul 2024
A Bounding Box is Worth One Token: Interleaving Layout and Text in a Large Language Model for Document Understanding
Jinghui Lu
Haiyang Yu
Yunhong Wang
Yongjie Ye
Jingqun Tang
...
Qi Liu
Hao Feng
Hairu Wang
Hao Liu
Can Huang
54
21
0
02 Jul 2024
MMLongBench-Doc: Benchmarking Long-context Document Understanding with Visualizations
Yubo Ma
Yuhang Zang
Liangyu Chen
Meiqi Chen
Yizhu Jiao
...
Liangming Pan
Yu-Gang Jiang
Jiaqi Wang
Yixin Cao
Aixin Sun
ELM
RALM
VLM
39
25
0
01 Jul 2024
DocKylin: A Large Multimodal Model for Visual Document Understanding with Efficient Visual Slimming
Jiaxin Zhang
Wentao Yang
Songxuan Lai
Zecheng Xie
Lianwen Jin
39
15
0
27 Jun 2024
UQE: A Query Engine for Unstructured Databases
Hanjun Dai
B. Wang
Xingchen Wan
Bo Dai
Sherry Yang
Azade Nova
Pengcheng Yin
P. Phothilimthana
Charles Sutton
Dale Schuurmans
60
3
0
23 Jun 2024
On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning
Geewook Kim
Minjoon Seo
VLM
44
2
0
17 Jun 2024
SRFUND: A Multi-Granularity Hierarchical Structure Reconstruction Benchmark in Form Understanding
Jiefeng Ma
Yan Wang
Chenyu Liu
Jun Du
Yu Hu
Zhenrong Zhang
Pengfei Hu
Qing Wang
Jianshu Zhang
38
0
0
13 Jun 2024
M3T: A New Benchmark Dataset for Multi-Modal Document-Level Machine Translation
Benjamin Hsu
Xiaoyu Liu
Huayang Li
Yoshinari Fujinuma
Maria Nadejde
Xing Niu
Yair Kittenplon
Ron Litman
R. Pappagari
52
4
0
12 Jun 2024
DistilDoc: Knowledge Distillation for Visually-Rich Document Applications
Jordy Van Landeghem
Subhajit Maity
Ayan Banerjee
Matthew Blaschko
Marie-Francine Moens
Josep Lladós
Sanket Biswas
52
2
0
12 Jun 2024
Reconstructing training data from document understanding models
Jérémie Dentan
Arnaud Paran
A. Shabou
AAML
SyDa
54
1
0
05 Jun 2024
XFormParser: A Simple and Effective Multimodal Multilingual Semi-structured Form Parser
Xianfu Cheng
Hang Zhang
Jian Yang
Xiang Li
Weixiao Zhou
...
Fei Liu
Wei Zhang
Tao Sun
Tongliang Li
Zhoujun Li
54
2
0
27 May 2024
SEMv3: A Fast and Robust Approach to Table Separation Line Detection
Chunxia Qin
Zhenrong Zhang
Pengfei Hu
Chenyu Liu
Jie Ma
Jun Du
LMTD
32
2
0
20 May 2024
Generative Artificial Intelligence: A Systematic Review and Applications
S. S. Sengar
Affan Bin Hasan
Sanjay Kumar
Fiona Carroll
MedIm
38
52
0
17 May 2024
When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models
Xianzheng Ma
Yash Bhalgat
Brandon Smart
Shuai Chen
Xinghui Li
...
Matthias Nießner
Ian D Reid
Angel X. Chang
Iro Laina
V. Prisacariu
LRM
35
13
0
16 May 2024
Lightweight Spatial Modeling for Combinatorial Information Extraction From Documents
Yanfei Dong
Lambert Deng
Jiazheng Zhang
Xiaodong Yu
Ting Lin
Francesco Gelli
Soujanya Poria
W. Lee
40
0
0
08 May 2024
GeoContrastNet: Contrastive Key-Value Edge Learning for Language-Agnostic Document Understanding
Nil Biescas
Carlos Boned Riera
Josep Lladós
Sanket Biswas
42
1
0
06 May 2024
MedPromptExtract (Medical Data Extraction Tool): Anonymization and Hi-fidelity Automated data extraction using NLP and prompt engineering
Roomani Srivastava
Suraj Prasad
Lipika Bhat
Sarvesh Deshpande
Barnali Das
Kshitij Jadhav
MedIm
26
0
0
04 May 2024
CREPE: Coordinate-Aware End-to-End Document Parser
Yamato Okamoto
Youngmin Baek
Geewook Kim
Ryota Nakao
Donghyun Kim
Moonbin Yim
Seunghyun Park
Bado Lee
35
1
0
01 May 2024
Reading Order Independent Metrics for Information Extraction in Handwritten Documents
David Villanova-Aparisi
Solène Tarride
Carlos David Martínez Hinarejos
Verónica Romero
Christopher Kermorvant
Moisés Pastor
18
0
0
29 Apr 2024
A LayoutLMv3-Based Model for Enhanced Relation Extraction in Visually-Rich Documents
Wiam Adnan
Joel Tang
Yassine Bel Khayat Zouggari
S. Laatiri
Laurent Lam
Fabien Caspani
34
0
0
16 Apr 2024
Towards Efficient Resume Understanding: A Multi-Granularity Multi-Modal Pre-Training Approach
Feihu Jiang
Chuan Qin
Jingshuai Zhang
Kaichun Yao
Xi Chen
Dazhong Shen
Chen Zhu
Hengshu Zhu
Hui Xiong
36
8
0
13 Apr 2024
HRVDA: High-Resolution Visual Document Assistant
Chaohu Liu
Kun Yin
Haoyu Cao
Xinghua Jiang
Xin Li
Yinsong Liu
Deqiang Jiang
Xing Sun
Linli Xu
VLM
45
24
0
10 Apr 2024
JSTR: Judgment Improves Scene Text Recognition
Masato Fujitake
64
1
0
09 Apr 2024
LayoutLLM: Layout Instruction Tuning with Large Language Models for Document Understanding
Chuwei Luo
Yufan Shen
Zhaoqing Zhu
Qi Zheng
Zhi Yu
Cong Yao
37
40
0
08 Apr 2024
Bidirectional Long-Range Parser for Sequential Data Understanding
George Leotescu
Daniel Voinea
A. Popa
50
1
0
08 Apr 2024
BuDDIE: A Business Document Dataset for Multi-task Information Extraction
Ran Zmigrod
Dongsheng Wang
Mathieu Sibue
Yulong Pei
Petr Babkin
...
Antony Papadimitriou
William Watson
Zhiqiang Ma
Armineh Nourbakhsh
Sameena Shah
27
4
0
05 Apr 2024
DOCMASTER: A Unified Platform for Annotation, Training, & Inference in Document Question-Answering
Alex Nguyen
Zilong Wang
Jingbo Shang
Dheeraj Mekala
41
1
0
30 Mar 2024
ReALM: Reference Resolution As Language Modeling
Joel Ruben Antony Moniz
Soundarya Krishnan
Melis Ozyildirim
Prathamesh Saraf
Halim Cagri Ates
Yuan-kang Zhang
Hong-ye Yu
Nidhi Rajshree
47
6
0
29 Mar 2024
JDocQA: Japanese Document Question Answering Dataset for Generative Language Models
Eri Onami
Shuhei Kurita
Taiki Miyanishi
Taro Watanabe
27
1
0
28 Mar 2024
OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition
Jianqiang Wan
Sibo Song
Wenwen Yu
Yuliang Liu
Wenqing Cheng
Fei Huang
Xiang Bai
Cong Yao
Zhibo Yang
53
28
0
28 Mar 2024
Visually Guided Generative Text-Layout Pre-training for Document Intelligence
Zhiming Mao
Haoli Bai
Lu Hou
Jiansheng Wei
Xin Jiang
Qun Liu
Kam-Fai Wong
32
8
0
25 Mar 2024
Synthesize Step-by-Step: Tools, Templates and LLMs as Data Generators for Reasoning-Based Chart VQA
Zhuowan Li
Bhavan A. Jasani
Peng Tang
Shabnam Ghadar
LRM
39
8
0
25 Mar 2024
Towards Human-Like Machine Comprehension: Few-Shot Relational Learning in Visually-Rich Documents
Hao Wang
Tang Li
Chenhui Chu
Nengjun Zhu
Rui-cang Wang
Pinpin Zhu
25
0
0
23 Mar 2024