2108.11591
Cited By
LayoutReader: Pre-training of Text and Layout for Reading Order Detection
26 August 2021
Zilong Wang
Yiheng Xu
Lei Cui
Jingbo Shang
Furu Wei
Papers citing "LayoutReader: Pre-training of Text and Layout for Reading Order Detection"
42 / 42 papers shown
Title
XY-Cut++: Advanced Layout Ordering via Hierarchical Mask Mechanism on a Novel Benchmark
Shuai Liu
Youmeng Li
Jizeng Wei
33
0
0
14 Apr 2025
Towards Visual Text Grounding of Multimodal Large Language Model
Ming Li
Ruiyi Zhang
Jian Chen
Jiuxiang Gu
Yufan Zhou
Franck Dernoncourt
Wanrong Zhu
Tianyi Zhou
Tong Sun
33
2
0
07 Apr 2025
UniHDSA: A Unified Relation Prediction Approach for Hierarchical Document Structure Analysis
Jiawei Wang
Kai Hu
Qiang Huo
53
0
0
20 Mar 2025
EDocNet: Efficient Datasheet Layout Analysis Based on Focus and Global Knowledge Distillation
Hong Cai Chen
Longchang Wu
Yang Zhang
38
0
0
23 Feb 2025
OmniParser V2: Structured-Points-of-Thought for Unified Visual Text Parsing and Its Generality to Multimodal Large Language Models
Wenwen Yu
Zhibo Yang
Jianqiang Wan
Sibo Song
J. Tang
Wenqing Cheng
Y. Liu
Xiang Bai
46
1
0
22 Feb 2025
ReLayout: Towards Real-World Document Understanding via Layout-enhanced Pre-training
Zhouqiang Jiang
Bowen Wang
Junhao Chen
Yuta Nakashima
22
2
0
14 Oct 2024
Modeling Layout Reading Order as Ordering Relations for Visually-rich Document Understanding
Chong Zhang
Yi Tu
Yixi Zhao
Chenshu Yuan
Huan Chen
...
Mingxu Chai
Ya Guo
Huijia Zhu
Qi Zhang
Tao Gui
41
2
0
29 Sep 2024
READoc: A Unified Benchmark for Realistic Document Structured Extraction
Zichao Li
Aizier Abulaiti
Y. Lu
Xuanang Chen
Jia Zheng
Hongyu Lin
Xianpei Han
Le Sun
27
4
0
08 Sep 2024
SynthDoc: Bilingual Documents Synthesis for Visual Document Understanding
Chuanghao Ding
Xuejing Liu
Wei Tang
Juan Li
Xiaoliang Wang
Rui Zhao
Cam-Tu Nguyen
Fei Tan
23
0
0
27 Aug 2024
Deep Learning based Visually Rich Document Content Understanding: A Survey
Muhammad Ali
Jean Lee
Salman Khan
34
6
0
02 Aug 2024
UNER: A Unified Prediction Head for Named Entity Recognition in Visually-rich Documents
Yi Tu
Chong Zhang
Ya Guo
Huan Chen
Jinyang Tang
Huijia Zhu
Qi Zhang
43
3
0
02 Aug 2024
OfficeBench: Benchmarking Language Agents across Multiple Applications for Office Automation
Zilong Wang
Yuedong Cui
Li Zhong
Zimin Zhang
Da Yin
Bill Yuchen Lin
Jingbo Shang
51
4
0
26 Jul 2024
Block-level Text Spotting with LLMs
Ganesh Bannur
Bharadwaj Amrutur
26
0
0
19 Jun 2024
ACCSAMS: Automatic Conversion of Exam Documents to Accessible Learning Material for Blind and Visually Impaired
David Wilkening
Omar Moured
Thorsten Schwarz
Karin Muller
Rainer Stiefelhagen
18
0
0
29 May 2024
DLAFormer: An End-to-End Transformer For Document Layout Analysis
Jiawei Wang
Kai Hu
Qiang Huo
3DV
ViT
25
3
0
20 May 2024
Reading Order Independent Metrics for Information Extraction in Handwritten Documents
David Villanova-Aparisi
Solène Tarride
Carlos David Martínez Hinarejos
Verónica Romero
Christopher Kermorvant
Moisés Pastor
16
0
0
29 Apr 2024
OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition
Jianqiang Wan
Sibo Song
Wenwen Yu
Yuliang Liu
Wenqing Cheng
Fei Huang
Xiang Bai
Cong Yao
Zhibo Yang
37
26
0
28 Mar 2024
Towards Human-Like Machine Comprehension: Few-Shot Relational Learning in Visually-Rich Documents
Hao Wang
Tang Li
Chenhui Chu
Nengjun Zhu
Rui-cang Wang
Pinpin Zhu
23
0
0
23 Mar 2024
Detect-Order-Construct: A Tree Construction based Approach for Hierarchical Document Structure Analysis
Jiawei Wang
Kai Hu
Zhuoyao Zhong
Lei-huan Sun
Qiang Huo
25
6
0
22 Jan 2024
DocGraphLM: Documental Graph Language Model for Information Extraction
Dongsheng Wang
Zhiqiang Ma
Armineh Nourbakhsh
Kang Gu
Sameena Shah
26
8
0
05 Jan 2024
WordScape: a Pipeline to extract multilingual, visually rich Documents with Layout Annotations from Web Crawl Data
Maurice Weber
Carlo Siebenschuh
Rory Butler
Anton Alexandrov
Valdemar Thanner
...
Haris Jabbar
Ian T. Foster
Bo-wen Li
Rick L. Stevens
Ce Zhang
13
4
0
15 Dec 2023
LANS: A Layout-Aware Neural Solver for Plane Geometry Problem
Zhong-Zhi Li
Ming-Liang Zhang
Fei Yin
Cheng-Lin Liu
13
11
0
25 Nov 2023
Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs
Yonghui Wang
Wen-gang Zhou
Hao Feng
Keyi Zhou
Houqiang Li
52
18
0
22 Nov 2023
On Task-personalized Multimodal Few-shot Learning for Visually-rich Document Entity Retrieval
Jiayi Chen
H. Dai
Bo Dai
Aidong Zhang
Wei Wei
21
2
0
01 Nov 2023
DocTrack: A Visually-Rich Document Dataset Really Aligned with Human Eye Movement for Machine Reading
Hao Wang
Qingxuan Wang
Yue Li
Changqing Wang
Chenhui Chu
Rui-cang Wang
VGen
18
3
0
23 Oct 2023
Reading Order Matters: Information Extraction from Visually-rich Documents by Token Path Prediction
Chong Zhang
Ya Guo
Yi Tu
Huan Chen
Jinyang Tang
Huijia Zhu
Qi Zhang
Tao Gui
3DV
26
20
0
17 Oct 2023
Analyzing the Efficacy of an LLM-Only Approach for Image-based Document Question Answering
Nidhi Hegde
S. Paul
Gagan Madan
Gaurav Aggarwal
20
8
0
25 Sep 2023
Kosmos-2.5: A Multimodal Literate Model
Tengchao Lv
Yupan Huang
Jingye Chen
Lei Cui
Shuming Ma
...
Weiyao Luo
Shaoxiang Wu
Guoxin Wang
Cha Zhang
Furu Wei
VLM
MLLM
23
63
0
20 Sep 2023
Enhancing Visually-Rich Document Understanding via Layout Structure Modeling
Qiwei Li
Z. Li
Xiantao Cai
Bo Du
Hai Zhao
28
7
0
15 Aug 2023
LayoutMask: Enhance Text-Layout Interaction in Multi-modal Pre-training for Document Understanding
Yi Tu
Ya Guo
Huan Chen
Jinyang Tang
29
15
0
30 May 2023
Text Reading Order in Uncontrolled Conditions by Sparse Graph Segmentation
Renshen Wang
Yasuhisa Fujii
Alessandro Bissacco
GNN
16
6
0
04 May 2023
Wukong-Reader: Multi-modal Pre-training for Fine-grained Visual Document Understanding
Haoli Bai
Zhiguang Liu
Xiaojun Meng
Wentao Li
Shuangning Liu
...
Liangwei Wang
Lu Hou
Jiansheng Wei
Xin Jiang
Qun Liu
ViT
22
11
0
19 Dec 2022
Multimodal Tree Decoder for Table of Contents Extraction in Document Images
Pengfei Hu
Zhenrong Zhang
Jianshu Zhang
Jun Du
Jiajia Wu
23
12
0
06 Dec 2022
ERNIE-Layout: Layout Knowledge Enhanced Pre-training for Visually-rich Document Understanding
Qiming Peng
Yinxu Pan
Wenjin Wang
Bin Luo
Zhenyu Zhang
...
Shi Feng
Yu Sun
Hao Tian
Hua-Hong Wu
Haifeng Wang
8
83
0
12 Oct 2022
XDoc: Unified Pre-training for Cross-Format Document Understanding
Jingye Chen
Tengchao Lv
Lei Cui
Changrong Zhang
Furu Wei
48
13
0
06 Oct 2022
DavarOCR: A Toolbox for OCR and Multi-Modal Document Understanding
Liang Qiao
Hui Jiang
Ying Chen
Can Li
Pengfei Li
...
Dashan Guo
Yi Xu
Yunlu Xu
Zhanzhan Cheng
Yi Niu
16
5
0
14 Jul 2022
Towards Optimizing OCR for Accessibility
Peya Mowar
T. Ganu
Saikat Guha
14
1
0
21 Jun 2022
Relational Representation Learning in Visually-Rich Documents
Xin Li
Yan Zheng
Yiqing Hu
H. Cao
Yunfei Wu
Deqiang Jiang
Yinsong Liu
Bo Ren
16
12
0
05 May 2022
Towards Few-shot Entity Recognition in Document Images: A Label-aware Sequence-to-Sequence Framework
Zilong Wang
Jingbo Shang
16
10
0
30 Mar 2022
XYLayoutLM: Towards Layout-Aware Multimodal Networks For Visually-Rich Document Understanding
Zhangxuan Gu
Changhua Meng
Ke Wang
Jun Lan
Weiqiang Wang
Ming Gu
Liqing Zhang
20
76
0
14 Mar 2022
Document AI: Benchmarks, Models and Applications
Lei Cui
Yiheng Xu
Tengchao Lv
Furu Wei
VLM
13
69
0
16 Nov 2021
BROS: A Pre-trained Language Model Focusing on Text and Layout for Better Key Information Extraction from Documents
Teakgyu Hong
Donghyun Kim
Mingi Ji
Wonseok Hwang
Daehyun Nam
Sungrae Park
VLM
23
149
0
10 Aug 2021