arXiv:2103.10213
ICDAR2019 Competition on Scanned Receipt OCR and Information Extraction
IEEE International Conference on Document Analysis and Recognition (ICDAR), 2019
18 March 2021
Zheng Huang
Kai Chen
Jianhua He
X. Bai
Dimosthenis Karatzas
Shijian Lu
C. V. Jawahar
Papers citing
"ICDAR2019 Competition on Scanned Receipt OCR and Information Extraction"
50 / 219 papers shown
RealKIE: Five Novel Datasets for Enterprise Key Information Extraction
Benjamin Townsend
Madison May
Katherine Mackowiak
Christopher Wells
SyDa
291
1
0
29 Mar 2024
OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition
Jianqiang Wan
Sibo Song
Wenwen Yu
Yuliang Liu
Wenqing Cheng
Fei Huang
Xiang Bai
Cong Yao
Zhibo Yang
291
74
0
28 Mar 2024
Can AI Models Appreciate Document Aesthetics? An Exploration of Legibility and Layout Quality in Relation to Prediction Confidence
Hsiu-Wei Yang
Abhinav Agrawal
Pavlos Fragkogiannis
Shubham Nitin Mulay
266
3
0
27 Mar 2024
Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning
Hao Shao
Shengju Qian
Han Xiao
Guanglu Song
Zhuofan Zong
Letian Wang
Yu Liu
Jiaming Song
VGen
LRM
MLLM
370
218
0
25 Mar 2024
Visually Guided Generative Text-Layout Pre-training for Document Intelligence
Zhiming Mao
Haoli Bai
Lu Hou
Jiansheng Wei
Xin Jiang
Qun Liu
Kam-Fai Wong
230
13
0
25 Mar 2024
From Pixels to Insights: A Survey on Automatic Chart Understanding in the Era of Large Foundation Models
IEEE Transactions on Knowledge and Data Engineering (TKDE), 2024
Kung-Hsiang Huang
Hou Pong Chan
Yi R. Fung
Haoyi Qiu
Mingyang Zhou
Shafiq Joty
Shih-Fu Chang
Chenhui Xu
AI4TS
480
57
0
18 Mar 2024
The future of document indexing: GPT and Donut revolutionize table of content processing
Degaga Wolde Feyisa
Haylemicheal Berihun
Amanuel Zewdu
Mahsa Najimoghadam
Marzieh Zare
312
3
0
12 Mar 2024
Transformers and Language Models in Form Understanding: A Comprehensive Review of Scanned Document Analysis
Abdelrahman Abdallah
Daniel Eberharter
Zoe Pfister
Adam Jatowt
191
16
0
06 Mar 2024
Hierarchical Multimodal Pre-training for Visually Rich Webpage Understanding
Hongshen Xu
Lu Chen
Zihan Zhao
Da Ma
Ruisheng Cao
Zichen Zhu
Kai Yu
156
5
0
28 Feb 2024
LAPDoc: Layout-Aware Prompting for Documents
Marcel Lamott
Yves-Noel Weweler
A. Ulges
Faisal Shafait
Dirk Krechel
Darko Obradovic
314
18
0
15 Feb 2024
Lumos: Empowering Multimodal LLMs with Scene Text Recognition
Ashish Shenoy
Yichao Lu
Srihari Jayakumar
Debojeet Chatterjee
Mohsen Moslehpour
...
Shicong Zhao
Longfang Zhao
Ankit Ramchandani
Xin Luna Dong
Anuj Kumar
MLLM
226
6
0
12 Feb 2024
TreeForm: End-to-end Annotation and Evaluation for Form Document Parsing
Ran Zmigrod
Zhiqiang Ma
Armineh Nourbakhsh
Sameena Shah
206
5
0
07 Feb 2024
PaDeLLM-NER: Parallel Decoding in Large Language Models for Named Entity Recognition
Jinghui Lu
Ziwei Yang
Yanjie Wang
Xuejing Liu
Brian Mac Namee
Can Huang
MoE
467
14
0
07 Feb 2024
ANLS* -- A Universal Document Processing Metric for Generative Large Language Models
David Peer
Philemon Schöpf
V. Nebendahl
A. Rietzler
Sebastian Stabinger
313
8
0
06 Feb 2024
LongFin: A Multimodal Document Understanding Model for Long Financial Domain Documents
Ahmed Masry
Amir Hajian
145
6
0
26 Jan 2024
InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions
AAAI Conference on Artificial Intelligence (AAAI), 2024
Ryota Tanaka
Taichi Iki
Kyosuke Nishida
Kuniko Saito
Jun Suzuki
VLM
263
36
0
24 Jan 2024
UniVIE: A Unified Label Space Approach to Visual Information Extraction from Form-like Documents
IEEE International Conference on Document Analysis and Recognition (ICDAR), 2024
Kai Hu
Jiawei Wang
Weihong Lin
Zhuoyao Zhong
Lei-huan Sun
Qiang Huo
220
1
0
17 Jan 2024
PEneo: Unifying Line Extraction, Line Grouping, and Entity Linking for End-to-end Document Pair Extraction
Zening Lin
Jiapeng Wang
Teng Li
Wenhui Liao
Dayi Huang
Longfei Xiong
Lianwen Jin
198
3
0
07 Jan 2024
DocLLM: A layout-aware generative language model for multimodal document understanding
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Dongsheng Wang
Natraj Raman
Mathieu Sibue
Zhiqiang Ma
Petr Babkin
Simerjot Kaur
Yulong Pei
Armineh Nourbakhsh
Xiaomo Liu
VLM
283
112
0
31 Dec 2023
Advancements and Challenges in Arabic Optical Character Recognition: A Comprehensive Survey
M. Kasem
M. Kasem
H. Kang
289
13
0
19 Dec 2023
Toward Real Text Manipulation Detection: New Dataset and New Solution
Dongliang Luo
Yuliang Liu
Rui Yang
Xianjin Liu
Jishen Zeng
Yu Zhou
Xiang Bai
213
11
0
12 Dec 2023
EIGEN: Expert-Informed Joint Learning Aggregation for High-Fidelity Information Extraction from Document Images
A. Singh
Venkatapathy Subramanian
Ayush Maheshwari
Pradeep Narayan
D. P. Shetty
Ganesh Ramakrishnan
130
3
0
23 Nov 2023
Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs
Yonghui Wang
Wen-gang Zhou
Hao Feng
Keyi Zhou
Houqiang Li
302
25
0
22 Nov 2023
FATURA: A Multi-Layout Invoice Image Dataset for Document Analysis and Understanding
Mahmoud Limam
M. Dhiaf
Yousri Kessentini
178
3
0
20 Nov 2023
On Task-personalized Multimodal Few-shot Learning for Visually-rich Document Entity Retrieval
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Jiayi Chen
H. Dai
Bo Dai
Aidong Zhang
Wei Wei
295
3
0
01 Nov 2023
Enhancing Document Information Analysis with Multi-Task Pre-training: A Robust Approach for Information Extraction in Visually-Rich Documents
IEEE International Joint Conference on Neural Network (IJCNN), 2023
Tofik Ali
Partha Pratim Roy
207
0
0
25 Oct 2023
GenKIE: Robust Generative Multimodal Document Key Information Extraction
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Panfeng Cao
Ye Wang
Qiang Zhang
Zaiqiao Meng
SyDa
174
9
0
24 Oct 2023
Vision-Enhanced Semantic Entity Recognition in Document Images via Visually-Asymmetric Consistency Learning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Hao Wang
Xiahua Chen
Rui Wang
Chenhui Chu
204
1
0
23 Oct 2023
Reading Order Matters: Information Extraction from Visually-rich Documents by Token Path Prediction
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Chong Zhang
Ya Guo
Yi Tu
Huan Chen
Jinyang Tang
Huijia Zhu
Qi Zhang
Tao Gui
3DV
208
31
0
17 Oct 2023
PrIeD-KIE: Towards Privacy Preserved Document Key Information Extraction
S. Saifullah
S. Agne
Andreas Dengel
Sheraz Ahmed
184
1
0
05 Oct 2023
ReForm-Eval: Evaluating Large Vision Language Models via Unified Re-Formulation of Task-Oriented Benchmarks
ACM Multimedia (ACM MM), 2023
Zejun Li
Ye Wang
Mengfei Du
Qingwen Liu
Binhao Wu
...
Zhihao Fan
Jie Fu
Jingjing Chen
Xuanjing Huang
Zhongyu Wei
314
16
0
04 Oct 2023
Kosmos-2.5: A Multimodal Literate Model
Tengchao Lv
Yupan Huang
Jingye Chen
Lei Cui
Shuming Ma
...
Weiyao Luo
Shaoxiang Wu
Guoxin Wang
Cha Zhang
Furu Wei
VLM
MLLM
267
91
0
20 Sep 2023
AMuRD: Annotated Arabic-English Receipt Dataset for Key Information Extraction and Classification
Abdelrahman Abdallah
Mahmoud Abdalla
Ibrahim Abdelhalim
Mohamed Elkasaby
Adam Jatowt
150
1
0
18 Sep 2023
Long-Range Transformer Architectures for Document Understanding
Thibault Douzon
S. Duffner
Christophe Garcia
Jérémy Espinas
VLM
188
3
0
11 Sep 2023
Improving Information Extraction on Business Documents with Specific Pre-Training Tasks
International Workshop on Document Analysis Systems (DAS), 2023
Thibault Douzon
S. Duffner
Christophe Garcia
Jérémy Espinas
158
9
0
11 Sep 2023
ImageBind-LLM: Multi-modality Instruction Tuning
Jiaming Han
Renrui Zhang
Wenqi Shao
Shiyang Feng
Peng Xu
...
Yafei Wen
Xiaoxin Chen
Xiangyu Yue
Jiaming Song
Yu Qiao
MLLM
294
156
0
07 Sep 2023
Attention Where It Matters: Rethinking Visual Document Understanding with Selective Region Concentration
IEEE International Conference on Computer Vision (ICCV), 2023
H. Cao
Changcun Bao
Chaohu Liu
Huang-wei Chen
Kun Yin
Hao Liu
Yinsong Liu
Deqiang Jiang
Xing Sun
202
17
0
03 Sep 2023
DTrOCR: Decoder-only Transformer for Optical Character Recognition
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Masato Fujitake
442
58
0
30 Aug 2023
Universal Graph Continual Learning
Thanh Duc Hoang
Do Viet Tung
Duy-Hung Nguyen
Bao-Sinh Nguyen
Huy Hoang Nguyen
Hung Le
CLL
243
8
0
27 Aug 2023
Beyond Document Page Classification: Design, Datasets, and Challenges
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Jordy Van Landeghem
Sanket Biswas
Matthew B. Blaschko
Marie-Francine Moens
225
9
0
24 Aug 2023
BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions
AAAI Conference on Artificial Intelligence (AAAI), 2023
Wenbo Hu
Y. Xu
Jian Wang
W. Li
Zhe Chen
Zhuowen Tu
MLLM
VLM
353
190
0
19 Aug 2023
Tiny LVLM-eHub: Early Multimodal Experiments with Bard
IEEE Transactions on Big Data (IEEE Trans. Big Data), 2023
Wenqi Shao
Yutao Hu
Shiyang Feng
Meng Lei
Kaipeng Zhang
...
Peng Xu
Siyuan Huang
Jiaming Song
Yuning Qiao
Ping Luo
VLM
MLLM
212
24
0
07 Aug 2023
Workshop on Document Intelligence Understanding
International Conference on Information and Knowledge Management (CIKM), 2023
S. Han
Yihao Ding
Siwen Luo
J. Poon
HeeGuen Yoon
Zhe Huang
P. Duuring
E. Holden
121
1
0
31 Jul 2023
MataDoc: Margin and Text Aware Document Dewarping for Arbitrary Boundary
Beiya Dai
Xingbiao Li
Qunyi Xie
Yulin Li
Xiameng Qin
Chengquan Zhang
Kun Yao
Junyu Han
265
7
0
24 Jul 2023
Line Graphics Digitization: A Step Towards Full Automation
IEEE International Conference on Document Analysis and Recognition (ICDAR), 2023
Omar Moured
Kailai Li
Alina Roitberg
Thorsten Schwarz
Rainer Stiefelhagen
121
6
0
05 Jul 2023
Estimating Post-OCR Denoising Complexity on Numerical Texts
Asian Conference on Intelligent Information and Database Systems (ACIIDS), 2023
Arthur Hemmer
Jérôme Brachat
Mickael Coustaty
J. Ogier
127
3
0
03 Jul 2023
Document Image Cleaning using Budget-Aware Black-Box Approximation
Ganesh Tata
Katyani Singh
E. V. Oeveren
Nilanjan Ray
AAML
127
0
0
22 Jun 2023
LVLM-eHub: A Comprehensive Evaluation Benchmark for Large Vision-Language Models
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Peng Xu
Wenqi Shao
Kaipeng Zhang
Shiyang Feng
Shuo Liu
Meng Lei
Fanqing Meng
Siyuan Huang
Yu Qiao
Ping Luo
ELM
MLLM
312
232
0
15 Jun 2023
DocumentNet: Bridging the Data Gap in Document Pre-Training
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Lijun Yu
Jin Miao
Xiaoyu Sun
Jiayi Chen
Alexander G. Hauptmann
H. Dai
Wei Wei
123
4
0
15 Jun 2023
Looking and Listening: Audio Guided Text Recognition
Wenwen Yu
Mingyu Liu
Biao Yang
Enming Zhang
Deqiang Jiang
Xing Sun
Yuliang Liu
Xiang Bai
DiffM
160
1
0
06 Jun 2023