ICDAR2019 Competition on Scanned Receipt OCR and Information Extraction

IEEE International Conference on Document Analysis and Recognition (ICDAR), 2019

18 March 2021

Papers citing "ICDAR2019 Competition on Scanned Receipt OCR and Information Extraction"

50 / 219 papers shown

RealKIE: Five Novel Datasets for Enterprise Key Information Extraction

291

29 Mar 2024

OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition

Yuliang Liu

Fei Huang

291

28 Mar 2024

Can AI Models Appreciate Document Aesthetics? An Exploration of Legibility and Layout Quality in Relation to Prediction Confidence

266

27 Mar 2024

Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning

Han Xiao

370

218

25 Mar 2024

Visually Guided Generative Text-Layout Pre-training for Document Intelligence

Xin Jiang

Qun Liu

Kam-Fai Wong

230

25 Mar 2024

From Pixels to Insights: A Survey on Automatic Chart Understanding in the Era of Large Foundation Models

IEEE Transactions on Knowledge and Data Engineering (TKDE), 2024

480

18 Mar 2024

The future of document indexing: GPT and Donut revolutionize table of content processing

312

12 Mar 2024

Transformers and Language Models in Form Understanding: A Comprehensive Review of Scanned Document Analysis

191

06 Mar 2024

Hierarchical Multimodal Pre-training for Visually Rich Webpage Understanding

Hongshen Xu

Kai Yu

156

28 Feb 2024

LAPDoc: Layout-Aware Prompting for Documents

314

15 Feb 2024

Lumos : Empowering Multimodal LLMs with Scene Text Recognition

...

Anuj Kumar

226

12 Feb 2024

TreeForm: End-to-end Annotation and Evaluation for Form Document Parsing

206

07 Feb 2024

PaDeLLM-NER: Parallel Decoding in Large Language Models for Named Entity Recognition

467

07 Feb 2024

ANLS* -- A Universal Document Processing Metric for Generative Large Language Models

313

06 Feb 2024

LongFin: A Multimodal Document Understanding Model for Long Financial Domain Documents

Ahmed Masry

Amir Hajian

145

26 Jan 2024

InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions

AAAI Conference on Artificial Intelligence (AAAI), 2024

263

24 Jan 2024

UniVIE: A Unified Label Space Approach to Visual Information Extraction from Form-like Documents

IEEE International Conference on Document Analysis and Recognition (ICDAR), 2024

220

17 Jan 2024

PEneo: Unifying Line Extraction, Line Grouping, and Entity Linking for End-to-end Document Pair Extraction

Lianwen Jin

198

07 Jan 2024

DocLLM: A layout-aware generative language model for multimodal document understanding

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

283

112

31 Dec 2023

Advancements and Challenges in Arabic Optical Character Recognition: A Comprehensive Survey

M. Kasem

H. Kang

289

19 Dec 2023

Toward Real Text Manipulation Detection: New Dataset and New Solution

Yuliang Liu

213

12 Dec 2023

EIGEN: Expert-Informed Joint Learning Aggregation for High-Fidelity Information Extraction from Document Images

A. Singh

Venkatapathy Subramanian

Ayush Maheshwari

Pradeep Narayan

D. P. Shetty

Ganesh Ramakrishnan

130

23 Nov 2023

Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs

Hao Feng

302

22 Nov 2023

FATURA: A Multi-Layout Invoice Image Dataset for Document Analysis and Understanding

Mahmoud Limam

M. Dhiaf

Yousri Kessentini

178

20 Nov 2023

On Task-personalized Multimodal Few-shot Learning for Visually-rich Document Entity Retrieval

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

295

01 Nov 2023

Enhancing Document Information Analysis with Multi-Task Pre-training: A Robust Approach for Information Extraction in Visually-Rich Documents

IEEE International Joint Conference on Neural Network (IJCNN), 2023

Tofik Ali

Partha Pratim Roy

207

25 Oct 2023

GenKIE: Robust Generative Multimodal Document Key Information Extraction

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

174

24 Oct 2023

Vision-Enhanced Semantic Entity Recognition in Document Images via Visually-Asymmetric Consistency Learning

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

204

23 Oct 2023

Reading Order Matters: Information Extraction from Visually-rich Documents by Token Path Prediction

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

208

17 Oct 2023

PrIeD-KIE: Towards Privacy Preserved Document Key Information Extraction

184

05 Oct 2023

ReForm-Eval: Evaluating Large Vision Language Models via Unified Re-Formulation of Task-Oriented Benchmarks

ACM Multimedia (ACM MM), 2023

...

Xuanjing Huang

314

04 Oct 2023

Kosmos-2.5: A Multimodal Literate Model

...

267

20 Sep 2023

AMuRD: Annotated Arabic-English Receipt Dataset for Key Information Extraction and Classification

150

18 Sep 2023

Long-Range Transformer Architectures for Document Understanding

188

11 Sep 2023

Improving Information Extraction on Business Documents with Specific Pre-Training Tasks

International Workshop on Document Analysis Systems (DAS), 2023

158

11 Sep 2023

ImageBind-LLM: Multi-modality Instruction Tuning

...

Yu Qiao

294

156

07 Sep 2023

Attention Where It Matters: Rethinking Visual Document Understanding with Selective Region Concentration

IEEE International Conference on Computer Vision (ICCV), 2023

202

03 Sep 2023

DTrOCR: Decoder-only Transformer for Optical Character Recognition

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023

Masato Fujitake

442

30 Aug 2023

Universal Graph Continual Learning

243

27 Aug 2023

Beyond Document Page Classification: Design, Datasets, and Challenges

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023

Jordy Van Landeghem

Sanket Biswas

Matthew B. Blaschko

Marie-Francine Moens

225

24 Aug 2023

BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions

AAAI Conference on Artificial Intelligence (AAAI), 2023

353

190

19 Aug 2023

Tiny LVLM-eHub: Early Multimodal Experiments with Bard

IEEE Transactions on Big Data (IEEE Trans. Big Data), 2023

...

Ping Luo

212

07 Aug 2023

Workshop on Document Intelligence Understanding

International Conference on Information and Knowledge Management (CIKM), 2023

121

31 Jul 2023

MataDoc: Margin and Text Aware Document Dewarping for Arbitrary Boundary

265

24 Jul 2023

Line Graphics Digitization: A Step Towards Full Automation

IEEE International Conference on Document Analysis and Recognition (ICDAR), 2023

121

05 Jul 2023

Estimating Post-OCR Denoising Complexity on Numerical Texts

Asian Conference on Intelligent Information and Database Systems (ACIIDS), 2023

127

03 Jul 2023

Document Image Cleaning using Budget-Aware Black-Box Approximation

127

22 Jun 2023

LVLM-eHub: A Comprehensive Evaluation Benchmark for Large Vision-Language Models

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023

Yu Qiao

Ping Luo

312

232

15 Jun 2023

DocumentNet: Bridging the Data Gap in Document Pre-Training

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Alexander G. Hauptmann

H. Dai

Wei Wei

123

15 Jun 2023

Looking and Listening: Audio Guided Text Recognition

Yuliang Liu

160

06 Jun 2023