A Survey of Deep Learning Approaches for OCR and Document Understanding

27 November 2020

Papers citing "A Survey of Deep Learning Approaches for OCR and Document Understanding"

24 / 24 papers shown

Robustness of Structured Data Extraction from Perspectively Distorted Documents

Hyakka Nakada

Yoshiyasu Tanaka

18 Nov 2025

Scaling Beyond Context: A Survey of Multimodal Retrieval-Augmented Generation for Document Understanding

17 Oct 2025

DocReward: A Document Reward Model for Structuring and Stylizing

13 Oct 2025

Multi-Modal Vision vs. Text-Based Parsing: Benchmarking LLM Strategies for Invoice Processing

29 Aug 2025

Finding Needles in Images: Can Multimodal LLMs Locate Fine Details?

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

Chaitanya Devaguptapu

07 Aug 2025

OCRGenBench: A Comprehensive Benchmark for Evaluating OCR Generative Capabilities

20 Jul 2025

Towards Visual Text Grounding of Multimodal Large Language Model

07 Apr 2025

Multimodal Large Language Models for Text-rich Image Understanding: A Comprehensive Review

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

23 Feb 2025

VORTEX: A Spatial Computing Framework for Optimized Drone Telemetry Extraction from First-Person View Flight Data

James E. Gallagher

E. Oughton

24 Dec 2024

PerSRV: Personalized Sticker Retrieval with Vision-Language Model

The Web Conference (WWW), 2024

29 Oct 2024

Towards an Improved Metric for Evaluating Disentangled Representations

Sahib Julka

Yashu Wang

Michael Granitzer

04 Oct 2024

Leveraging Distillation Techniques for Document Understanding: A Case Study with FLAN-T5

Jahrestagung der Gesellschaft für Informatik (GI Jahrestagung), 2024

Marcel Lamott

Muhammad Armaghan Shakir

17 Sep 2024

Image-to-LaTeX Converter for Mathematical Formulas and Text

Daniil Gurgurov

Aleksey Morshnev

07 Aug 2024

Deep Learning based Visually Rich Document Content Understanding: A Survey

02 Aug 2024

MixTex: Unambiguous Recognition Should Not Rely Solely on Real Data

Renqing Luo

Yuhan Xu

24 Jun 2024

Reconstructing training data from document understanding models

05 Jun 2024

Transformers and Language Models in Form Understanding: A Comprehensive Review of Scanned Document Analysis

06 Mar 2024

Handwritten and Printed Text Segmentation: A Signature Case Study

IEEE International Conference on Computer Vision (ICCV), 2023

Sina Gholamian

Ali Vahdat

15 Jul 2023

TransDocAnalyser: A Framework for Offline Semi-structured Handwritten Document Analysis in the Legal Domain

Sagar Chakraborty

Gaurav Harit

Saptarshi Ghosh

03 Jun 2023

Literature Review: Computer Vision Applications in Transportation Logistics and Warehousing

12 Apr 2023

Cleansing Jewel: A Neural Spelling Correction Model Built On Google OCR-ed Tibetan Manuscripts

Queenie Luo

Yung-Sung Chuang

07 Apr 2023

Towards Complex Document Understanding By Discrete Reasoning

ACM Multimedia (ACM MM), 2022

25 Jul 2022

Detection Masking for Improved OCR on Noisy Documents

17 May 2022

DeepCPCFG: Deep Learning and Context Free Grammars for End-to-End Information Extraction

IEEE International Conference on Document Analysis and Recognition (ICDAR), 2021

Freddy Chongtat Chua

Nigel P. Duffy

10 Mar 2021