Vision Grid Transformer for Document Layout Analysis

29 August 2023

Papers citing "Vision Grid Transformer for Document Layout Analysis"

Title
Reading the unreadable: Creating a dataset of 19th century English newspapers using image-to-text language models | Jonathan Bourne | 75 0 0 | 24 Feb 2025
ProcTag: Process Tagging for Assessing the Efficacy of Document Instruction Data | Yufan Shen, Chuwei Luo, Zhaoqing Zhu, Yang Chen, Qi Zheng, Zhi Yu, Jiajun Bu, Cong Yao | 36 2 0 | 17 Jul 2024
DistilDoc: Knowledge Distillation for Visually-Rich Document Applications | Jordy Van Landeghem, Subhajit Maity, Ayan Banerjee, Matthew Blaschko, Marie-Francine Moens, Josep Lladós, Sanket Biswas | 41 2 0 | 12 Jun 2024
Object Recognition from Scientific Document based on Compartment Refinement Framework | Jinghong Li, Wen Gu, Koichi Ota, Shinobu Hasegawa | 23 2 0 | 14 Dec 2023
Levenshtein OCR | Cheng Da, P. Wang, Cong Yao [ViT] | 71 32 0 | 08 Sep 2022
Multi-Granularity Prediction for Scene Text Recognition | P. Wang, Cheng Da, Cong Yao | 66 48 0 | 08 Sep 2022
LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding | Yang Xu, Yiheng Xu, Tengchao Lv, Lei Cui, Furu Wei, ..., D. Florêncio, Cha Zhang, Wanxiang Che, Min Zhang, Lidong Zhou [ViT, MLLM] | 145 498 0 | 29 Dec 2020
Feature Pyramid Networks for Object Detection | Tsung-Yi Lin, Piotr Dollár, Ross B. Girshick, Kaiming He, Bharath Hariharan, Serge J. Belongie [ObjD] | 166 21,643 0 | 09 Dec 2016
Aggregated Residual Transformations for Deep Neural Networks | Saining Xie, Ross B. Girshick, Piotr Dollár, Z. Tu, Kaiming He | 261 10,196 0 | 16 Nov 2016