Unified Pretraining Framework for Document Understanding

22 April 2022

Jiuxiang Gu

Papers citing "Unified Pretraining Framework for Document Understanding"

Title
OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations Linke Ouyang Yuan Qu Hongbin Zhou Jiawei Zhu Rui Zhang ... Chao Xu Bo Zhang Botian Shi Zhongying Tu Conghui He 99 5 0 10 Dec 2024
DocMamba: Efficient Document Pre-training with State Space Model Pengfei Hu Zhenrong Zhang Jiefeng Ma Shuhang Liu Jun Du Jianshu Zhang Mamba 37 1 0 18 Sep 2024
ProcTag: Process Tagging for Assessing the Efficacy of Document Instruction Data Yufan Shen Chuwei Luo Zhaoqing Zhu Yang Chen Qi Zheng Zhi Yu Jiajun Bu Cong Yao 40 2 0 17 Jul 2024
DistilDoc: Knowledge Distillation for Visually-Rich Document Applications Jordy Van Landeghem Subhajit Maity Ayan Banerjee Matthew Blaschko Marie-Francine Moens Josep Lladós Sanket Biswas 41 2 0 12 Jun 2024
A Hybrid Approach for Document Layout Analysis in Document images Tahira Shehzadi Didier Stricker Muhammad Zeshan Afzal 29 5 0 27 Apr 2024
PDF-MVQA: A Dataset for Multimodal Information Retrieval in PDF-based Visual Question Answering Yihao Ding Kaixuan Ren Jiabin Huang Siwen Luo S. Han 35 1 0 19 Apr 2024
Noise-Aware Training of Layout-Aware Language Models Ritesh Sarkhel Xiaoqi Ren Lauro Beltrao Costa Guolong Su Vincent Perot Yanan Xie Emmanouil Koukoumidis Arnab Nandi VLM 42 0 0 30 Mar 2024
DOCMASTER: A Unified Platform for Annotation, Training, & Inference in Document Question-Answering Alex Nguyen Zilong Wang Jingbo Shang Dheeraj Mekala 33 1 0 30 Mar 2024
TreeForm: End-to-end Annotation and Evaluation for Form Document Parsing Ran Zmigrod Zhiqiang Ma Armineh Nourbakhsh Sameena Shah 24 4 0 07 Feb 2024
SCOB: Universal Text Understanding via Character-wise Supervised Contrastive Learning with Online Text Rendering for Bridging Domain Gap Daehee Kim Yoon Kim Donghyun Kim Yumin Lim Geewook Kim Taeho Kil 23 3 0 21 Sep 2023
A Graphical Approach to Document Layout Analysis Jilin Wang Michael Krumdick Baojia Tong Hamima Halim M. Sokolov Vadym Barda Delphine Vendryes Christy Tanner 21 8 0 03 Aug 2023
On Evaluation of Document Classification using RVL-CDIP Stefan Larson Gordon Lim Kevin Leach 26 3 0 21 Jun 2023
Towards Zero-shot Relation Extraction in Web Mining: A Multimodal Approach with Relative XML Path Zilong Wang Jingbo Shang 36 0 0 23 May 2023
Modeling Entities as Semantic Points for Visual Information Extraction in the Wild Zhibo Yang Rujiao Long Pengfei Wang Sibo Song Humen Zhong Wenqing Cheng X. Bai Cong Yao 29 19 0 23 Mar 2023
Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models Lei Wang Jian He Xingdong Xu Ning Liu Hui-juan Liu 31 2 0 27 Nov 2022
Evaluating Out-of-Distribution Performance on Document Image Classifiers Stefan Larson Gordon Lim Yutong Ai David Kuang Kevin Leach OODD OOD 31 18 0 14 Oct 2022
XDoc: Unified Pre-training for Cross-Format Document Understanding Jingye Chen Tengchao Lv Lei Cui Changrong Zhang Furu Wei 48 13 0 06 Oct 2022
Knowing Where and What: Unified Word Block Pretraining for Document Understanding Song Tao Zijian Wang Tiantian Fan Canjie Luo Can Huang SSL 27 2 0 28 Jul 2022
Test-Time Adaptation for Visual Document Understanding Sayna Ebrahimi Sercan Ö. Arik Tomas Pfister OOD 31 6 0 15 Jun 2022
Relational Representation Learning in Visually-Rich Documents Xin Li Yan Zheng Yiqing Hu H. Cao Yunfei Wu Deqiang Jiang Yinsong Liu Bo Ren 18 12 0 05 May 2022
LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding Yang Xu Yiheng Xu Tengchao Lv Lei Cui Furu Wei ... D. Florêncio Cha Zhang Wanxiang Che Min Zhang Lidong Zhou ViT MLLM 145 498 0 29 Dec 2020
FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents Guillaume Jaume H. K. Ekenel Jean-Philippe Thiran 128 355 0 27 May 2019