LayoutLM: Pre-training of Text and Layout for Document Image Understanding

31 December 2019

Papers citing "LayoutLM: Pre-training of Text and Layout for Document Image Understanding"

50 / 371 papers shown

Title
Sim2Real Docs: Domain Randomization for Documents in Natural Scenes using Ray-traced Rendering Nikhil Maddikunta Huijun Zhao Sumit Keswani Alfy Samuel Fu-Ming Guo Nishan Srishankar Vishwa Pardeshi Austin Huang VGen 26 1 0 16 Dec 2021
Value Retrieval with Arbitrary Queries for Form-like Documents M. Gao Le Xue Chetan Ramaiah Chen Xing Ran Xu Caiming Xiong 21 6 0 15 Dec 2021
Text Classification Models for Form Entity Linking M. Villota C. Domínguez Jónathan Heras Eloy J. Mata Vico Pascual MedIm 26 2 0 14 Dec 2021
OCR-free Document Understanding Transformer Geewook Kim Teakgyu Hong Moonbin Yim Jeongyeon Nam Jinyoung Park Jinyeong Yim Wonseok Hwang Sangdoo Yun Dongyoon Han Seunghyun Park ViT 65 264 0 30 Nov 2021
Neural Collaborative Graph Machines for Table Structure Recognition Hao Liu Xin Li Bin Liu Deqiang Jiang Yinsong Liu Bo Ren LMTD 24 31 0 26 Nov 2021
Document AI: Benchmarks, Models and Applications Lei Cui Yiheng Xu Tengchao Lv Furu Wei VLM 24 70 0 16 Nov 2021
Synthetic Document Generator for Annotation-free Layout Recognition Natraj Raman Sameena Shah Manuela Veloso 42 10 0 11 Nov 2021
ICDAR 2021 Competition on Document Visual Question Answering Rubèn Pérez Tito Minesh Mathew C. V. Jawahar Ernest Valveny Dimosthenis Karatzas 40 23 0 10 Nov 2021
Information Extraction from Visually Rich Documents with Font Style Embeddings Ismail Oussaid William Vanhuffel Pirashanth Ratnamogan Mhamed Hajaiej Alexis Mathey Thomas Gilles 19 1 0 07 Nov 2021
Entity Relation Extraction as Dependency Parsing in Visually Rich Documents Yue Zhang Bo Zhang Rui Wang Junjie Cao Chen Li Zuyi Bao 40 32 0 19 Oct 2021
MarkupLM: Pre-training of Text and Markup Language for Visually-rich Document Understanding Junlong Li Yiheng Xu Lei Cui Furu Wei VLM 3DGS 33 59 0 16 Oct 2021
Robustness Evaluation of Transformer-based Form Field Extractors via Form Attacks Le Xue M. Gao Zeyuan Chen Caiming Xiong Ran Xu 13 3 0 08 Oct 2021
Asking questions on handwritten document collections Minesh Mathew Lluís Gómez Dimosthenis Karatzas C. V. Jawahar RALM 33 11 0 02 Oct 2021
OPAD: An Optimized Policy-based Active Learning Framework for Document Content Analysis Sumit Shekhar Bhanu Prakash Reddy Guda Ashutosh Chaubey Ishan Jindal Avanish Jain 33 0 0 01 Oct 2021
One-shot Key Information Extraction from Document with Deep Partial Graph Matching Minghong Yao Zhiguang Liu Liangwei Wang Houqiang Li Liansheng Zhuang 24 4 0 26 Sep 2021
Grounding Natural Language Instructions: Can Large Language Models Capture Spatial Information? Julia Rozanova Deborah Ferreira K. Dubba Weiwei Cheng Dell Zhang André Freitas LM&Ro 40 8 0 17 Sep 2021
Including Keyword Position in Image-based Models for Act Segmentation of Historical Registers Mélodie Boillet Martin Maarand Thierry Paquet Christopher Kermorvant 16 3 0 17 Sep 2021
PermuteFormer: Efficient Relative Position Encoding for Long Sequences Peng-Jen Chen 36 21 0 06 Sep 2021
Skim-Attention: Learning to Focus via Document Layout Laura Nguyen Thomas Scialom Jacopo Staiano Benjamin Piwowarski 27 9 0 02 Sep 2021
Position Masking for Improved Layout-Aware Document Understanding Anik Saha Catherine Finegan-Dollak Ashish Verma 24 2 0 01 Sep 2021
LayoutReader: Pre-training of Text and Layout for Reading Order Detection Zilong Wang Yiheng Xu Lei Cui Jingbo Shang Furu Wei 39 75 0 26 Aug 2021
Using Neighborhood Context to Improve Information Extraction from Visual Documents Captured on Mobile Phones Kalpa Gunaratna Vijay Srinivasan Sandeep Nama Hongxia Jin 27 5 0 23 Aug 2021
BROS: A Pre-trained Language Model Focusing on Text and Layout for Better Key Information Extraction from Documents Teakgyu Hong Donghyun Kim Mingi Ji Wonseok Hwang Daehyun Nam Sungrae Park VLM 34 152 0 10 Aug 2021
StrucTexT: Structured Text Understanding with Multi-Modal Transformers Yulin Li Yuxi Qian Yuchen Yu Xiameng Qin Chengquan Zhang Yan Liu Kun Yao Junyu Han Jingtuo Liu Errui Ding 29 114 0 06 Aug 2021
Human-In-The-Loop Document Layout Analysis Xingjiao Wu Tianlong Ma Xin Li Qin Chen Liangbo He 32 2 0 04 Aug 2021
Form2Seq: A Framework for Higher-Order Form Structure Extraction Milan Aggarwal Hiresh Gupta Mausoom Sarkar Balaji Krishnamurthy 3DV 9 23 0 09 Jul 2021
Efficient Document Image Classification Using Region-Based Graph Neural Network J. Mandivarapu Eric Bunch Qian You G. Fung VLM 11 7 0 25 Jun 2021
MatchVIE: Exploiting Match Relevancy between Entities for Visual Information Extraction Guozhi Tang Lele Xie Lianwen Jin Jiapeng Wang Jingdong Chen Zhen Xu Qianying Wang Yaqiang Wu Hui Li 20 29 0 24 Jun 2021
DocFormer: End-to-End Transformer for Document Understanding Srikar Appalaraju Bhavan A. Jasani Bhargava Urala Kota Yusheng Xie R. Manmatha ViT 41 273 0 22 Jun 2021
Tag, Copy or Predict: A Unified Weakly-Supervised Learning Framework for Visual Information Extraction using Sequences Jiapeng Wang Tianwei Wang Guozhi Tang Lianwen Jin Weihong Ma Kai Ding Yichao Huang 28 12 0 20 Jun 2021
SelfDoc: Self-Supervised Document Representation Learning Peizhao Li Jiuxiang Gu Jason Kuen Vlad I. Morariu Handong Zhao R. Jain Varun Manjunatha Hongfu Liu ViT SSL 28 160 0 07 Jun 2021
End-to-End Hierarchical Relation Extraction for Generic Form Understanding Tuan-Anh Dang Nguyen Duc Thanh Hoang Q. Tran Chih-Wei Pan T. Nguyen 19 10 0 02 Jun 2021
A Span Extraction Approach for Information Extraction on Visually-Rich Documents Tuan-Anh Dang Nguyen Hieu M. Vu Nguyen Hong Son Minh-Tien Nguyen 24 6 0 02 Jun 2021
VILA: Improving Structured Content Extraction from Scientific PDFs Using Visual Layout Groups Zejiang Shen Kyle Lo Lucy Lu Wang Bailey Kuehl Daniel S. Weld Doug Downey VLM 24 34 0 01 Jun 2021
Understanding Mobile GUI: from Pixel-Words to Screen-Sentences Jingwen Fu Xiaoyi Zhang Yuwang Wang Wenjun Zeng Sam Yang Grayson Hilliard 29 14 0 25 May 2021
ViBERTgrid: A Jointly Trained Multi-Modal 2D Document Representation for Key Information Extraction from Documents Weihong Lin Qifang Gao Lei-huan Sun Zhuoyao Zhong Kaiqin Hu Qin Ren Qiang Huo 31 39 0 25 May 2021
StructuralLM: Structural Pre-training for Form Understanding Chenliang Li Bin Bi Ming Yan Wei Wang Songfang Huang Fei Huang Luo Si LMTD AI4CE 39 132 0 24 May 2021
ModelPS: An Interactive and Collaborative Platform for Editing Pre-trained Models at Scale Yuanming Li Huaizheng Zhang Shanshan Jiang Fan Yang Yonggang Wen Yong Luo 21 2 0 18 May 2021
Visual FUDGE: Form Understanding via Dynamic Graph Editing Brian L. Davis B. Morse Brian L. Price Chris Tensmeyer Curtis Wigington AI4CE 21 19 0 17 May 2021
Doc2Dict: Information Extraction as Text Generation Benjamin Townsend Eamon Ito-Fisher Lily Zhang Madison May 28 7 0 16 May 2021
VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations Peng Zhang Can Li Liang Qiao Zhanzhan Cheng Shiliang Pu Yi Niu Fei Wu 31 57 0 13 May 2021
Kleister: Key Information Extraction Datasets Involving Long Documents with Complex Layouts Tomasz Stanislawek Filip Graliński Anna Wróblewska Dawid Lipiński Agnieszka Kaliska Paulina Rosalska Bartosz Topolski P. Biecek 33 92 0 12 May 2021
GroupLink: An End-to-end Multitask Method for Word Grouping and Relation Extraction in Form Understanding Zilong Wang Mingjie Zhan Houxing Ren Zhaohui Hou Yuwei Wu Xingyan Zhang Ding Liang 14 1 0 10 May 2021
DocReader: Bounding-Box Free Training of a Document Information Extraction Model S. Klaiman Marius Lehne 29 6 0 10 May 2021
MathBERT: A Pre-Trained Model for Mathematical Formula Understanding Shuai Peng Ke Yuan Liangcai Gao Zhi Tang AIMat 49 107 0 02 May 2021
Capturing Logical Structure of Visually Structured Documents with Multimodal Transition Parser Yuta Koreeda Christopher D. Manning 61 3 0 01 May 2021
InfographicVQA Minesh Mathew Viraj Bagal Rubèn Pérez Tito Dimosthenis Karatzas Ernest Valveny C. V. Jawahar 42 209 0 26 Apr 2021
LayoutXLM: Multimodal Pre-training for Multilingual Visually-rich Document Understanding Yiheng Xu Tengchao Lv Lei Cui Guoxin Wang Yijuan Lu D. Florêncio Cha Zhang Furu Wei MLLM VLM 38 128 0 18 Apr 2021
LAMPRET: Layout-Aware Multimodal PreTraining for Document Understanding Te-Lin Wu Cheng-rong Li Mingyang Zhang Tao Chen Spurthi Amba Hombaiah Michael Bendersky 21 14 0 16 Apr 2021
Cost-effective End-to-end Information Extraction for Semi-structured Document Images Wonseok Hwang Hyunji Lee Jinyeong Yim Geewook Kim Minjoon Seo 3DV 68 24 0 16 Apr 2021