Evaluation of Deep Convolutional Nets for Document Image Classification and Retrieval

25 February 2015

Papers citing "Evaluation of Deep Convolutional Nets for Document Image Classification and Retrieval"

50 / 187 papers shown

Title
Image Generation and Learning Strategy for Deep Document Forgery Detection Yamato Okamoto Osada Genki Iu Yahiro Rintaro Hasegawa Peifei Zhu Hirokatsu Kataoka AAML 36 0 0 07 Nov 2023
On Task-personalized Multimodal Few-shot Learning for Visually-rich Document Entity Retrieval Jiayi Chen H. Dai Bo Dai Aidong Zhang Wei Wei 36 2 0 01 Nov 2023
Enhancing Document Information Analysis with Multi-Task Pre-training: A Robust Approach for Information Extraction in Visually-Rich Documents Tofik Ali Partha Pratim Roy 16 0 0 25 Oct 2023
SCOB: Universal Text Understanding via Character-wise Supervised Contrastive Learning with Online Text Rendering for Bridging Domain Gap Daehee Kim Yoon Kim Donghyun Kim Yumin Lim Geewook Kim Taeho Kil 34 3 0 21 Sep 2023
Long-Range Transformer Architectures for Document Understanding Thibault Douzon S. Duffner Christophe Garcia Jérémy Espinas VLM 31 2 0 11 Sep 2023
Vision Grid Transformer for Document Layout Analysis Cheng Da Chuwei Luo Qi Zheng Cong Yao ViT 40 28 0 29 Aug 2023
Beyond Document Page Classification: Design, Datasets, and Challenges Jordy Van Landeghem Sanket Biswas Matthew B. Blaschko Marie-Francine Moens 40 6 0 24 Aug 2023
Enhancing Visually-Rich Document Understanding via Layout Structure Modeling Qiwei Li Z. Li Xiantao Cai Bo Du Hai Zhao 28 7 0 15 Aug 2023
LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding Yanzhe Zhang Ruiyi Zhang Jiuxiang Gu Yufan Zhou Nedim Lipka Diyi Yang Tongfei Sun VLM MLLM 27 219 0 29 Jun 2023
On Evaluation of Document Classification using RVL-CDIP Stefan Larson Gordon Lim Kevin Leach 36 3 0 21 Jun 2023
Do-GOOD: Towards Distribution Shift Evaluation for Pre-Trained Visual Document Understanding Models Jiabang He Yilang Hu Lei Wang Xingdong Xu Ning Liu Hui-juan Liu Hengtao Shen VLM OOD 24 2 0 05 Jun 2023
TransDocAnalyser: A Framework for Offline Semi-structured Handwritten Document Analysis in the Legal Domain Sagar Chakraborty Gaurav Harit Saptarshi Ghosh 18 1 0 03 Jun 2023
DocFormerv2: Local Features for Document Understanding Srikar Appalaraju Peng Tang Qi Dong Nishant Sankaran Yichu Zhou R. Manmatha 33 39 0 02 Jun 2023
DWT-CompCNN: Deep Image Classification Network for High Throughput JPEG 2000 Compressed Documents Tejasvee Bisen M. Javed Shashank Kirtania P. Nagabhushan 13 1 0 02 Jun 2023
End-to-End Document Classification and Key Information Extraction using Assignment Optimization Ciaran Cooney Joana Cavadas Liam Madigan Bradley Savage Rachel Heyburn Mairead O'Cuinn 11 0 0 01 Jun 2023
LayoutMask: Enhance Text-Layout Interaction in Multi-modal Pre-training for Document Understanding Yi Tu Ya Guo Huan Chen Jinyang Tang 31 15 0 30 May 2023
GVdoc: Graph-based Visual Document Classification Fnu Mohbat Mohammed J Zaki Catherine Finegan-Dollak Ashish Verma OOD 26 1 0 26 May 2023
Batch Model Consolidation: A Multi-Task Model Consolidation Framework Iordanis Fostiropoulos Jiaye Zhu Laurent Itti MoMe CLL 29 3 0 25 May 2023
RE$^2$: Region-Aware Relation Extraction from Visually Rich Documents Pritika Ramu Sijia Wang Lalla Mouatadid Joy Rimchala Lifu Huang 38 0 0 24 May 2023
DUBLIN -- Document Understanding By Language-Image Network Kriti Aggarwal Aditi Khandelwal Kumar Tanmay Owais Mohammed Khan Qiang Liu Monojit Choudhury Hardik Hansrajbhai Chauhan Subhojit Som Vishrav Chaudhary Saurabh Tiwary ObjD VLM 39 0 0 23 May 2023
Fast-StrucTexT: An Efficient Hourglass Transformer with Modality-guided Dynamic Token Merge for Document Understanding Mingliang Zhai Yulin Li Xiameng Qin Chen Yi Qunyi Xie Chengquan Zhang Kun Yao Yuwei Wu Yunde Jia 35 8 0 19 May 2023
Sequence-to-Sequence Pre-training with Unified Modality Masking for Visual Document Understanding ShuWei Feng Tianyang Zhan Zhanming Jie Trung Quoc Luong Xiaoran Jin 21 1 0 16 May 2023
Document Understanding Dataset and Evaluation (DUDE) Jordy Van Landeghem Rubèn Pérez Tito Łukasz Borchmann Michal Pietruszka Pawel Józiak ... Bertrand Ackaert Ernest Valveny Matthew Blaschko Sien Moens Tomasz Stanislawek VGen 24 53 0 15 May 2023
EAML: Ensemble Self-Attention-based Mutual Learning Network for Document Image Classification Souhail Bakkali Zuheng Ming Mickael Coustaty Marçal Rusiñol 10 6 0 11 May 2023
SwinDocSegmenter: An End-to-End Unified Domain Adaptive Transformer for Document Instance Segmentation Ayan Banerjee Sanket Biswas Josep Lladós Umapada Pal ViT 20 16 0 08 May 2023
SelfDocSeg: A Self-Supervised vision-based Approach towards Document Segmentation Subhajit Maity Sanket Biswas Siladittya Manna Ayan Banerjee Josep Lladós Saumik Bhattacharya Umapada Pal 36 5 0 01 May 2023
CCpdf: Building a High Quality Corpus for Visually Rich Documents from Web Crawl Data M. Turski Tomasz Stanislawek Karol Kaczmarek Pawel Dyda Filip Graliński 33 11 0 28 Apr 2023
Information Redundancy and Biases in Public Document Information Extraction Benchmarks S. Laatiri Pirashanth Ratnamogan Joel Tang Laurent Lam William Vanhuffel Fabien Caspani 33 1 0 28 Apr 2023
Evaluating Adversarial Robustness on Document Image Classification Timothée Fronteau Arnaud Paran A. Shabou AAML 34 2 0 24 Apr 2023
A Question-Answering Approach to Key Value Pair Extraction from Form-like Document Images Kai Hu Zhuoyuan Wu Zhuoyao Zhong Weihong Lin Lei-huan Sun Qiang Huo 26 11 0 17 Apr 2023
Context-Aware Classification of Legal Document Pages Pavlos Fragkogiannis Martina Forster Grace E. Lee Dell Zhang 24 5 0 05 Apr 2023
ShabbyPages: A Reproducible Document Denoising and Binarization Dataset Alexander Groleau Kok Wei Chee Stefan Larson Samay Maini Jonathan Boarman 22 2 0 16 Mar 2023
Cross-Modal Entity Matching for Visually Rich Documents Ritesh Sarkhel Arnab Nandi 19 3 0 01 Mar 2023
StrucTexTv2: Masked Visual-Textual Prediction for Document Image Pre-training Yu Yu Yulin Li Chengquan Zhang Xiaoqiang Zhang Zengyuan Guo Xiameng Qin Kun Yao Junyu Han Errui Ding Jingdong Wang 16 45 0 01 Mar 2023
Entry Separation using a Mixed Visual and Textual Language Model: Application to 19th century French Trade Directories Bertrand Duménieu Edwin Carlinet N. Abadie Joseph Chazalon 29 0 0 17 Feb 2023
DocILE Benchmark for Document Information Localization and Extraction Štěpán Šimsa Milan Šulc Michal Uřičář Yash J. Patel Ahmed Hamdi ... Matyáš Skalický Jiří Matas Antoine Doucet Mickael Coustaty Dimosthenis Karatzas 24 34 0 11 Feb 2023
DocILE 2023 Teaser: Document Information Localization and Extraction Štěpán Šimsa Milan Šulc Matyáš Skalický Yash J. Patel Ahmed Hamdi 31 2 0 29 Jan 2023
LoRaLay: A Multilingual and Multimodal Dataset for Long Range and Layout-Aware Summarization Laura Nguyen Thomas Scialom Benjamin Piwowarski Jacopo Staiano 27 7 0 26 Jan 2023
Multimodal Side-Tuning for Document Classification S. P. Zingaro G. Lisanti M. Gabbrielli 26 6 0 16 Jan 2023
Wukong-Reader: Multi-modal Pre-training for Fine-grained Visual Document Understanding Haoli Bai Zhiguang Liu Xiaojun Meng Wentao Li Shuangning Liu ... Liangwei Wang Lu Hou Jiansheng Wei Xin Jiang Qun Liu ViT 35 11 0 19 Dec 2022
Unifying Vision, Text, and Layout for Universal Document Processing Zineng Tang Ziyi Yang Guoxin Wang Yuwei Fang Yang Liu Chenguang Zhu Michael Zeng Chao-Yue Zhang Joey Tianyi Zhou VLM 32 106 0 05 Dec 2022
MGDoc: Pre-training with Multi-granular Hierarchy for Document Image Understanding Zilong Wang Jiuxiang Gu Chris Tensmeyer Nikolaos Barmpalios A. Nenkova Tong Sun Jingbo Shang Vlad I. Morariu VLM 17 12 0 27 Nov 2022
Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models Lei Wang Jian He Xingdong Xu Ning Liu Hui-juan Liu 39 2 0 27 Nov 2022
Deep learning for table detection and structure recognition: A survey M. Kasem Abdelrahman Abdallah Alexander Berendeyev Ebrahem Elkady Mahmoud Abdalla Mohamed Mahmoud Mohamed Hamada D. Nurseitov I. Taj-Eddin LMTD 35 25 0 15 Nov 2022
VRDU: A Benchmark for Visually-rich Document Understanding Zilong Wang Yichao Zhou Wei Wei Chen-Yu Lee Sandeep Tata 22 15 0 15 Nov 2022
Privacy Meets Explainability: A Comprehensive Impact Benchmark S. Saifullah Dominique Mercier Adriano Lucieri Andreas Dengel Sheraz Ahmed 35 14 0 08 Nov 2022
On Web-based Visual Corpus Construction for Visual Document Understanding Donghyun Kim Teakgyu Hong Moonbin Yim Yoonsik Kim Geewook Kim 34 3 0 07 Nov 2022
Evaluating Out-of-Distribution Performance on Document Image Classifiers Stefan Larson Gordon Lim Yutong Ai David Kuang Kevin Leach OODD OOD 37 18 0 14 Oct 2022
ERNIE-Layout: Layout Knowledge Enhanced Pre-training for Visually-rich Document Understanding Qiming Peng Yinxu Pan Wenjin Wang Bin Luo Zhenyu Zhang ... Shi Feng Yu Sun Hao Tian Hua Wu Haifeng Wang 13 83 0 12 Oct 2022
Class-wise and reduced calibration methods Michael Panchenko Anes Benmerzoug Miguel de Benito Delgado 21 0 0 07 Oct 2022