ResearchTrend.AI
Multimodal Pre-training Based on Graph Attention Network for Document Understanding

25 March 2022
Zhenrong Zhang
Jiefeng Ma
Jun Du
Licheng Wang
Jianshu Zhang

Papers citing "Multimodal Pre-training Based on Graph Attention Network for Document Understanding"

10 / 10 papers shown
Document Image Rectification Bases on Self-Adaptive Multitask Fusion
Heng Li
Xiangping Wu
Qingcai Chen
49
0
0
09 May 2025
DocMamba: Efficient Document Pre-training with State Space Model
Pengfei Hu
Zhenrong Zhang
Jiefeng Ma
Shuhang Liu
Jun Du
Jianshu Zhang
Mamba
42
1
0
18 Sep 2024
HRDoc: Dataset and Baseline Method Toward Hierarchical Reconstruction of Document Structures
Jiefeng Ma
Jun Du
Pengfei Hu
Zhenrong Zhang
Jianshu Zhang
Huihui Zhu
Cong Liu
27
15
0
24 Mar 2023
Split, embed and merge: An accurate table structure recognizer
Zhenrong Zhang
Jianshu Zhang
Jun Du
LMTD
97
57
0
12 Jul 2021
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari
Liangzhe Yuan
Rui Qian
Wei-Hong Chuang
Shih-Fu Chang
Yin Cui
Boqing Gong
ViT
251
577
0
22 Apr 2021
LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding
Yang Xu
Yiheng Xu
Tengchao Lv
Lei Cui
Furu Wei
...
D. Florêncio
Cha Zhang
Wanxiang Che
Min Zhang
Lidong Zhou
ViT
MLLM
153
501
0
29 Dec 2020
Robust Character Labeling in Movie Videos: Data Resources and Self-supervised Feature Adaptation
Krishna Somandepalli
Rajat Hebbar
Shrikanth Narayanan
CVBM
32
5
0
25 Aug 2020
Improved Baselines with Momentum Contrastive Learning
Xinlei Chen
Haoqi Fan
Ross B. Girshick
Kaiming He
SSL
279
3,378
0
09 Mar 2020
Spatiotemporal Co-attention Recurrent Neural Networks for Human-Skeleton Motion Prediction
Xiangbo Shu
Liyan Zhang
Guo-Jun Qi
Wei Liu
Jinhui Tang
3DH
HAI
44
203
0
29 Sep 2019
FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents
Guillaume Jaume
H. K. Ekenel
Jean-Philippe Thiran
143
356
0
27 May 2019