ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2212.09621
  4. Cited By
Wukong-Reader: Multi-modal Pre-training for Fine-grained Visual Document
  Understanding

Wukong-Reader: Multi-modal Pre-training for Fine-grained Visual Document Understanding

19 December 2022
Haoli Bai
Zhiguang Liu
Xiaojun Meng
Wentao Li
Shuangning Liu
Nian Xie
Rongfu Zheng
Liangwei Wang
Lu Hou
Jiansheng Wei
Xin Jiang
Qun Liu
    ViT
ArXivPDFHTML

Papers citing "Wukong-Reader: Multi-modal Pre-training for Fine-grained Visual Document Understanding"

3 / 3 papers shown
Title
BLIP: Bootstrapping Language-Image Pre-training for Unified
  Vision-Language Understanding and Generation
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
390
4,124
0
28 Jan 2022
LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document
  Understanding
LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding
Yang Xu
Yiheng Xu
Tengchao Lv
Lei Cui
Furu Wei
...
D. Florêncio
Cha Zhang
Wanxiang Che
Min Zhang
Lidong Zhou
ViT
MLLM
145
498
0
29 Dec 2020
FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents
FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents
Guillaume Jaume
H. K. Ekenel
Jean-Philippe Thiran
122
355
0
27 May 2019
1