Wukong-Reader: Multi-modal Pre-training for Fine-grained Visual Document Understanding
19 December 2022
Haoli Bai
Zhiguang Liu
Xiaojun Meng
Wentao Li
Shuangning Liu
Nian Xie
Rongfu Zheng
Liangwei Wang
Lu Hou
Jiansheng Wei
Xin Jiang
Qun Liu
ViT
Papers citing
"Wukong-Reader: Multi-modal Pre-training for Fine-grained Visual Document Understanding"
3 / 3 papers shown
Title
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
390
4,124
0
28 Jan 2022
LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding
Yang Xu
Yiheng Xu
Tengchao Lv
Lei Cui
Furu Wei
...
D. Florêncio
Cha Zhang
Wanxiang Che
Min Zhang
Lidong Zhou
ViT
MLLM
145
498
0
29 Dec 2020
FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents
Guillaume Jaume
H. K. Ekenel
Jean-Philippe Thiran
122
355
0
27 May 2019