2211.03256
On Web-based Visual Corpus Construction for Visual Document Understanding
7 November 2022
Donghyun Kim
Teakgyu Hong
Moonbin Yim
Yoonsik Kim
Geewook Kim
Papers citing
"On Web-based Visual Corpus Construction for Visual Document Understanding"
6 / 6 papers shown
M2-omni: Advancing Omni-MLLM for Comprehensive Modality Support with Competitive Performance
Qingpei Guo
Kaiyou Song
Zipeng Feng
Ziping Ma
Qinglong Zhang
...
Yunxiao Sun
Tai-Wei Chang
Jingdong Chen
Ming Yang
Jun Zhou
MLLM
VLM
82
3
0
26 Feb 2025
Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding
Kenton Lee
Mandar Joshi
Iulia Turc
Hexiang Hu
Fangyu Liu
Julian Martin Eisenschlos
Urvashi Khandelwal
Peter Shaw
Ming-Wei Chang
Kristina Toutanova
CLIP
VLM
158
262
0
07 Oct 2022
Augraphy: A Data Augmentation Library for Document Images
Alexander Groleau
Kok Wei Chee
Stefan Larson
Samay Maini
Jonathan Boarman
10
10
0
30 Aug 2022
DOM-LM: Learning Generalizable Representations for HTML Documents
Xiang Deng
Prashant Shiralkar
Colin Lockard
Binxuan Huang
Huan Sun
AI4TS
AI4CE
37
37
0
25 Jan 2022
LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding
Yang Xu
Yiheng Xu
Tengchao Lv
Lei Cui
Furu Wei
...
D. Florêncio
Cha Zhang
Wanxiang Che
Min Zhang
Lidong Zhou
ViT
MLLM
145
498
0
29 Dec 2020
FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents
Guillaume Jaume
H. K. Ekenel
Jean-Philippe Thiran
119
353
0
27 May 2019