Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2403.19128
Cited By
OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition
28 March 2024
Jianqiang Wan
Sibo Song
Wenwen Yu
Yuliang Liu
Wenqing Cheng
Fei Huang
Xiang Bai
Cong Yao
Zhibo Yang
Re-assign community
ArXiv
PDF
HTML
Papers citing
"OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition"
12 / 12 papers shown
Title
DocSpiral: A Platform for Integrated Assistive Document Annotation through Human-in-the-Spiral
Qiang Sun
Sirui Li
Tingting Bi
D. Huynh
Mark Reynolds
Yuanyi Luo
Wei Liu
25
0
0
06 May 2025
PixelWeb: The First Web GUI Dataset with Pixel-Wise Labels
Qi Yang
Weichen Bi
Haiyang Shen
Y. Guo
Yun Ma
32
0
0
23 Apr 2025
A Survey on Knowledge-Oriented Retrieval-Augmented Generation
Mingyue Cheng
Yucong Luo
Jie Ouyang
Q. Liu
Huijie Liu
...
Bohou Zhang
Jiawei Cao
Jie Ma
Daoyu Wang
Enhong Chen
3DV
59
3
0
11 Mar 2025
SpiritSight Agent: Advanced GUI Agent with One Look
Zhiyuan Huang
Ziming Cheng
Junting Pan
Zhaohui Hou
Mingjie Zhan
LLMAG
83
2
0
05 Mar 2025
Enhancing Table Recognition with Vision LLMs: A Benchmark and Neighbor-Guided Toolchain Reasoner
Yitong Zhou
Mingyue Cheng
Qingyang Mao
Qi Liu
F. Xu
Xin Li
Enhong Chen
LMTD
37
0
0
30 Dec 2024
TextSquare: Scaling up Text-Centric Visual Instruction Tuning
Jingqun Tang
Chunhui Lin
Zhen Zhao
Shubo Wei
Binghong Wu
...
Yuliang Liu
Hao Liu
Yuan Xie
Xiang Bai
Can Huang
LRM
VLM
MLLM
44
26
0
19 Apr 2024
Towards Unified Scene Text Spotting based on Sequence Generation
Taeho Kil
Seonghyeon Kim
Sukmin Seo
Yoon Kim
Daehee Kim
55
19
0
07 Apr 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
Pix2seq: A Language Modeling Framework for Object Detection
Ting-Li Chen
Saurabh Saxena
Lala Li
David J. Fleet
Geoffrey E. Hinton
MLLM
ViT
VLM
233
341
0
22 Sep 2021
LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding
Yang Xu
Yiheng Xu
Tengchao Lv
Lei Cui
Furu Wei
...
D. Florêncio
Cha Zhang
Wanxiang Che
Min Zhang
Lidong Zhou
ViT
MLLM
137
492
0
29 Dec 2020
Convolutional Character Networks
Linjie Xing
Zhi Tian
Weilin Huang
Matthew R. Scott
40
155
0
17 Oct 2019
Feature Pyramid Networks for Object Detection
Tsung-Yi Lin
Piotr Dollár
Ross B. Girshick
Kaiming He
Bharath Hariharan
Serge J. Belongie
ObjD
154
3,574
0
09 Dec 2016
1