Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2108.08965
Cited By
Localize, Group, and Select: Boosting Text-VQA by Scene Text Modeling
20 August 2021
Xiaopeng Lu
Zhenhua Fan
Yansen Wang
Jean Oh
Carolyn Rose
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Localize, Group, and Select: Boosting Text-VQA by Scene Text Modeling"
15 / 15 papers shown
Title
Multimodal Misinformation Detection by Learning from Synthetic Data with Multimodal LLMs
Fengzhu Zeng
Wenqian Li
Wei Gao
Yan Pang
34
2
0
29 Sep 2024
Adversarial Training with OCR Modality Perturbation for Scene-Text Visual Question Answering
Zhixuan Shen
Haonan Luo
Sijia Li
Tianrui Li
19
0
0
14 Mar 2024
Multiple-Question Multiple-Answer Text-VQA
Peng Tang
Srikar Appalaraju
R. Manmatha
Yusheng Xie
Vijay Mahadevan
44
5
0
15 Nov 2023
Exploring Sparse Spatial Relation in Graph Inference for Text-Based VQA
Sheng Zhou
Dan Guo
Jia Li
Xun Yang
M. Wang
8
5
0
13 Oct 2023
Separate and Locate: Rethink the Text in Text-based Visual Question Answering
Chengyang Fang
Jiangnan Li
Liang Li
Can Ma
Dayong Hu
9
12
0
31 Aug 2023
Making the V in Text-VQA Matter
Shamanthak Hegde
Soumya Jahagirdar
Shankar Gangisetty
CoGe
20
4
0
01 Aug 2023
DocFormerv2: Local Features for Document Understanding
Srikar Appalaraju
Peng Tang
Qi Dong
Nishant Sankaran
Yichu Zhou
R. Manmatha
19
39
0
02 Jun 2023
Locate Then Generate: Bridging Vision and Language with Bounding Box for Scene-Text VQA
Yongxin Zhu
Z. Liu
Yukang Liang
Xin Li
Hao Liu
Changcun Bao
Linli Xu
16
6
0
04 Apr 2023
Towards Models that Can See and Read
Roy Ganz
Oren Nuriel
Aviad Aberdam
Yair Kittenplon
Shai Mazor
Ron Litman
14
13
0
18 Jan 2023
SceneGATE: Scene-Graph based co-Attention networks for TExt visual question answering
Feiqi Cao
Siwen Luo
F. Núñez
Zean Wen
Josiah Poon
Caren Han
GNN
11
4
0
16 Dec 2022
Toward 3D Spatial Reasoning for Human-like Text-based Visual Question Answering
Hao Li
Jinfa Huang
Peng Jin
Guoli Song
Qi Wu
Jie Chen
27
20
0
21 Sep 2022
PreSTU: Pre-Training for Scene-Text Understanding
Jihyung Kil
Soravit Changpinyo
Xi Chen
Hexiang Hu
Sebastian Goodman
Wei-Lun Chao
Radu Soricut
VLM
125
29
0
12 Sep 2022
TAG: Boosting Text-VQA via Text-aware Visual Question-answer Generation
Jun Wang
M. Gao
Yuqian Hu
Ramprasaath R. Selvaraju
Chetan Ramaiah
Ran Xu
J. JáJá
Larry S. Davis
ViT
12
17
0
03 Aug 2022
LaTr: Layout-Aware Transformer for Scene-Text VQA
Ali Furkan Biten
Ron Litman
Yusheng Xie
Srikar Appalaraju
R. Manmatha
ViT
19
100
0
23 Dec 2021
COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images
Andreas Veit
Tomas Matera
Lukás Neumann
Jirí Matas
Serge J. Belongie
175
515
0
26 Jan 2016
1