arXiv: 2012.05153
Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps
AAAI Conference on Artificial Intelligence (AAAI), 2020
9 December 2020
Qi Zhu
Chenyu Gao
Peng Wang
Qi Wu
Github (57★)
Papers citing
"Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps"
19 / 19 papers shown
Gather and Trace: Rethinking Video TextVQA from an Instance-oriented Perspective
Zhifei Yang
Gangyan Zeng
Daiqing Wu
Huawen Shen
B. Li
Can Ma
Xiaojun Bi
236
2
0
06 Aug 2025
Exploring Sparse Spatial Relation in Graph Inference for Text-Based VQA
Sheng Zhou
Dan Guo
Jia Li
Xun Yang
Ming Wang
342
24
0
13 Oct 2023
Image-Text Pre-Training for Logo Recognition
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Mark Hubenthal
Suren Kumar
VLM
221
6
0
18 Sep 2023
Locate Then Generate: Bridging Vision and Language with Bounding Box for Scene-Text VQA
AAAI Conference on Artificial Intelligence (AAAI), 2023
Yongxin Zhu
Ziqiang Liu
Yukang Liang
Xin Li
Hao Liu
Changcun Bao
Linli Xu
210
10
0
04 Apr 2023
DEVICE: Depth and Visual Concepts Aware Transformer for OCR-based Image Captioning
Pattern Recognition (Pattern Recogn.), 2023
Dongsheng Xu
Qingbao Huang
Shuang Feng
Yiru Cai
Feng Shuang
Yi Cai
ViT
VLM
610
1
0
03 Feb 2023
SceneGATE: Scene-Graph based co-Attention networks for TExt visual question answering
Feiqi Cao
Siwen Luo
F. Núñez
Zean Wen
Josiah Poon
Caren Han
GNN
461
6
0
16 Dec 2022
Text-Aware Dual Routing Network for Visual Question Answering
Luoqian Jiang
Yifan He
Jian Chen
155
0
0
17 Nov 2022
Toward 3D Spatial Reasoning for Human-like Text-based Visual Question Answering
IEEE Transactions on Image Processing (IEEE TIP), 2022
Hao Li
Jinfa Huang
Peng Jin
Guoli Song
Qi Wu
Jie Chen
436
30
0
21 Sep 2022
MUST-VQA: MUltilingual Scene-text VQA
Emanuele Vivoli
Ali Furkan Biten
Andrés Mafla
Dimosthenis Karatzas
Lluís Gómez
302
8
0
14 Sep 2022
TAG: Boosting Text-VQA via Text-aware Visual Question-answer Generation
British Machine Vision Conference (BMVC), 2022
Jun Wang
M. Gao
Yuqian Hu
Ramprasaath R. Selvaraju
Chetan Ramaiah
Ran Xu
Joseph Jaja
Larry S. Davis
ViT
279
23
0
03 Aug 2022
One for All: One-stage Referring Expression Comprehension with Dynamic Reasoning
Neurocomputing (Neurocomputing), 2022
Zhipeng Zhang
Zhimin Wei
Zhongzhen Huang
Rui Niu
Peng Wang
ObjD
LRM
354
11
0
31 Jul 2022
Towards Multimodal Vision-Language Models Generating Non-Generic Text
ICON (ICON), 2022
Wes Robbins
Zanyar Zohourianshahzadi
Jugal Kalita
219
1
0
09 Jul 2022
ViSTA: Vision and Scene Text Aggregation for Cross-Modal Retrieval
Computer Vision and Pattern Recognition (CVPR), 2022
Mengjun Cheng
Yipeng Sun
Long Wang
Xiongwei Zhu
Kun Yao
...
Guoli Song
Junyu Han
Jingtuo Liu
Errui Ding
Jingdong Wang
392
77
0
31 Mar 2022
Towards Escaping from Language Bias and OCR Error: Semantics-Centered Text Visual Question Answering
IEEE International Conference on Multimedia and Expo (ICME), 2022
Chengyang Fang
Gangyan Zeng
Daiqing Wu
Can Ma
Dayong Hu
Weiping Wang
171
10
0
24 Mar 2022
LaTr: Layout-Aware Transformer for Scene-Text VQA
Computer Vision and Pattern Recognition (CVPR), 2021
Ali Furkan Biten
Ron Litman
Yusheng Xie
Srikar Appalaraju
R. Manmatha
ViT
478
118
0
23 Dec 2021
ICDAR 2021 Competition on Document Visual Question Answering
IEEE International Conference on Document Analysis and Recognition (ICDAR), 2021
Rubèn Pérez Tito
Minesh Mathew
C. V. Jawahar
Ernest Valveny
Dimosthenis Karatzas
254
32
0
10 Nov 2021
Localize, Group, and Select: Boosting Text-VQA by Scene Text Modeling
Xiaopeng Lu
Zhenhua Fan
Yansen Wang
Jean Oh
Carolyn Rose
226
32
0
20 Aug 2021
Question-controlled Text-aware Image Captioning
ACM Multimedia (ACM MM), 2021
Anwen Hu
Shizhe Chen
Qin Jin
221
16
0
04 Aug 2021
From Show to Tell: A Survey on Deep Learning-based Image Captioning
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Matteo Stefanini
Marcella Cornia
Lorenzo Baraldi
S. Cascianelli
G. Fiameni
Rita Cucchiara
3DV
VLM
MLLM
585
373
0
14 Jul 2021