VisualBERT: A Simple and Performant Baseline for Vision and Language

9 August 2019

Papers citing "VisualBERT: A Simple and Performant Baseline for Vision and Language"

10 / 1,260 papers shown

UNITER: UNiversal Image-TExt Representation Learning

European Conference on Computer Vision (ECCV), 2019

374

465

25 Sep 2019

Unified Vision-Language Pre-Training for Image Captioning and VQA

AAAI Conference on Artificial Intelligence (AAAI), 2019

Lei Zhang

699

1,016

24 Sep 2019

NLVR2 Visual Bias Analysis

Alane Suhr

Yoav Artzi

23 Sep 2019

Supervised Multimodal Bitransformers for Classifying Images and Text

Douwe Kiela

333

298

06 Sep 2019

VL-BERT: Pre-training of Generic Visual-Linguistic Representations

International Conference on Learning Representations (ICLR), 2019

Weijie Su

676

1,800

22 Aug 2019

LXMERT: Learning Cross-Modality Encoder Representations from Transformers

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019

Hao Tan

Joey Tianyi Zhou

789

2,787

20 Aug 2019

Unicoder-VL: A Universal Encoder for Vision and Language by Cross-modal Pre-training

AAAI Conference on Artificial Intelligence (AAAI), 2019

804

948

16 Aug 2019

Fusion of Detected Objects in Text for Visual Question Answering

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019

260

182

14 Aug 2019

ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks

Neural Information Processing Systems (NeurIPS), 2019

Devi Parikh

945

4,235

06 Aug 2019

An Attentive Survey of Attention Models

450

723

05 Apr 2019