1908.06066
Unicoder-VL: A Universal Encoder for Vision and Language by Cross-modal Pre-training
AAAI Conference on Artificial Intelligence (AAAI), 2019
16 August 2019
Gen Li
Nan Duan
Yuejian Fang
Ming Gong
Daxin Jiang
Ming Zhou
SSL
VLM
MLLM
Papers citing
"Unicoder-VL: A Universal Encoder for Vision and Language by Cross-modal Pre-training"
18 / 518 papers shown
Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training
Computer Vision and Pattern Recognition (CVPR), 2020
Weituo Hao
Chunyuan Li
Xiujun Li
Lawrence Carin
Jianfeng Gao
LM&Ro
305
326
0
25 Feb 2020
UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation
Huaishao Luo
Lei Ji
Ding Wang
Haoyang Huang
Nan Duan
Tianrui Li
Jason Li
Xilin Chen
Ming Zhou
VLM
375
417
0
15 Feb 2020
ImageBERT: Cross-modal Pre-training with Large-scale Weak-supervised Image-Text Data
Di Qi
Lin Su
Jianwei Song
Edward Cui
Taroon Bharti
Arun Sacheti
VLM
374
276
0
22 Jan 2020
All-in-One Image-Grounded Conversational Agents
Da Ju
Kurt Shuster
Y-Lan Boureau
Jason Weston
LLMAG
147
9
0
28 Dec 2019
Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline
European Conference on Computer Vision (ECCV), 2019
Vishvak Murahari
Dhruv Batra
Devi Parikh
Abhishek Das
VLM
349
120
0
05 Dec 2019
12-in-1: Multi-Task Vision and Language Representation Learning
Computer Vision and Pattern Recognition (CVPR), 2019
Jiasen Lu
Vedanuj Goswami
Marcus Rohrbach
Devi Parikh
Stefan Lee
VLM
ObjD
314
499
0
05 Dec 2019
Learning to Learn Words from Visual Scenes
Dídac Surís
Dave Epstein
Heng Ji
Shih-Fu Chang
Carl Vondrick
VLM
CLIP
SSL
OffRL
186
4
0
25 Nov 2019
Iterative Answer Prediction with Pointer-Augmented Multimodal Transformers for TextVQA
Computer Vision and Pattern Recognition (CVPR), 2019
Ronghang Hu
Amanpreet Singh
Trevor Darrell
Marcus Rohrbach
355
224
0
14 Nov 2019
Multimodal Intelligence: Representation Learning, Information Fusion, and Applications
IEEE Journal on Selected Topics in Signal Processing (JSTSP), 2019
Chao Zhang
Zichao Yang
Xiaodong He
Li Deng
HAI
AI4TS
319
401
0
10 Nov 2019
Probing Contextualized Sentence Representations with Visual Awareness
Zhuosheng Zhang
Rui Wang
Kehai Chen
Masao Utiyama
Eiichiro Sumita
Hai Zhao
229
2
0
07 Nov 2019
TAB-VCR: Tags and Attributes based Visual Commonsense Reasoning Baselines
Jingxiang Lin
Unnat Jain
Alex Schwing
LRM
ReLM
300
10
0
31 Oct 2019
UNITER: UNiversal Image-TExt Representation Learning
European Conference on Computer Vision (ECCV), 2019
Yen-Chun Chen
Linjie Li
Licheng Yu
Ahmed El Kholy
Faisal Ahmed
Zhe Gan
Yu Cheng
Jingjing Liu
VLM
OT
345
464
0
25 Sep 2019
Unified Vision-Language Pre-Training for Image Captioning and VQA
AAAI Conference on Artificial Intelligence (AAAI), 2019
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
692
1,008
0
24 Sep 2019
VL-BERT: Pre-training of Generic Visual-Linguistic Representations
International Conference on Learning Representations (ICLR), 2019
Weijie Su
Xizhou Zhu
Yue Cao
Bin Li
Lewei Lu
Furu Wei
Jifeng Dai
VLM
MLLM
SSL
637
1,797
0
22 Aug 2019
Fusion of Detected Objects in Text for Visual Question Answering
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Chris Alberti
Jeffrey Ling
Michael Collins
David Reitter
251
181
0
14 Aug 2019
CRIC: A VQA Dataset for Compositional Reasoning on Vision and Commonsense
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2019
Difei Gao
Ruiping Wang
Shiguang Shan
Xilin Chen
CoGe
LRM
303
37
0
08 Aug 2019
ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks
Neural Information Processing Systems (NeurIPS), 2019
Jiasen Lu
Dhruv Batra
Devi Parikh
Stefan Lee
SSL
VLM
911
4,211
0
06 Aug 2019
Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods
Journal of Artificial Intelligence Research (JAIR), 2019
Aditya Mogadala
M. Kalimuthu
Dietrich Klakow
VLM
404
142
0
22 Jul 2019