1908.02265
ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks
Neural Information Processing Systems (NeurIPS), 2019
6 August 2019
Jiasen Lu
Dhruv Batra
Devi Parikh
Stefan Lee
SSL
VLM
Papers citing
"ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks"
31 / 2,231 papers shown
Title
Temporal Reasoning via Audio Question Answering
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2019
Haytham M. Fayek
Justin Johnson
140
60
0
21 Nov 2019
Vision-Language Navigation with Self-Supervised Auxiliary Reasoning Tasks
Computer Vision and Pattern Recognition (CVPR), 2019
Fengda Zhu
Yi Zhu
Xiaojun Chang
Xiaodan Liang
LRM
361
264
0
18 Nov 2019
Iterative Answer Prediction with Pointer-Augmented Multimodal Transformers for TextVQA
Computer Vision and Pattern Recognition (CVPR), 2019
Ronghang Hu
Amanpreet Singh
Trevor Darrell
Marcus Rohrbach
289
222
0
14 Nov 2019
Unsupervised Pre-training for Natural Language Generation: A Literature Review
Yuanxin Liu
Zheng Lin
SSL
AI4CE
110
5
0
13 Nov 2019
The Deep Learning Revolution and Its Implications for Computer Architecture and Chip Design
J. Dean
152
83
0
13 Nov 2019
Multimodal Intelligence: Representation Learning, Information Fusion, and Applications
IEEE Journal on Selected Topics in Signal Processing (JSTSP), 2019
Chao Zhang
Zichao Yang
Xiaodong He
Li Deng
HAI
AI4TS
275
396
0
10 Nov 2019
Two-Headed Monster And Crossed Co-Attention Networks
Yaoyiran Li
Jing Jiang
126
0
0
10 Nov 2019
The Dialogue Dodecathlon: Open-Domain Knowledge and Image Grounded Conversational Agents
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Kurt Shuster
Da Ju
Stephen Roller
Emily Dinan
Y-Lan Boureau
Jason Weston
226
84
0
09 Nov 2019
Probing Contextualized Sentence Representations with Visual Awareness
Zhuosheng Zhang
Rui Wang
Kehai Chen
Masao Utiyama
Eiichiro Sumita
Hai Zhao
187
2
0
07 Nov 2019
TAB-VCR: Tags and Attributes based Visual Commonsense Reasoning Baselines
Jingxiang Lin
Unnat Jain
Alex Schwing
LRM
ReLM
288
10
0
31 Oct 2019
Text Matters but Speech Influences: A Computational Analysis of Syntactic Ambiguity Resolution
Won Ik Cho
J. Cho
Woohyun Kang
N. Kim
218
2
0
21 Oct 2019
Meta Module Network for Compositional Visual Reasoning
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2019
Wenhu Chen
Zhe Gan
Linjie Li
Yu Cheng
Wenjie Wang
Jingjing Liu
LRM
289
75
0
08 Oct 2019
LoGAN: Latent Graph Co-Attention Network for Weakly-Supervised Video Moment Retrieval
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2019
Reuben Tan
Huijuan Xu
Kate Saenko
Bryan A. Plummer
252
74
0
27 Sep 2019
UNITER: UNiversal Image-TExt Representation Learning
European Conference on Computer Vision (ECCV), 2019
Yen-Chun Chen
Linjie Li
Licheng Yu
Ahmed El Kholy
Faisal Ahmed
Zhe Gan
Yu Cheng
Jingjing Liu
VLM
OT
325
463
0
25 Sep 2019
Unified Vision-Language Pre-Training for Image Captioning and VQA
AAAI Conference on Artificial Intelligence (AAAI), 2019
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
601
1,005
0
24 Sep 2019
MULE: Multimodal Universal Language Embedding
AAAI Conference on Artificial Intelligence (AAAI), 2019
Donghyun Kim
Kuniaki Saito
Kate Saenko
Stan Sclaroff
Bryan A. Plummer
VLM
181
43
0
08 Sep 2019
Pretrained AI Models: Performativity, Mobility, and Change
Lav Varshney
N. Keskar
R. Socher
87
21
0
07 Sep 2019
Supervised Multimodal Bitransformers for Classifying Images and Text
Douwe Kiela
Suvrat Bhooshan
Hamed Firooz
Ethan Perez
Davide Testuggine
287
292
0
06 Sep 2019
VL-BERT: Pre-training of Generic Visual-Linguistic Representations
International Conference on Learning Representations (ICLR), 2019
Weijie Su
Xizhou Zhu
Yue Cao
Bin Li
Lewei Lu
Furu Wei
Jifeng Dai
VLM
MLLM
SSL
565
1,789
0
22 Aug 2019
LXMERT: Learning Cross-Modality Encoder Representations from Transformers
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Hao Tan
Mohit Bansal
VLM
MLLM
649
2,740
0
20 Aug 2019
Unicoder-VL: A Universal Encoder for Vision and Language by Cross-modal Pre-training
AAAI Conference on Artificial Intelligence (AAAI), 2019
Gen Li
Nan Duan
Yuejian Fang
Ming Gong
Daxin Jiang
Ming Zhou
SSL
VLM
MLLM
632
944
0
16 Aug 2019
Fusion of Detected Objects in Text for Visual Question Answering
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Chris Alberti
Jeffrey Ling
Michael Collins
David Reitter
220
181
0
14 Aug 2019
Multi-modality Latent Interaction Network for Visual Question Answering
IEEE International Conference on Computer Vision (ICCV), 2019
Peng Gao
Haoxuan You
Zhanpeng Zhang
Xiaogang Wang
Hongsheng Li
147
85
0
10 Aug 2019
VisualBERT: A Simple and Performant Baseline for Vision and Language
Liunian Harold Li
Mark Yatskar
Da Yin
Cho-Jui Hsieh
Kai-Wei Chang
VLM
538
2,181
0
09 Aug 2019
CRIC: A VQA Dataset for Compositional Reasoning on Vision and Commonsense
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2019
Difei Gao
Ruiping Wang
Shiguang Shan
Xilin Chen
CoGe
LRM
252
36
0
08 Aug 2019
Finding Moments in Video Collections Using Natural Language
Victor Escorcia
Mattia Soldan
Josef Sivic
Bernard Ghanem
Bryan C. Russell
148
11
0
30 Jul 2019
Bilinear Graph Networks for Visual Question Answering
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2019
Dalu Guo
Chang Xu
Dacheng Tao
GNN
169
66
0
23 Jul 2019
Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods
Journal of Artificial Intelligence Research (JAIR), 2019
Aditya Mogadala
M. Kalimuthu
Dietrich Klakow
VLM
376
141
0
22 Jul 2019
An Attentive Survey of Attention Models
S. Chaudhari
Varun Mithal
Gungor Polatkan
R. Ramanath
352
716
0
05 Apr 2019
VQA with no questions-answers training
Computer Vision and Pattern Recognition (CVPR), 2018
B. Vatashsky
S. Ullman
208
13
0
20 Nov 2018
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
992
3,754
0
02 Dec 2016