1901.06706
Cited By
Visual Entailment: A Novel Task for Fine-Grained Image Understanding
20 January 2019
Ning Xie
Farley Lai
Derek Doran
Asim Kadav
CoGe
Papers citing
"Visual Entailment: A Novel Task for Fine-Grained Image Understanding"
29 / 229 papers shown
Title
Teach Me to Explain: A Review of Datasets for Explainable Natural Language Processing
Sarah Wiegreffe
Ana Marasović
XAI
11
141
0
24 Feb 2021
UniT: Multimodal Multitask Learning with a Unified Transformer
Ronghang Hu
Amanpreet Singh
ViT
6
294
0
22 Feb 2021
Reasoning over Vision and Language: Exploring the Benefits of Supplemental Knowledge
Violetta Shevchenko
Damien Teney
A. Dick
A. Hengel
6
28
0
15 Jan 2021
MSD: Saliency-aware Knowledge Distillation for Multimodal Understanding
Woojeong Jin
Maziar Sanjabi
Shaoliang Nie
L Tan
Xiang Ren
Hamed Firooz
11
6
0
06 Jan 2021
UNIMO: Towards Unified-Modal Understanding and Generation via Cross-Modal Contrastive Learning
Wei Li
Can Gao
Guocheng Niu
Xinyan Xiao
Hao Liu
Jiachen Liu
Hua-Hong Wu
Haifeng Wang
12
374
0
31 Dec 2020
Multimodal Pretraining Unmasked: A Meta-Analysis and a Unified Framework of Vision-and-Language BERTs
Emanuele Bugliarello
Ryan Cotterell
Naoaki Okazaki
Desmond Elliott
22
119
0
30 Nov 2020
Transformation Driven Visual Reasoning
Xin Hong
Yanyan Lan
Liang Pang
Jiafeng Guo
Xueqi Cheng
LRM
13
21
0
26 Nov 2020
Natural Language Rationales with Full-Stack Visual Reasoning: From Pixels to Semantic Frames to Commonsense Graphs
Ana Marasović
Chandra Bhagavatula
J. S. Park
Ronan Le Bras
Noah A. Smith
Yejin Choi
ReLM
LRM
10
61
0
15 Oct 2020
SCOUTER: Slot Attention-based Classifier for Explainable Image Recognition
Liangzhi Li
Bowen Wang
Manisha Verma
Yuta Nakashima
R. Kawasaki
Hajime Nagahara
OCL
18
46
0
14 Sep 2020
Training Multimodal Systems for Classification with Multiple Objectives
Jason Armitage
Shramana Thakur
Rishi Tripathi
Jens Lehmann
M. Maleshkova
9
1
0
26 Aug 2020
Large-Scale Adversarial Training for Vision-and-Language Representation Learning
Zhe Gan
Yen-Chun Chen
Linjie Li
Chen Zhu
Yu Cheng
Jingjing Liu
ObjD
VLM
14
485
0
11 Jun 2020
Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models
Jize Cao
Zhe Gan
Yu Cheng
Licheng Yu
Yen-Chun Chen
Jingjing Liu
VLM
12
127
0
15 May 2020
The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes
Douwe Kiela
Hamed Firooz
Aravind Mohan
Vedanuj Goswami
Amanpreet Singh
Pratik Ringshia
Davide Testuggine
21
575
0
10 May 2020
Visuo-Linguistic Question Answering (VLQA) Challenge
Shailaja Keyur Sampat
Yezhou Yang
Chitta Baral
CoGe
6
1
0
01 May 2020
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training
Linjie Li
Yen-Chun Chen
Yu Cheng
Zhe Gan
Licheng Yu
Jingjing Liu
MLLM
VLM
OffRL
AI4TS
22
489
0
01 May 2020
Explainable Deep Learning: A Field Guide for the Uninitiated
Gabrielle Ras
Ning Xie
Marcel van Gerven
Derek Doran
AAML
XAI
29
366
0
30 Apr 2020
VD-BERT: A Unified Vision and Dialog Transformer with BERT
Yue Wang
Shafiq R. Joty
Michael R. Lyu
Irwin King
Caiming Xiong
S. Hoi
14
102
0
28 Apr 2020
e-SNLI-VE: Corrected Visual-Textual Entailment with Natural Language Explanations
Virginie Do
Oana-Maria Camburu
Zeynep Akata
Thomas Lukasiewicz
LRM
11
30
0
07 Apr 2020
Evaluating Multimodal Representations on Visual Semantic Textual Similarity
Oier López de Lacalle
Ander Salaberria
Aitor Soroa Etxabe
Gorka Azkune
Eneko Agirre
4
2
0
04 Apr 2020
VIOLIN: A Large-Scale Dataset for Video-and-Language Inference
J. Liu
Wenhu Chen
Yu Cheng
Zhe Gan
Licheng Yu
Yiming Yang
Jingjing Liu
MLLM
VGen
25
67
0
25 Mar 2020
Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline
Vishvak Murahari
Dhruv Batra
Devi Parikh
Abhishek Das
VLM
19
115
0
05 Dec 2019
On Architectures for Including Visual Information in Neural Language Models for Image Description
Marc Tanti
Albert Gatt
K. Camilleri
VLM
19
2
0
09 Nov 2019
UNITER: UNiversal Image-TExt Representation Learning
Yen-Chun Chen
Linjie Li
Licheng Yu
Ahmed El Kholy
Faisal Ahmed
Zhe Gan
Yu Cheng
Jingjing Liu
VLM
OT
29
442
0
25 Sep 2019
Visually Grounded Generation of Entailments from Premises
Somayeh Jafaritazehjani
Albert Gatt
Marc Tanti
LRM
14
1
0
21 Sep 2019
Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods
Aditya Mogadala
M. Kalimuthu
Dietrich Klakow
VLM
13
132
0
22 Jul 2019
Self-Critical Reasoning for Robust Visual Question Answering
Jialin Wu
Raymond J. Mooney
OOD
NAI
11
155
0
24 May 2019
A Corpus for Reasoning About Natural Language Grounded in Photographs
Alane Suhr
Stephanie Zhou
Ally Zhang
Iris Zhang
Huajun Bai
Yoav Artzi
LRM
9
583
0
01 Nov 2018
Visual Translation Embedding Network for Visual Relation Detection
Hanwang Zhang
Zawlin Kyaw
Shih-Fu Chang
Tat-Seng Chua
ViT
140
559
0
27 Feb 2017
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
144
1,458
0
06 Jun 2016