ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1901.06706
  4. Cited By
Visual Entailment: A Novel Task for Fine-Grained Image Understanding

Visual Entailment: A Novel Task for Fine-Grained Image Understanding

20 January 2019
Ning Xie
Farley Lai
Derek Doran
Asim Kadav
    CoGe
ArXivPDFHTML

Papers citing "Visual Entailment: A Novel Task for Fine-Grained Image Understanding"

29 / 229 papers shown
Title
Teach Me to Explain: A Review of Datasets for Explainable Natural
  Language Processing
Teach Me to Explain: A Review of Datasets for Explainable Natural Language Processing
Sarah Wiegreffe
Ana Marasović
XAI
11
141
0
24 Feb 2021
UniT: Multimodal Multitask Learning with a Unified Transformer
UniT: Multimodal Multitask Learning with a Unified Transformer
Ronghang Hu
Amanpreet Singh
ViT
6
294
0
22 Feb 2021
Reasoning over Vision and Language: Exploring the Benefits of
  Supplemental Knowledge
Reasoning over Vision and Language: Exploring the Benefits of Supplemental Knowledge
Violetta Shevchenko
Damien Teney
A. Dick
A. Hengel
6
28
0
15 Jan 2021
MSD: Saliency-aware Knowledge Distillation for Multimodal Understanding
MSD: Saliency-aware Knowledge Distillation for Multimodal Understanding
Woojeong Jin
Maziar Sanjabi
Shaoliang Nie
L Tan
Xiang Ren
Hamed Firooz
11
6
0
06 Jan 2021
UNIMO: Towards Unified-Modal Understanding and Generation via
  Cross-Modal Contrastive Learning
UNIMO: Towards Unified-Modal Understanding and Generation via Cross-Modal Contrastive Learning
Wei Li
Can Gao
Guocheng Niu
Xinyan Xiao
Hao Liu
Jiachen Liu
Hua-Hong Wu
Haifeng Wang
12
374
0
31 Dec 2020
Multimodal Pretraining Unmasked: A Meta-Analysis and a Unified Framework
  of Vision-and-Language BERTs
Multimodal Pretraining Unmasked: A Meta-Analysis and a Unified Framework of Vision-and-Language BERTs
Emanuele Bugliarello
Ryan Cotterell
Naoaki Okazaki
Desmond Elliott
22
119
0
30 Nov 2020
Transformation Driven Visual Reasoning
Transformation Driven Visual Reasoning
Xin Hong
Yanyan Lan
Liang Pang
Jiafeng Guo
Xueqi Cheng
LRM
13
21
0
26 Nov 2020
Natural Language Rationales with Full-Stack Visual Reasoning: From
  Pixels to Semantic Frames to Commonsense Graphs
Natural Language Rationales with Full-Stack Visual Reasoning: From Pixels to Semantic Frames to Commonsense Graphs
Ana Marasović
Chandra Bhagavatula
J. S. Park
Ronan Le Bras
Noah A. Smith
Yejin Choi
ReLM
LRM
10
61
0
15 Oct 2020
SCOUTER: Slot Attention-based Classifier for Explainable Image
  Recognition
SCOUTER: Slot Attention-based Classifier for Explainable Image Recognition
Liangzhi Li
Bowen Wang
Manisha Verma
Yuta Nakashima
R. Kawasaki
Hajime Nagahara
OCL
18
46
0
14 Sep 2020
Training Multimodal Systems for Classification with Multiple Objectives
Training Multimodal Systems for Classification with Multiple Objectives
Jason Armitage
Shramana Thakur
Rishi Tripathi
Jens Lehmann
M. Maleshkova
9
1
0
26 Aug 2020
Large-Scale Adversarial Training for Vision-and-Language Representation
  Learning
Large-Scale Adversarial Training for Vision-and-Language Representation Learning
Zhe Gan
Yen-Chun Chen
Linjie Li
Chen Zhu
Yu Cheng
Jingjing Liu
ObjD
VLM
14
485
0
11 Jun 2020
Behind the Scene: Revealing the Secrets of Pre-trained
  Vision-and-Language Models
Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models
Jize Cao
Zhe Gan
Yu Cheng
Licheng Yu
Yen-Chun Chen
Jingjing Liu
VLM
12
127
0
15 May 2020
The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes
The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes
Douwe Kiela
Hamed Firooz
Aravind Mohan
Vedanuj Goswami
Amanpreet Singh
Pratik Ringshia
Davide Testuggine
21
575
0
10 May 2020
Visuo-Linguistic Question Answering (VLQA) Challenge
Visuo-Linguistic Question Answering (VLQA) Challenge
Shailaja Keyur Sampat
Yezhou Yang
Chitta Baral
CoGe
6
1
0
01 May 2020
HERO: Hierarchical Encoder for Video+Language Omni-representation
  Pre-training
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training
Linjie Li
Yen-Chun Chen
Yu Cheng
Zhe Gan
Licheng Yu
Jingjing Liu
MLLM
VLM
OffRL
AI4TS
22
489
0
01 May 2020
Explainable Deep Learning: A Field Guide for the Uninitiated
Explainable Deep Learning: A Field Guide for the Uninitiated
Gabrielle Ras
Ning Xie
Marcel van Gerven
Derek Doran
AAML
XAI
29
366
0
30 Apr 2020
VD-BERT: A Unified Vision and Dialog Transformer with BERT
VD-BERT: A Unified Vision and Dialog Transformer with BERT
Yue Wang
Shafiq R. Joty
Michael R. Lyu
Irwin King
Caiming Xiong
S. Hoi
14
102
0
28 Apr 2020
e-SNLI-VE: Corrected Visual-Textual Entailment with Natural Language
  Explanations
e-SNLI-VE: Corrected Visual-Textual Entailment with Natural Language Explanations
Virginie Do
Oana-Maria Camburu
Zeynep Akata
Thomas Lukasiewicz
LRM
11
30
0
07 Apr 2020
Evaluating Multimodal Representations on Visual Semantic Textual
  Similarity
Evaluating Multimodal Representations on Visual Semantic Textual Similarity
Oier López de Lacalle
Ander Salaberria
Aitor Soroa Etxabe
Gorka Azkune
Eneko Agirre
4
2
0
04 Apr 2020
VIOLIN: A Large-Scale Dataset for Video-and-Language Inference
VIOLIN: A Large-Scale Dataset for Video-and-Language Inference
J. Liu
Wenhu Chen
Yu Cheng
Zhe Gan
Licheng Yu
Yiming Yang
Jingjing Liu
MLLM
VGen
25
67
0
25 Mar 2020
Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art
  Baseline
Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline
Vishvak Murahari
Dhruv Batra
Devi Parikh
Abhishek Das
VLM
19
115
0
05 Dec 2019
On Architectures for Including Visual Information in Neural Language
  Models for Image Description
On Architectures for Including Visual Information in Neural Language Models for Image Description
Marc Tanti
Albert Gatt
K. Camilleri
VLM
19
2
0
09 Nov 2019
UNITER: UNiversal Image-TExt Representation Learning
UNITER: UNiversal Image-TExt Representation Learning
Yen-Chun Chen
Linjie Li
Licheng Yu
Ahmed El Kholy
Faisal Ahmed
Zhe Gan
Yu Cheng
Jingjing Liu
VLM
OT
29
442
0
25 Sep 2019
Visuallly Grounded Generation of Entailments from Premises
Visuallly Grounded Generation of Entailments from Premises
Somayeh Jafaritazehjani
Albert Gatt
Marc Tanti
LRM
14
1
0
21 Sep 2019
Trends in Integration of Vision and Language Research: A Survey of
  Tasks, Datasets, and Methods
Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods
Aditya Mogadala
M. Kalimuthu
Dietrich Klakow
VLM
13
132
0
22 Jul 2019
Self-Critical Reasoning for Robust Visual Question Answering
Self-Critical Reasoning for Robust Visual Question Answering
Jialin Wu
Raymond J. Mooney
OOD
NAI
11
155
0
24 May 2019
A Corpus for Reasoning About Natural Language Grounded in Photographs
A Corpus for Reasoning About Natural Language Grounded in Photographs
Alane Suhr
Stephanie Zhou
Ally Zhang
Iris Zhang
Huajun Bai
Yoav Artzi
LRM
9
583
0
01 Nov 2018
Visual Translation Embedding Network for Visual Relation Detection
Visual Translation Embedding Network for Visual Relation Detection
Hanwang Zhang
Zawlin Kyaw
Shih-Fu Chang
Tat-Seng Chua
ViT
140
559
0
27 Feb 2017
Multimodal Compact Bilinear Pooling for Visual Question Answering and
  Visual Grounding
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
144
1,458
0
06 Jun 2016
Previous
12345