1511.07571
Cited By
DenseCap: Fully Convolutional Localization Networks for Dense Captioning
24 November 2015
Justin Johnson
A. Karpathy
Li Fei-Fei
VLM
Papers citing
"DenseCap: Fully Convolutional Localization Networks for Dense Captioning"
50 / 452 papers shown
HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips
Antoine Miech
Dimitri Zhukov
Jean-Baptiste Alayrac
Makarand Tapaswi
Ivan Laptev
Josef Sivic
VGen
25
1,172
0
07 Jun 2019
Context-Aware Visual Policy Network for Fine-Grained Image Captioning
Zhengjun Zha
Daqing Liu
Hanwang Zhang
Yongdong Zhang
Feng Wu
12
119
0
06 Jun 2019
Contextual Translation Embedding for Visual Relationship Detection and Scene Graph Generation
Zih-Siou Hung
Arun Mallya
Svetlana Lazebnik
ViT
12
14
0
28 May 2019
Beyond Visual Semantics: Exploring the Role of Scene Text in Image Understanding
Arka Ujjal Dey
Suman K. Ghosh
Ernest Valveny
Gaurav Harit
25
23
0
25 May 2019
AttentionRNN: A Structured Spatial Attention Mechanism
Siddhesh Khandelwal
Leonid Sigal
8
3
0
22 May 2019
Joint Object and State Recognition using Language Knowledge
Ahmad Babaeian Jelodar
Yu Sun
20
18
0
13 May 2019
Image Captioning with Clause-Focused Metrics in a Multi-Modal Setting for Marketing
Philipp Harzig
D. Zecha
Rainer Lienhart
Carolin Kaiser
René Schallner
14
2
0
06 May 2019
The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision
Jiayuan Mao
Chuang Gan
Pushmeet Kohli
J. Tenenbaum
Jiajun Wu
NAI
8
685
0
26 Apr 2019
Challenges and Prospects in Vision and Language Research
Kushal Kafle
Robik Shrestha
Christopher Kanan
14
41
0
19 Apr 2019
A Simple Baseline for Audio-Visual Scene-Aware Dialog
Idan Schwartz
A. Schwing
Tamir Hazan
19
69
0
11 Apr 2019
Reasoning Visual Dialogs with Structural and Partial Observations
Zilong Zheng
Wenguan Wang
Siyuan Qi
Song-Chun Zhu
30
117
0
11 Apr 2019
Modularized Textual Grounding for Counterfactual Resilience
Zhiyuan Fang
Shu Kong
Charless C. Fowlkes
Yezhou Yang
12
32
0
07 Apr 2019
VQD: Visual Query Detection in Natural Scenes
Manoj Acharya
Karan Jariwala
Christopher Kanan
ObjD
16
18
0
04 Apr 2019
Context and Attribute Grounded Dense Captioning
Guojun Yin
Lu Sheng
Bin Liu
Nenghai Yu
Xiaogang Wang
Jing Shao
16
75
0
02 Apr 2019
Recurrent Back-Projection Network for Video Super-Resolution
Muhammad Haris
Gregory Shakhnarovich
Norimichi Ukita
SupR
20
430
0
25 Mar 2019
Neural Sequential Phrase Grounding (SeqGROUND)
Pelin Dogan
Leonid Sigal
Markus Gross
ObjD
16
51
0
18 Mar 2019
Dense Relational Captioning: Triple-Stream Networks for Relationship-Based Captioning
Dong-Jin Kim
Jinsoo Choi
Tae-Hyun Oh
In So Kweon
6
84
0
14 Mar 2019
Learning To Follow Directions in Street View
Karl Moritz Hermann
Mateusz Malinowski
Piotr Wojciech Mirowski
Andras Banki-Horvath
Keith Anderson
R. Hadsell
SSL
16
66
0
01 Mar 2019
CHIP: Channel-wise Disentangled Interpretation of Deep Convolutional Neural Networks
Xinrui Cui
Dan Wang
F. I. Z. Jane Wang
FAtt
BDL
14
12
0
07 Feb 2019
Linearized Multi-Sampling for Differentiable Image Transformation
Wei Jiang
Weiwei Sun
Andrea Tagliasacchi
Eduard Trulls
K. M. Yi
25
23
0
22 Jan 2019
LayoutGAN: Generating Graphic Layouts with Wireframe Discriminators
Jianan Li
Jimei Yang
Aaron Hertzmann
Jianming Zhang
Tingfa Xu
GAN
13
226
0
21 Jan 2019
Visual Entailment: A Novel Task for Fine-Grained Image Understanding
Ning Xie
Farley Lai
Derek Doran
Asim Kadav
CoGe
31
321
0
20 Jan 2019
Toward Explainable Fashion Recommendation
Pongsate Tangseng
Takayuki Okatani
8
29
0
15 Jan 2019
Epipolar Geometry based Learning of Multi-view Depth and Ego-Motion from Monocular Sequences
V. Prasad
Dipanjan Das
Brojeshwar Bhowmick
MDE
15
9
0
23 Dec 2018
SfMLearner++: Learning Monocular Depth & Ego-Motion using Meaningful Geometric Constraints
V. Prasad
Brojeshwar Bhowmick
MDE
20
26
0
20 Dec 2018
Detecting unseen visual relations using analogies
Julia Peyre
Ivan Laptev
Cordelia Schmid
Josef Sivic
11
18
0
13 Dec 2018
Visual Social Relationship Recognition
Junnan Li
Yongkang Wong
Qi Zhao
Mohan S. Kankanhalli
33
27
0
13 Dec 2018
Coarse-to-fine: A RNN-based hierarchical attention model for vehicle re-identification
Xiu-Shen Wei
Chen-Da Liu-Zhang
Lingqiao Liu
Chunhua Shen
Jianxin Wu
6
43
0
11 Dec 2018
Neural Word Search in Historical Manuscript Collections
T. Wilkinson
Jonas Lindström
Anders Brun
3DV
20
8
0
06 Dec 2018
Interactive Full Image Segmentation by Considering All Regions Jointly
E. Agustsson
J. Uijlings
V. Ferrari
VLM
10
74
0
05 Dec 2018
Visual Question Answering as Reading Comprehension
Hui Li
Peng Wang
Chunhua Shen
A. Hengel
9
40
0
29 Nov 2018
Multi-level Multimodal Common Semantic Space for Image-Phrase Grounding
Hassan Akbari
Svebor Karaman
Surabhi Bhargava
Brian Chen
Carl Vondrick
Shih-Fu Chang
12
81
0
28 Nov 2018
MIST: Multiple Instance Spatial Transformer Network
Baptiste Angles
Shahram Izadi
Simon Kornblith
Andrea Tagliasacchi
K. M. Yi
17
5
0
26 Nov 2018
Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
DiffM
18
175
0
26 Nov 2018
Visual Entailment Task for Visually-Grounded Language Learning
Ning Xie
Farley Lai
Derek Doran
Asim Kadav
15
53
0
26 Nov 2018
Senti-Attend: Image Captioning using Sentiment and Attention
Omid Mohamad Nezami
Mark Dras
Stephen Wan
Cécile Paris
VLM
11
15
0
24 Nov 2018
Object-oriented Targets for Visual Navigation using Rich Semantic Representations
Jean-Benoit Delbrouck
Stéphane Dupont
23
3
0
22 Nov 2018
Intention Oriented Image Captions with Guiding Objects
Yue Zheng
Yali Li
Shengjin Wang
11
55
0
19 Nov 2018
Revisiting Image-Language Networks for Open-ended Phrase Detection
Bryan A. Plummer
Kevin J. Shih
Yichen Li
Ke Xu
Svetlana Lazebnik
Stan Sclaroff
Kate Saenko
ObjD
SSeg
13
4
0
17 Nov 2018
Image Captioning as Neural Machine Translation Task in SOCKEYE
Loris Bazzani
Tobias Domhan
F. Hieber
VLM
17
2
0
09 Oct 2018
A Comprehensive Survey of Deep Learning for Image Captioning
Md. Zakir Hossain
Ferdous Sohel
M. Shiratuddin
Hamid Laga
VLM
3DV
28
760
0
06 Oct 2018
Team NimbRo at MBZIRC 2017: Autonomous Valve Stem Turning using a Wrench
Max Schwarz
David Droeschel
Christian Lenz
Arul Selvam Periyasamy
En Yen Puang
Jan Razlaw
Diego Rodriguez
Sebastian Schüller
M. Schreiber
Sven Behnke
6
16
0
06 Oct 2018
RGB-D Object Detection and Semantic Segmentation for Autonomous Manipulation in Clutter
Max Schwarz
Anton Milan
Arul Selvam Periyasamy
Sven Behnke
3DPC
19
162
0
01 Oct 2018
Vector Learning for Cross Domain Representations
Shagan Sah
Chi Zhang
Thang Nguyen
D. Peri
Ameya Shringi
R. Ptucha
GAN
11
3
0
27 Sep 2018
Object Detection from Scratch with Deep Supervision
Zhiqiang Shen
Zhuang Liu
Jianguo Li
Yu-Gang Jiang
Yurong Chen
Xiangyang Xue
ObjD
9
77
0
25 Sep 2018
Image Reassembly Combining Deep Learning and Shortest Path Problem
Marie-Morgane Paumard
David Picard
Hedi Tabia
OCL
3DV
19
24
0
04 Sep 2018
Diverse and Coherent Paragraph Generation from Images
Moitreya Chatterjee
A. Schwing
11
66
0
03 Sep 2018
Wavelet based edge feature enhancement for convolutional neural networks
Dedimuni D. De Silva
Subha Fernando
I. Piyatilake
A. Karunarathne
38
16
0
29 Aug 2018
Multimodal Differential Network for Visual Question Generation
Badri N. Patro
Sandeep Kumar
V. Kurmi
Vinay P. Namboodiri
13
41
0
12 Aug 2018
Community Regularization of Visually-Grounded Dialog
Akshat Agarwal
Swaminathan Gurumurthy
Vasu Sharma
M. Lewis
Katia P. Sycara
18
10
0
10 Aug 2018