arXiv:1511.07571
DenseCap: Fully Convolutional Localization Networks for Dense Captioning
24 November 2015
Justin Johnson
A. Karpathy
Li Fei-Fei
VLM
Papers citing
"DenseCap: Fully Convolutional Localization Networks for Dense Captioning"
50 / 468 papers shown
Anomaly Detection in Video Sequence with Appearance-Motion Correspondence
IEEE International Conference on Computer Vision (ICCV), 2019
Trong-Nguyen Nguyen
J. Meunier
203
399
0
17 Aug 2019
U-CAM: Visual Explanation using Uncertainty based Class Activation Maps
IEEE International Conference on Computer Vision (ICCV), 2019
Badri N. Patro
Mayank Lunayach
Shivansh Patel
Vinay P. Namboodiri
FAtt
UQCV
275
77
0
17 Aug 2019
Survey on Deep Neural Networks in Speech and Vision Systems
M. Alam
Manar D. Samad
Lasitha Vidyaratne
Alexander M. Glandon
Khan M. Iftekharuddin
3DV
VLM
AI4TS
349
223
0
16 Aug 2019
Image Captioning using Facial Expression and Attention
Journal of Artificial Intelligence Research (JAIR), 2019
Omid Mohamad Nezami
Mark Dras
Stephen Wan
Cécile Paris
CVBM
154
11
0
08 Aug 2019
Addressing Data Bias Problems for Chest X-ray Image Report Generation
British Machine Vision Conference (BMVC), 2019
Philipp Harzig
Yan-Ying Chen
Francine Chen
Rainer Lienhart
MedIm
140
55
0
06 Aug 2019
Logic could be learned from images
International Journal of Machine Learning and Cybernetics (IJMLC), 2019
Q. Guo
Y. Qian
Xinyan Liang
Yanhong She
Deyu Li
Jiye Liang
NAI
172
4
0
06 Aug 2019
Cascaded Revision Network for Novel Object Captioning
Qianyu Feng
Yu Wu
Hehe Fan
C. Yan
Yezhou Yang
109
38
0
06 Aug 2019
Prediction and Description of Near-Future Activities in Video
Computer Vision and Image Understanding (CVIU), 2019
T. Mahmud
Mohammad Billah
Mahmudul Hasan
Amit K. Roy-Chowdhury
339
17
0
02 Aug 2019
Curiosity-driven Reinforcement Learning for Diverse Visual Paragraph Generation
ACM Multimedia (ACM MM), 2019
Yadan Luo
Zi Huang
Zheng Zhang
Ziwei Wang
Jingjing Li
Yang Yang
99
40
0
01 Aug 2019
ShapeCaptioner: Generative Caption Network for 3D Shapes by Learning a Mapping from Parts Detected in Multiple Views to Sentences
ACM Multimedia (ACM MM), 2019
Zhizhong Han
Chao Chen
Yu-Shen Liu
Matthias Zwicker
3DPC
181
50
0
31 Jul 2019
Real-time Visual Object Tracking with Natural Language Description
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2019
Qi Feng
Vitaly Ablavsky
Qinxun Bai
Guorong Li
Stan Sclaroff
VLM
ObjD
VOT
268
66
0
26 Jul 2019
Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods
Journal of Artificial Intelligence Research (JAIR), 2019
Aditya Mogadala
M. Kalimuthu
Dietrich Klakow
VLM
396
142
0
22 Jul 2019
Image Captioning with Integrated Bottom-Up and Multi-level Residual Top-Down Attention for Game Scene Understanding
Jian Zheng
S. Krishnamurthy
Ruxin Chen
Min-Hung Chen
Zhenhao Ge
Xiaohua Li
127
4
0
16 Jun 2019
Speeding up VP9 Intra Encoder with Hierarchical Deep Learning Based Partition Prediction
IEEE Transactions on Image Processing (TIP), 2019
Somdyuti Paul
A. Norkin
A. Bovik
127
15
0
15 Jun 2019
Improving Visual Question Answering by Referring to Generated Paragraph Captions
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Hyounghun Kim
Joey Tianyi Zhou
CoGe
106
21
0
14 Jun 2019
Image Captioning: Transforming Objects into Words
Neural Information Processing Systems (NeurIPS), 2019
Simão Herdade
Armin Kappeler
K. Boakye
Joao Soares
ViT
416
544
0
14 Jun 2019
HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips
IEEE International Conference on Computer Vision (ICCV), 2019
Antoine Miech
Dimitri Zhukov
Jean-Baptiste Alayrac
Makarand Tapaswi
Ivan Laptev
Josef Sivic
VGen
500
1,357
0
07 Jun 2019
Context-Aware Visual Policy Network for Fine-Grained Image Captioning
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2019
Zhengjun Zha
Daqing Liu
Hanwang Zhang
Yongdong Zhang
Feng Wu
140
133
0
06 Jun 2019
Contextual Translation Embedding for Visual Relationship Detection and Scene Graph Generation
Zih-Siou Hung
Arun Mallya
Svetlana Lazebnik
ViT
182
15
0
28 May 2019
Beyond Visual Semantics: Exploring the Role of Scene Text in Image Understanding
Pattern Recognition Letters (PRL), 2019
Arka Ujjal Dey
Suman K. Ghosh
Ernest Valveny
Gaurav Harit
189
26
0
25 May 2019
AttentionRNN: A Structured Spatial Attention Mechanism
IEEE International Conference on Computer Vision (ICCV), 2019
Siddhesh Khandelwal
Leonid Sigal
173
3
0
22 May 2019
Joint Object and State Recognition using Language Knowledge
IEEE International Conference on Image Processing (ICIP), 2019
Ahmad Babaeian Jelodar
Yu Sun
161
18
0
13 May 2019
Image Captioning with Clause-Focused Metrics in a Multi-Modal Setting for Marketing
Conference on Multimedia Information Processing and Retrieval (MIPR), 2019
Philipp Harzig
D. Zecha
Rainer Lienhart
Carolin Kaiser
René Schallner
74
3
0
06 May 2019
The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision
Jiayuan Mao
Chuang Gan
Pushmeet Kohli
J. Tenenbaum
Jiajun Wu
NAI
446
773
0
26 Apr 2019
Challenges and Prospects in Vision and Language Research
Kushal Kafle
Robik Shrestha
Christopher Kanan
191
42
0
19 Apr 2019
A Simple Baseline for Audio-Visual Scene-Aware Dialog
Idan Schwartz
Alex Schwing
Tamir Hazan
186
79
0
11 Apr 2019
Reasoning Visual Dialogs with Structural and Partial Observations
Zilong Zheng
Wenguan Wang
Siyuan Qi
Song-Chun Zhu
216
119
0
11 Apr 2019
Modularized Textual Grounding for Counterfactual Resilience
Zhiyuan Fang
Shu Kong
Charless C. Fowlkes
Yezhou Yang
185
33
0
07 Apr 2019
VQD: Visual Query Detection in Natural Scenes
Manoj Acharya
Karan Jariwala
Christopher Kanan
ObjD
180
18
0
04 Apr 2019
Context and Attribute Grounded Dense Captioning
Guojun Yin
Lu Sheng
Bin Liu
Nenghai Yu
Xiaogang Wang
Jing Shao
131
83
0
02 Apr 2019
Recurrent Back-Projection Network for Video Super-Resolution
Muhammad Haris
Gregory Shakhnarovich
Norimichi Ukita
SupR
157
473
0
25 Mar 2019
Neural Sequential Phrase Grounding (SeqGROUND)
Computer Vision and Pattern Recognition (CVPR), 2019
Pelin Dogan
Leonid Sigal
Markus Gross
ObjD
191
54
0
18 Mar 2019
Dense Relational Captioning: Triple-Stream Networks for Relationship-Based Captioning
Dong-Jin Kim
Jinsoo Choi
Tae-Hyun Oh
In So Kweon
254
92
0
14 Mar 2019
Learning To Follow Directions in Street View
AAAI Conference on Artificial Intelligence (AAAI), 2019
Karl Moritz Hermann
Mateusz Malinowski
Piotr Wojciech Mirowski
Andras Banki-Horvath
Keith Anderson
R. Hadsell
SSL
266
73
0
01 Mar 2019
CHIP: Channel-wise Disentangled Interpretation of Deep Convolutional Neural Networks
Xinrui Cui
Dan Wang
F. I. Z. Jane Wang
FAtt
BDL
149
13
0
07 Feb 2019
Linearized Multi-Sampling for Differentiable Image Transformation
Wei Jiang
Weiwei Sun
Andrea Tagliasacchi
Eduard Trulls
K. M. Yi
217
24
0
22 Jan 2019
LayoutGAN: Generating Graphic Layouts with Wireframe Discriminators
Jianan Li
Jimei Yang
Aaron Hertzmann
Jianming Zhang
Tingfa Xu
GAN
296
261
0
21 Jan 2019
Visual Entailment: A Novel Task for Fine-Grained Image Understanding
Ning Xie
Farley Lai
Derek Doran
Asim Kadav
CoGe
328
346
0
20 Jan 2019
Toward Explainable Fashion Recommendation
Pongsate Tangseng
Takayuki Okatani
145
33
0
15 Jan 2019
Epipolar Geometry based Learning of Multi-view Depth and Ego-Motion from Monocular Sequences
V. Prasad
Dipanjan Das
Brojeshwar Bhowmick
MDE
205
9
0
23 Dec 2018
SfMLearner++: Learning Monocular Depth & Ego-Motion using Meaningful Geometric Constraints
V. Prasad
Brojeshwar Bhowmick
MDE
171
26
0
20 Dec 2018
Detecting unseen visual relations using analogies
Julia Peyre
Ivan Laptev
Cordelia Schmid
Josef Sivic
134
18
0
13 Dec 2018
Visual Social Relationship Recognition
Junnan Li
Yongkang Wong
Qi Zhao
Mohan Kankanhalli
111
28
0
13 Dec 2018
Coarse-to-fine: A RNN-based hierarchical attention model for vehicle re-identification
Xiu-Shen Wei
Chen-Da Liu-Zhang
Lingqiao Liu
Chunhua Shen
Jianxin Wu
174
44
0
11 Dec 2018
Neural Word Search in Historical Manuscript Collections
T. Wilkinson
Jonas Lindström
Anders Brun
3DV
123
9
0
06 Dec 2018
Interactive Full Image Segmentation by Considering All Regions Jointly
E. Agustsson
J. Uijlings
V. Ferrari
VLM
241
77
0
05 Dec 2018
Visual Question Answering as Reading Comprehension
Hui Li
Peng Wang
Chunhua Shen
Anton Van Den Hengel
124
46
0
29 Nov 2018
Multi-level Multimodal Common Semantic Space for Image-Phrase Grounding
Hassan Akbari
Svebor Karaman
Surabhi Bhargava
Brian Chen
Carl Vondrick
Shih-Fu Chang
144
86
0
28 Nov 2018
MIST: Multiple Instance Spatial Transformer Network
Baptiste Angles
Shahram Izadi
Simon Kornblith
Andrea Tagliasacchi
K. M. Yi
331
5
0
26 Nov 2018
Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
DiffM
227
194
0
26 Nov 2018