1511.07571
Cited By
DenseCap: Fully Convolutional Localization Networks for Dense Captioning
24 November 2015
Justin Johnson
A. Karpathy
Li Fei-Fei
VLM
Papers citing
"DenseCap: Fully Convolutional Localization Networks for Dense Captioning"
50 / 452 papers shown
Title
Improving Deep Visual Representation for Person Re-identification by Global and Local Image-language Association
Dapeng Chen
Hongsheng Li
Xihui Liu
Yantao Shen
Zejian Yuan
Xiaogang Wang
20
134
0
05 Aug 2018
Equal But Not The Same: Understanding the Implicit Relationship Between Persuasive Images and Text
Mingda Zhang
R. Hwa
Adriana Kovashka
11
54
0
21 Jul 2018
Presentation Attack Detection for Cadaver Iris
Mateusz Trokielewicz
A. Czajka
P. Maciejewicz
CVBM
17
24
0
11 Jul 2018
Dynamic Multimodal Instance Segmentation guided by natural language queries
Edgar Margffoy-Tuay
Juan C. Pérez
Emilio Botero
Pablo Arbelaez
22
170
0
06 Jul 2018
Face-Cap: Image Captioning using Facial Expression Analysis
Omid Mohamad Nezami
Mark Dras
Peter Anderson
Len Hamey
CVBM
16
27
0
06 Jul 2018
Factorizable Net: An Efficient Subgraph-based Framework for Scene Graph Generation
Yikang Li
Wanli Ouyang
Bolei Zhou
Jianping Shi
Yawen Cui
Xiaogang Wang
GNN
17
273
0
29 Jun 2018
Learning Multimodal Representations for Unseen Activities
A. Piergiovanni
Michael S. Ryoo
SSL
14
4
0
21 Jun 2018
Part-Aware Fine-grained Object Categorization using Weakly Supervised Part Detection Network
Yabin Zhang
K. Jia
Zhixin Wang
6
23
0
16 Jun 2018
Interactive Visual Grounding of Referring Expressions for Human-Robot Interaction
Mohit Shridhar
David Hsu
14
141
0
11 Jun 2018
Video Description: A Survey of Methods, Datasets and Evaluation Metrics
Nayyer Aafaq
Ajmal Saeed Mian
W. Liu
Syed Zulqarnain Gilani
Mubarak Shah
6
91
0
01 Jun 2018
GLAC Net: GLocal Attention Cascading Networks for Multi-image Cued Story Generation
Taehyeong Kim
Min-Oh Heo
Seonil Son
Kyoung-Wha Park
Byoung-Tak Zhang
21
75
0
28 May 2018
Identifying Object States in Cooking-Related Images
Ahmad Babaeian Jelodar
Md Sirajus Salekin
Yu Sun
17
37
0
17 May 2018
Deep Perm-Set Net: Learn to predict sets with unknown permutation and cardinality using deep neural networks
S. Hamid Rezatofighi
Roman Kaskman
F. Motlagh
Javen Qinfeng Shi
Daniel Cremers
Laura Leal-Taixé
Ian Reid
SSL
20
23
0
02 May 2018
Large-Scale Visual Relationship Understanding
Ji Zhang
Yannis Kalantidis
Marcus Rohrbach
Manohar Paluri
Ahmed Elgammal
Mohamed Elhoseiny
14
166
0
27 Apr 2018
Customized Image Narrative Generation via Interactive Visual Question Generation and Answering
Andrew Shin
Yoshitaka Ushiku
Tatsuya Harada
34
7
0
27 Apr 2018
Entity-aware Image Caption Generation
Di Lu
Spencer Whitehead
Lifu Huang
Heng Ji
Shih-Fu Chang
VLM
12
82
0
21 Apr 2018
Multilevel Language and Vision Integration for Text-to-Clip Retrieval
Huijuan Xu
Kun He
Bryan A. Plummer
Leonid Sigal
Stan Sclaroff
Kate Saenko
CLIP
9
319
0
13 Apr 2018
Hybrid Binary Networks: Optimizing for Accuracy, Efficiency and Memory
Ameya Prabhu
Vishal Batchu
Rohit Gajawada
Sri Aurobindo Munagala
A. Namboodiri
MQ
25
18
0
11 Apr 2018
Decoupled Novel Object Captioner
Yuehua Wu
Linchao Zhu
Lu Jiang
Yi Yang
10
62
0
11 Apr 2018
Learning a Text-Video Embedding from Incomplete and Heterogeneous Data
Antoine Miech
Ivan Laptev
Josef Sivic
22
233
0
07 Apr 2018
Guess Where? Actor-Supervision for Spatiotemporal Action Localization
Victor Escorcia
Cuong Duc Dao
Mihir Jain
Bernard Ghanem
Cees G. M. Snoek
22
29
0
05 Apr 2018
Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input
David F. Harwath
Adrià Recasens
Dídac Surís
Galen Chuang
Antonio Torralba
James R. Glass
19
201
0
04 Apr 2018
Guide Me: Interacting with Deep Networks
Christian Rupprecht
Iro Laina
Nassir Navab
Gregory Hager
Federico Tombari
HAI
27
38
0
30 Mar 2018
A New Target-specific Object Proposal Generation Method for Visual Tracking
Guanjun Guo
Hanzi Wang
Yan Yan
H. Liao
Bo-wen Li
16
4
0
27 Mar 2018
Neural Baby Talk
Jiasen Lu
Jianwei Yang
Dhruv Batra
Devi Parikh
VLM
191
434
0
27 Mar 2018
Explicit Reasoning over End-to-End Neural Architectures for Visual Question Answering
Somak Aditya
Yezhou Yang
Chitta Baral
LRM
NAI
ReLM
18
53
0
23 Mar 2018
EVA²: Exploiting Temporal Redundancy in Live Computer Vision
Mark Buckler
Philip Bedoukian
Suren Jayasuriya
Adrian Sampson
33
75
0
16 Mar 2018
Object Captioning and Retrieval with Natural Language
A. Nguyen
Thanh-Toan Do
Ian Reid
D. Caldwell
Nikos G. Tsagarakis
3DV
14
18
0
16 Mar 2018
Inverse Visual Question Answering: A New Benchmark and VQA Diagnosis Tool
Feng Liu
Tao Xiang
Timothy M. Hospedales
Wankou Yang
Changyin Sun
17
29
0
16 Mar 2018
Approximate Query Matching for Image Retrieval
Abhijit Suprem
Polo Chau
14
1
0
14 Mar 2018
Less Is More: Picking Informative Frames for Video Captioning
Yangyu Chen
Shuhui Wang
W. Zhang
Qingming Huang
12
200
0
05 Mar 2018
Joint Event Detection and Description in Continuous Video Streams
Huijuan Xu
Boyang Albert Li
Vasili Ramanishka
Leonid Sigal
Kate Saenko
6
51
0
28 Feb 2018
Neural Aesthetic Image Reviewer
Wenshan Wang
Su Yang
Weishan Zhang
Jiulong Zhang
14
38
0
28 Feb 2018
Teaching Machines to Code: Neural Markup Generation with Visual Attention
Sumeet S. Singh
14
7
0
15 Feb 2018
FlipDial: A Generative Model for Two-Way Visual Dialogue
Daniela Massiceti
N. Siddharth
P. Dokania
Philip H. S. Torr
MLLM
27
41
0
11 Feb 2018
Generating Triples with Adversarial Networks for Scene Graph Construction
Matthew Klawonn
Eric Heim
GAN
GNN
19
22
0
07 Feb 2018
E2E-MLT - an Unconstrained End-to-End Method for Multi-Language Scene Text
M. Busta
Yash J. Patel
Jiří Matas
25
91
0
30 Jan 2018
Image denoising and restoration with CNN-LSTM Encoder Decoder with Direct Attention
Kazi Nazmul Haque
M. Yousuf
R. Rana
3DV
19
21
0
16 Jan 2018
TieNet: Text-Image Embedding Network for Common Thorax Disease Classification and Reporting in Chest X-rays
Xiaosong Wang
Yifan Peng
Le Lu
Zhiyong Lu
Ronald M. Summers
MedIm
22
462
0
12 Jan 2018
Visual Text Correction
Amir Mazaheri
M. Shah
36
11
0
06 Jan 2018
Object Referring in Videos with Language and Human Gaze
A. Vasudevan
Dengxin Dai
Luc Van Gool
VOS
21
73
0
04 Jan 2018
Exploring Models and Data for Remote Sensing Image Caption Generation
Xiaoqiang Lu
Binqiang Wang
Xiangtao Zheng
Xuelong Li
14
461
0
21 Dec 2017
Learning to Act Properly: Predicting and Explaining Affordances from Images
Ching-Yao Chuang
Jiaman Li
Antonio Torralba
Sanja Fidler
8
100
0
20 Dec 2017
Attribute CNNs for Word Spotting in Handwritten Documents
Sebastian Sudholt
G. Fink
22
55
0
20 Dec 2017
Beyond the Pixel-Wise Loss for Topology-Aware Delineation
Agata Mosinska
Pablo Márquez-Neila
Mateusz Koziński
Pascal Fua
3DV
18
231
0
06 Dec 2017
Sequence Mining and Pattern Analysis in Drilling Reports with Deep Natural Language Processing
J. Hoffimann
Youli Mao
A. Wesley
Aimee Taylor
8
15
0
05 Dec 2017
Examining Cooperation in Visual Dialog Models
Mircea Mironenco
D. Kianfar
Ke M. Tran
Evangelos Kanoulas
E. Gavves
20
4
0
04 Dec 2017
Discriminative Learning of Open-Vocabulary Object Retrieval and Localization by Negative Phrase Augmentation
Ryota Hinami
Shin'ichi Satoh
ObjD
14
22
0
27 Nov 2017
Conditional Image-Text Embedding Networks
Bryan A. Plummer
Paige Kordas
M. Kiapour
Shuai Zheng
Robinson Piramuthu
Svetlana Lazebnik
13
118
0
22 Nov 2017
On the Automatic Generation of Medical Imaging Reports
Baoyu Jing
P. Xie
Eric P. Xing
MedIm
27
503
0
22 Nov 2017