ResearchTrend.AI
arXiv: 1511.07571
DenseCap: Fully Convolutional Localization Networks for Dense Captioning

24 November 2015
Justin Johnson
A. Karpathy
Li Fei-Fei
    VLM

Papers citing "DenseCap: Fully Convolutional Localization Networks for Dense Captioning"

50 / 452 papers shown
Improving Deep Visual Representation for Person Re-identification by Global and Local Image-language Association
Dapeng Chen
Hongsheng Li
Xihui Liu
Yantao Shen
Zejian Yuan
Xiaogang Wang
20
134
0
05 Aug 2018
Equal But Not The Same: Understanding the Implicit Relationship Between Persuasive Images and Text
Mingda Zhang
R. Hwa
Adriana Kovashka
11
54
0
21 Jul 2018
Presentation Attack Detection for Cadaver Iris
Mateusz Trokielewicz
A. Czajka
P. Maciejewicz
CVBM
17
24
0
11 Jul 2018
Dynamic Multimodal Instance Segmentation guided by natural language queries
Edgar Margffoy-Tuay
Juan C. Pérez
Emilio Botero
Pablo Arbelaez
22
170
0
06 Jul 2018
Face-Cap: Image Captioning using Facial Expression Analysis
Omid Mohamad Nezami
Mark Dras
Peter Anderson
Len Hamey
CVBM
16
27
0
06 Jul 2018
Factorizable Net: An Efficient Subgraph-based Framework for Scene Graph Generation
Yikang Li
Wanli Ouyang
Bolei Zhou
Jianping Shi
Yawen Cui
Xiaogang Wang
GNN
17
273
0
29 Jun 2018
Learning Multimodal Representations for Unseen Activities
A. Piergiovanni
Michael S. Ryoo
SSL
14
4
0
21 Jun 2018
Part-Aware Fine-grained Object Categorization using Weakly Supervised Part Detection Network
Yabin Zhang
K. Jia
Zhixin Wang
6
23
0
16 Jun 2018
Interactive Visual Grounding of Referring Expressions for Human-Robot Interaction
Mohit Shridhar
David Hsu
14
141
0
11 Jun 2018
Video Description: A Survey of Methods, Datasets and Evaluation Metrics
Nayyer Aafaq
Ajmal Saeed Mian
W. Liu
Syed Zulqarnain Gilani
Mubarak Shah
6
91
0
01 Jun 2018
GLAC Net: GLocal Attention Cascading Networks for Multi-image Cued Story Generation
Taehyeong Kim
Min-Oh Heo
Seonil Son
Kyoung-Wha Park
Byoung-Tak Zhang
21
75
0
28 May 2018
Identifying Object States in Cooking-Related Images
Ahmad Babaeian Jelodar
Md Sirajus Salekin
Yu Sun
17
37
0
17 May 2018
Deep Perm-Set Net: Learn to predict sets with unknown permutation and cardinality using deep neural networks
S. Hamid Rezatofighi
Roman Kaskman
F. Motlagh
Javen Qinfeng Shi
Daniel Cremers
Laura Leal-Taixé
Ian Reid
SSL
20
23
0
02 May 2018
Large-Scale Visual Relationship Understanding
Ji Zhang
Yannis Kalantidis
Marcus Rohrbach
Manohar Paluri
Ahmed Elgammal
Mohamed Elhoseiny
14
166
0
27 Apr 2018
Customized Image Narrative Generation via Interactive Visual Question Generation and Answering
Andrew Shin
Yoshitaka Ushiku
Tatsuya Harada
34
7
0
27 Apr 2018
Entity-aware Image Caption Generation
Di Lu
Spencer Whitehead
Lifu Huang
Heng Ji
Shih-Fu Chang
VLM
12
82
0
21 Apr 2018
Multilevel Language and Vision Integration for Text-to-Clip Retrieval
Huijuan Xu
Kun He
Bryan A. Plummer
Leonid Sigal
Stan Sclaroff
Kate Saenko
CLIP
9
319
0
13 Apr 2018
Hybrid Binary Networks: Optimizing for Accuracy, Efficiency and Memory
Ameya Prabhu
Vishal Batchu
Rohit Gajawada
Sri Aurobindo Munagala
A. Namboodiri
MQ
25
18
0
11 Apr 2018
Decoupled Novel Object Captioner
Yuehua Wu
Linchao Zhu
Lu Jiang
Yi Yang
10
62
0
11 Apr 2018
Learning a Text-Video Embedding from Incomplete and Heterogeneous Data
Antoine Miech
Ivan Laptev
Josef Sivic
22
233
0
07 Apr 2018
Guess Where? Actor-Supervision for Spatiotemporal Action Localization
Victor Escorcia
Cuong Duc Dao
Mihir Jain
Bernard Ghanem
Cees G. M. Snoek
22
29
0
05 Apr 2018
Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input
David F. Harwath
Adrià Recasens
Dídac Surís
Galen Chuang
Antonio Torralba
James R. Glass
19
201
0
04 Apr 2018
Guide Me: Interacting with Deep Networks
Christian Rupprecht
Iro Laina
Nassir Navab
Gregory Hager
Federico Tombari
HAI
27
38
0
30 Mar 2018
A New Target-specific Object Proposal Generation Method for Visual Tracking
Guanjun Guo
Hanzi Wang
Yan Yan
H. Liao
Bo-wen Li
16
4
0
27 Mar 2018
Neural Baby Talk
Jiasen Lu
Jianwei Yang
Dhruv Batra
Devi Parikh
VLM
191
434
0
27 Mar 2018
Explicit Reasoning over End-to-End Neural Architectures for Visual Question Answering
Somak Aditya
Yezhou Yang
Chitta Baral
LRM
NAI
ReLM
18
53
0
23 Mar 2018
EVA²: Exploiting Temporal Redundancy in Live Computer Vision
Mark Buckler
Philip Bedoukian
Suren Jayasuriya
Adrian Sampson
33
75
0
16 Mar 2018
Object Captioning and Retrieval with Natural Language
A. Nguyen
Thanh-Toan Do
Ian Reid
D. Caldwell
Nikos G. Tsagarakis
3DV
14
18
0
16 Mar 2018
Inverse Visual Question Answering: A New Benchmark and VQA Diagnosis Tool
Feng Liu
Tao Xiang
Timothy M. Hospedales
Wankou Yang
Changyin Sun
17
29
0
16 Mar 2018
Approximate Query Matching for Image Retrieval
Abhijit Suprem
Polo Chau
14
1
0
14 Mar 2018
Less Is More: Picking Informative Frames for Video Captioning
Yangyu Chen
Shuhui Wang
W. Zhang
Qingming Huang
12
200
0
05 Mar 2018
Joint Event Detection and Description in Continuous Video Streams
Huijuan Xu
Boyang Albert Li
Vasili Ramanishka
Leonid Sigal
Kate Saenko
6
51
0
28 Feb 2018
Neural Aesthetic Image Reviewer
Wenshan Wang
Su Yang
Weishan Zhang
Jiulong Zhang
14
38
0
28 Feb 2018
Teaching Machines to Code: Neural Markup Generation with Visual Attention
Sumeet S. Singh
14
7
0
15 Feb 2018
FlipDial: A Generative Model for Two-Way Visual Dialogue
Daniela Massiceti
N. Siddharth
P. Dokania
Philip H. S. Torr
MLLM
27
41
0
11 Feb 2018
Generating Triples with Adversarial Networks for Scene Graph Construction
Matthew Klawonn
Eric Heim
GAN
GNN
19
22
0
07 Feb 2018
E2E-MLT - an Unconstrained End-to-End Method for Multi-Language Scene Text
M. Busta
Yash J. Patel
Jiří Matas
25
91
0
30 Jan 2018
Image denoising and restoration with CNN-LSTM Encoder Decoder with Direct Attention
Kazi Nazmul Haque
M. Yousuf
R. Rana
3DV
19
21
0
16 Jan 2018
TieNet: Text-Image Embedding Network for Common Thorax Disease Classification and Reporting in Chest X-rays
Xiaosong Wang
Yifan Peng
Le Lu
Zhiyong Lu
Ronald M. Summers
MedIm
22
462
0
12 Jan 2018
Visual Text Correction
Amir Mazaheri
M. Shah
36
11
0
06 Jan 2018
Object Referring in Videos with Language and Human Gaze
A. Vasudevan
Dengxin Dai
Luc Van Gool
VOS
21
73
0
04 Jan 2018
Exploring Models and Data for Remote Sensing Image Caption Generation
Xiaoqiang Lu
Binqiang Wang
Xiangtao Zheng
Xuelong Li
14
461
0
21 Dec 2017
Learning to Act Properly: Predicting and Explaining Affordances from Images
Ching-Yao Chuang
Jiaman Li
Antonio Torralba
Sanja Fidler
8
100
0
20 Dec 2017
Attribute CNNs for Word Spotting in Handwritten Documents
Sebastian Sudholt
G. Fink
22
55
0
20 Dec 2017
Beyond the Pixel-Wise Loss for Topology-Aware Delineation
Agata Mosinska
Pablo Márquez-Neila
Mateusz Koziński
Pascal Fua
3DV
18
231
0
06 Dec 2017
Sequence Mining and Pattern Analysis in Drilling Reports with Deep Natural Language Processing
J. Hoffimann
Youli Mao
A. Wesley
Aimee Taylor
8
15
0
05 Dec 2017
Examining Cooperation in Visual Dialog Models
Mircea Mironenco
D. Kianfar
Ke M. Tran
Evangelos Kanoulas
E. Gavves
20
4
0
04 Dec 2017
Discriminative Learning of Open-Vocabulary Object Retrieval and Localization by Negative Phrase Augmentation
Ryota Hinami
Shin'ichi Satoh
ObjD
14
22
0
27 Nov 2017
Conditional Image-Text Embedding Networks
Bryan A. Plummer
Paige Kordas
M. Kiapour
Shuai Zheng
Robinson Piramuthu
Svetlana Lazebnik
13
118
0
22 Nov 2017
On the Automatic Generation of Medical Imaging Reports
Baoyu Jing
P. Xie
Eric P. Xing
MedIm
27
503
0
22 Nov 2017