DenseCap: Fully Convolutional Localization Networks for Dense Captioning

24 November 2015
Justin Johnson
A. Karpathy
Li Fei-Fei
    VLM

Papers citing "DenseCap: Fully Convolutional Localization Networks for Dense Captioning"

50 / 452 papers shown
Title
HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips
Antoine Miech
Dimitri Zhukov
Jean-Baptiste Alayrac
Makarand Tapaswi
Ivan Laptev
Josef Sivic
VGen
25
1,172
0
07 Jun 2019
Context-Aware Visual Policy Network for Fine-Grained Image Captioning
Zhengjun Zha
Daqing Liu
Hanwang Zhang
Yongdong Zhang
Feng Wu
12
119
0
06 Jun 2019
Contextual Translation Embedding for Visual Relationship Detection and Scene Graph Generation
Zih-Siou Hung
Arun Mallya
Svetlana Lazebnik
ViT
12
14
0
28 May 2019
Beyond Visual Semantics: Exploring the Role of Scene Text in Image Understanding
Arka Ujjal Dey
Suman K. Ghosh
Ernest Valveny
Gaurav Harit
25
23
0
25 May 2019
AttentionRNN: A Structured Spatial Attention Mechanism
Siddhesh Khandelwal
Leonid Sigal
8
3
0
22 May 2019
Joint Object and State Recognition using Language Knowledge
Ahmad Babaeian Jelodar
Yu Sun
20
18
0
13 May 2019
Image Captioning with Clause-Focused Metrics in a Multi-Modal Setting for Marketing
Philipp Harzig
D. Zecha
Rainer Lienhart
Carolin Kaiser
René Schallner
14
2
0
06 May 2019
The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision
Jiayuan Mao
Chuang Gan
Pushmeet Kohli
J. Tenenbaum
Jiajun Wu
NAI
8
685
0
26 Apr 2019
Challenges and Prospects in Vision and Language Research
Kushal Kafle
Robik Shrestha
Christopher Kanan
14
41
0
19 Apr 2019
A Simple Baseline for Audio-Visual Scene-Aware Dialog
Idan Schwartz
A. Schwing
Tamir Hazan
19
69
0
11 Apr 2019
Reasoning Visual Dialogs with Structural and Partial Observations
Zilong Zheng
Wenguan Wang
Siyuan Qi
Song-Chun Zhu
30
117
0
11 Apr 2019
Modularized Textual Grounding for Counterfactual Resilience
Zhiyuan Fang
Shu Kong
Charless C. Fowlkes
Yezhou Yang
12
32
0
07 Apr 2019
VQD: Visual Query Detection in Natural Scenes
Manoj Acharya
Karan Jariwala
Christopher Kanan
ObjD
16
18
0
04 Apr 2019
Context and Attribute Grounded Dense Captioning
Guojun Yin
Lu Sheng
Bin Liu
Nenghai Yu
Xiaogang Wang
Jing Shao
16
75
0
02 Apr 2019
Recurrent Back-Projection Network for Video Super-Resolution
Muhammad Haris
Gregory Shakhnarovich
Norimichi Ukita
SupR
20
430
0
25 Mar 2019
Neural Sequential Phrase Grounding (SeqGROUND)
Pelin Dogan
Leonid Sigal
Markus Gross
ObjD
16
51
0
18 Mar 2019
Dense Relational Captioning: Triple-Stream Networks for Relationship-Based Captioning
Dong-Jin Kim
Jinsoo Choi
Tae-Hyun Oh
In So Kweon
6
84
0
14 Mar 2019
Learning To Follow Directions in Street View
Karl Moritz Hermann
Mateusz Malinowski
Piotr Wojciech Mirowski
Andras Banki-Horvath
Keith Anderson
R. Hadsell
SSL
16
66
0
01 Mar 2019
CHIP: Channel-wise Disentangled Interpretation of Deep Convolutional Neural Networks
Xinrui Cui
Dan Wang
F. I. Z. Jane Wang
FAtt
BDL
14
12
0
07 Feb 2019
Linearized Multi-Sampling for Differentiable Image Transformation
Wei Jiang
Weiwei Sun
Andrea Tagliasacchi
Eduard Trulls
K. M. Yi
25
23
0
22 Jan 2019
LayoutGAN: Generating Graphic Layouts with Wireframe Discriminators
Jianan Li
Jimei Yang
Aaron Hertzmann
Jianming Zhang
Tingfa Xu
GAN
13
226
0
21 Jan 2019
Visual Entailment: A Novel Task for Fine-Grained Image Understanding
Ning Xie
Farley Lai
Derek Doran
Asim Kadav
CoGe
31
321
0
20 Jan 2019
Toward Explainable Fashion Recommendation
Pongsate Tangseng
Takayuki Okatani
8
29
0
15 Jan 2019
Epipolar Geometry based Learning of Multi-view Depth and Ego-Motion from Monocular Sequences
V. Prasad
Dipanjan Das
Brojeshwar Bhowmick
MDE
15
9
0
23 Dec 2018
SfMLearner++: Learning Monocular Depth & Ego-Motion using Meaningful Geometric Constraints
V. Prasad
Brojeshwar Bhowmick
MDE
20
26
0
20 Dec 2018
Detecting unseen visual relations using analogies
Julia Peyre
Ivan Laptev
Cordelia Schmid
Josef Sivic
11
18
0
13 Dec 2018
Visual Social Relationship Recognition
Junnan Li
Yongkang Wong
Qi Zhao
Mohan S. Kankanhalli
33
27
0
13 Dec 2018
Coarse-to-fine: A RNN-based hierarchical attention model for vehicle re-identification
Xiu-Shen Wei
Chen-Da Liu-Zhang
Lingqiao Liu
Chunhua Shen
Jianxin Wu
6
43
0
11 Dec 2018
Neural Word Search in Historical Manuscript Collections
T. Wilkinson
Jonas Lindström
Anders Brun
3DV
20
8
0
06 Dec 2018
Interactive Full Image Segmentation by Considering All Regions Jointly
E. Agustsson
J. Uijlings
V. Ferrari
VLM
10
74
0
05 Dec 2018
Visual Question Answering as Reading Comprehension
Hui Li
Peng Wang
Chunhua Shen
A. Hengel
9
40
0
29 Nov 2018
Multi-level Multimodal Common Semantic Space for Image-Phrase Grounding
Hassan Akbari
Svebor Karaman
Surabhi Bhargava
Brian Chen
Carl Vondrick
Shih-Fu Chang
12
81
0
28 Nov 2018
MIST: Multiple Instance Spatial Transformer Network
Baptiste Angles
Shahram Izadi
Simon Kornblith
Andrea Tagliasacchi
K. M. Yi
17
5
0
26 Nov 2018
Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
DiffM
18
175
0
26 Nov 2018
Visual Entailment Task for Visually-Grounded Language Learning
Ning Xie
Farley Lai
Derek Doran
Asim Kadav
15
53
0
26 Nov 2018
Senti-Attend: Image Captioning using Sentiment and Attention
Omid Mohamad Nezami
Mark Dras
Stephen Wan
Cécile Paris
VLM
11
15
0
24 Nov 2018
Object-oriented Targets for Visual Navigation using Rich Semantic Representations
Jean-Benoit Delbrouck
Stéphane Dupont
23
3
0
22 Nov 2018
Intention Oriented Image Captions with Guiding Objects
Yue Zheng
Yali Li
Shengjin Wang
11
55
0
19 Nov 2018
Revisiting Image-Language Networks for Open-ended Phrase Detection
Bryan A. Plummer
Kevin J. Shih
Yichen Li
Ke Xu
Svetlana Lazebnik
Stan Sclaroff
Kate Saenko
ObjD
SSeg
13
4
0
17 Nov 2018
Image Captioning as Neural Machine Translation Task in SOCKEYE
Loris Bazzani
Tobias Domhan
F. Hieber
VLM
17
2
0
09 Oct 2018
A Comprehensive Survey of Deep Learning for Image Captioning
Md. Zakir Hossain
Ferdous Sohel
M. Shiratuddin
Hamid Laga
VLM
3DV
28
760
0
06 Oct 2018
Team NimbRo at MBZIRC 2017: Autonomous Valve Stem Turning using a Wrench
Max Schwarz
David Droeschel
Christian Lenz
Arul Selvam Periyasamy
En Yen Puang
Jan Razlaw
Diego Rodriguez
Sebastian Schüller
M. Schreiber
Sven Behnke
6
16
0
06 Oct 2018
RGB-D Object Detection and Semantic Segmentation for Autonomous Manipulation in Clutter
Max Schwarz
Anton Milan
Arul Selvam Periyasamy
Sven Behnke
3DPC
19
162
0
01 Oct 2018
Vector Learning for Cross Domain Representations
Shagan Sah
Chi Zhang
Thang Nguyen
D. Peri
Ameya Shringi
R. Ptucha
GAN
11
3
0
27 Sep 2018
Object Detection from Scratch with Deep Supervision
Zhiqiang Shen
Zhuang Liu
Jianguo Li
Yu-Gang Jiang
Yurong Chen
Xiangyang Xue
ObjD
9
77
0
25 Sep 2018
Image Reassembly Combining Deep Learning and Shortest Path Problem
Marie-Morgane Paumard
David Picard
Hedi Tabia
OCL
3DV
19
24
0
04 Sep 2018
Diverse and Coherent Paragraph Generation from Images
Moitreya Chatterjee
A. Schwing
11
66
0
03 Sep 2018
Wavelet based edge feature enhancement for convolutional neural networks
Dedimuni D. De Silva
Subha Fernando
I. Piyatilake
A. Karunarathne
38
16
0
29 Aug 2018
Multimodal Differential Network for Visual Question Generation
Badri N. Patro
Sandeep Kumar
V. Kurmi
Vinay P. Namboodiri
13
41
0
12 Aug 2018
Community Regularization of Visually-Grounded Dialog
Akshat Agarwal
Swaminathan Gurumurthy
Vasu Sharma
M. Lewis
Katia P. Sycara
18
10
0
10 Aug 2018