DenseCap: Fully Convolutional Localization Networks for Dense Captioning

24 November 2015
Justin Johnson
A. Karpathy
Li Fei-Fei
    VLM

Papers citing "DenseCap: Fully Convolutional Localization Networks for Dense Captioning"

50 / 468 papers shown
Visual Question Answering Using Semantic Information from Image Descriptions
Tasmia Tasrin
Md Sultan al Nahian
Brent Harrison
119
0
0
23 Apr 2020
VisualCOMET: Reasoning about the Dynamic Context of a Still Image
J. S. Park
Chandra Bhagavatula
Roozbeh Mottaghi
Ali Farhadi
Yejin Choi
ReLM, LRM
151
6
0
22 Apr 2020
ParaCNN: Visual Paragraph Generation via Adversarial Twin Contextual CNNs
Shiyang Yan
Yang Hua
N. Robertson
129
7
0
21 Apr 2020
Context-Aware Group Captioning via Self-Attention and Contrastive Features
Computer Vision and Pattern Recognition (CVPR), 2020
Zhuowan Li
Quan Hung Tran
Long Mai
Zhe Lin
Alan Yuille
VLM
159
50
0
07 Apr 2020
Semantic Image Manipulation Using Scene Graphs
Computer Vision and Pattern Recognition (CVPR), 2020
Helisa Dhamo
Azade Farshad
Iro Laina
Nassir Navab
Gregory Hager
Federico Tombari
Christian Rupprecht
339
133
0
07 Apr 2020
Consistent Multiple Sequence Decoding
Bicheng Xu
Leonid Sigal
151
0
0
02 Apr 2020
Detection and Description of Change in Visual Streams
Davis Gilton
Ruotian Luo
Rebecca Willett
Gregory Shakhnarovich
AI4TS
162
4
0
27 Mar 2020
Exploring Long Tail Visual Relationship Recognition with Large Vocabulary
IEEE International Conference on Computer Vision (ICCV), 2020
Sherif Abdelkarim
Aniket Agarwal
Panos Achlioptas
Jun Chen
Jiaji Huang
Boyang Albert Li
Kenneth Church
Mohamed Elhoseiny
VLM
425
19
0
25 Mar 2020
Bootstrapping Weakly Supervised Segmentation-free Word Spotting through HMM-based Alignment
International Conference on Frontiers in Handwriting Recognition (ICFHR), 2020
T. Wilkinson
Carl Nettelblad
65
2
0
24 Mar 2020
Multi-modal Dense Video Captioning
Vladimir E. Iashin
Esa Rahtu
289
198
0
17 Mar 2020
PointINS: Point-based Instance Segmentation
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020
Lu Qi
Xinming Zhang
Yukang Chen
Ying-Cong Chen
Xiangyu Zhang
Jian Sun
Jiaya Jia
ISeg, 3DPC
204
31
0
13 Mar 2020
OVC-Net: Object-Oriented Video Captioning with Temporal Graph and Detail Enhancement
Fangyi Zhu
Lei Li
Zhanyu Ma
Guang Chen
Jun Guo
164
1
0
08 Mar 2020
Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs
Computer Vision and Pattern Recognition (CVPR), 2020
Shizhe Chen
Qin Jin
Peng Wang
Qi Wu
DiffM
282
238
0
01 Mar 2020
A Convolutional Baseline for Person Re-Identification Using Vision and Language Descriptions
Ammarah Farooq
Muhammad Awais
F. Yan
J. Kittler
A. Akbari
S. S. Khalid
224
10
0
20 Feb 2020
Weakly Supervised Attention Pyramid Convolutional Neural Network for Fine-Grained Visual Classification
Yifeng Ding
Shaoguo Wen
Jiyang Xie
Dongliang Chang
Zhanyu Ma
Zhongwei Si
Haibin Ling
139
55
0
09 Feb 2020
Visual Concept-Metaconcept Learning
Neural Information Processing Systems (NeurIPS), 2020
Chi Han
Jiayuan Mao
Chuang Gan
J. Tenenbaum
Jiajun Wu
NAI, LRM
148
70
0
04 Feb 2020
Learn to Predict Sets Using Feed-Forward Neural Networks
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020
H. Rezatofighi
Tianyu Zhu
Roman Kaskman
F. Motlagh
Javen Qinfeng Shi
Anton Milan
Zorah Lähner
Laura Leal-Taixé
Ian Reid
SSL
264
17
0
30 Jan 2020
Uncertainty based Class Activation Maps for Visual Question Answering
Badri N. Patro
Mayank Lunayach
Vinay P. Namboodiri
FAtt, UQCV
100
1
0
23 Jan 2020
Deep Bayesian Network for Visual Question Generation
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2020
Badri N. Patro
V. Kurmi
Sandeep Kumar
Vinay P. Namboodiri
BDL
130
18
0
23 Jan 2020
Robust Explanations for Visual Question Answering
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2020
Badri N. Patro
Shivansh Patel
Vinay P. Namboodiri
OOD, AAML
127
22
0
23 Jan 2020
Spatio-Temporal Ranked-Attention Networks for Video Captioning
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2020
A. Cherian
Jue Wang
Chiori Hori
Tim K. Marks
AI4TS
117
22
0
17 Jan 2020
Contextual Sense Making by Fusing Scene Classification, Detections, and Events in Full Motion Video
Marc Bosch
Joseph Nassar
Ben Ortiz
Brendan Lammers
David Lindenbaum
J. Wahl
Robert Mangum
Margaret Smith
77
2
0
16 Jan 2020
CNN 101: Interactive Visual Learning for Convolutional Neural Networks
Zijie J. Wang
Robert Turko
Omar Shaikh
Haekyu Park
Nilaksh Das
Fred Hohman
Minsuk Kahng
Duen Horng Chau
SSL, HAI, FAtt
187
26
0
07 Jan 2020
Personalizing Fast-Forward Videos Based on Visual and Textual Features from Social Network
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2019
W. Ramos
M. Silva
Edson Roteia Araujo Junior
Alan C. Neves
Erickson R. Nascimento
99
7
0
29 Dec 2019
Vision and Language: from Visual Perception to Content Creation
APSIPA Transactions on Signal and Information Processing (APSIPA TSIP), 2019
Tao Mei
Wei Zhang
Ting Yao
VLM
170
8
0
26 Dec 2019
Deep Exemplar Networks for VQA and VQG
Badri N. Patro
Vinay P. Namboodiri
110
4
0
19 Dec 2019
Meshed-Memory Transformer for Image Captioning
Computer Vision and Pattern Recognition (CVPR), 2019
Marcella Cornia
Matteo Stefanini
Lorenzo Baraldi
Rita Cucchiara
214
1,017
0
17 Dec 2019
Neural Network Surgery with Sets
Jonathan Raiman
Susan Zhang
Christy Dennison
78
6
0
13 Dec 2019
Multimodal Self-Supervised Learning for Medical Image Analysis
Information Processing in Medical Imaging (IPMI), 2019
Aiham Taleb
Christoph Lippert
T. Klein
Moin Nabi
SSL
294
120
0
11 Dec 2019
Connecting Vision and Language with Localized Narratives
European Conference on Computer Vision (ECCV), 2019
Jordi Pont-Tuset
J. Uijlings
Soravit Changpinyo
Radu Soricut
V. Ferrari
ObjD
430
285
0
06 Dec 2019
Siamese Natural Language Tracker: Tracking by Natural Language Descriptions with Siamese Trackers
Qi Feng
Vitaly Ablavsky
Qinxun Bai
Stan Sclaroff
197
21
0
04 Dec 2019
Convolutional STN for Weakly Supervised Object Localization
International Conference on Pattern Recognition (ICPR), 2019
Akhil Meethal
M. Pedersoli
Soufiane Belharbi
Mohammadhadi Shateri
WSOL
181
12
0
03 Dec 2019
Orderless Recurrent Models for Multi-label Classification
Computer Vision and Pattern Recognition (CVPR), 2019
V. O. Yazici
Abel Gonzalez-Garcia
Arnau Ramisa
Bartlomiej Twardowski
Joost van de Weijer
SSL
352
104
0
22 Nov 2019
Explanation vs Attention: A Two-Player Game to Obtain Attention for VQA
AAAI Conference on Artificial Intelligence (AAAI), 2019
Badri N. Patro
Anupriy
Vinay P. Namboodiri
AAML, FAtt
131
27
0
19 Nov 2019
DualVD: An Adaptive Dual Encoding Model for Deep Visual Understanding in Visual Dialogue
AAAI Conference on Artificial Intelligence (AAAI), 2019
X. Jiang
Jiahao Yu
Zengchang Qin
Yingying Zhuang
Xingxing Zhang
Yue Hu
Qi Wu
171
70
0
17 Nov 2019
Multimodal Intelligence: Representation Learning, Information Fusion, and Applications
IEEE Journal on Selected Topics in Signal Processing (JSTSP), 2019
Chao Zhang
Zichao Yang
Xiaodong He
Li Deng
HAI, AI4TS
279
396
0
10 Nov 2019
Predicting the Politics of an Image Using Webly Supervised Data
Neural Information Processing Systems (NeurIPS), 2019
Christopher Thomas
Adriana Kovashka
SSL
195
24
0
31 Oct 2019
Movienet: A Movie Multilayer Network Model using Visual and Textual Semantic Cues
Applied Network Science (Appl Netw Sci), 2019
Youssef Mourchid
B. Renoust
Olivier Roupin
Lê Văn
H. Cherifi
Mohammed El Hassouni
169
10
0
18 Oct 2019
Dynamic Attention Networks for Task Oriented Grounding
S. Dasgupta
Badri N. Patro
Vinay P. Namboodiri
150
1
0
14 Oct 2019
Granular Multimodal Attention Networks for Visual Dialog
Badri N. Patro
Shivansh Patel
Vinay P. Namboodiri
200
2
0
13 Oct 2019
SMArT: Training Shallow Memory-aware Transformers for Robotic Explainability
IEEE International Conference on Robotics and Automation (ICRA), 2019
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
257
29
0
07 Oct 2019
3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera
IEEE International Conference on Computer Vision (ICCV), 2019
Iro Armeni
Zhi-Yang He
JunYoung Gwak
Amir Zamir
Martin Fischer
Jitendra Malik
Silvio Savarese
3DV, 3DPC
265
410
0
06 Oct 2019
A Hierarchical Approach for Visual Storytelling Using Image Description
International Conference on Interactive Digital Storytelling (ICIDS), 2019
Md Sultan al Nahian
Tasmia Tasrin
Sagar Gandhi
Ryan Gaines
Brent Harrison
103
14
0
26 Sep 2019
Inverse Visual Question Answering with Multi-Level Attentions
Asian Conference on Machine Learning (ACML), 2019
Yaser Alwatter
Yuhong Guo
BDL
147
1
0
17 Sep 2019
Probabilistic framework for solving Visual Dialog
Pattern Recognition (Pattern Recognit.), 2019
Badri N. Patro
Anupriy
Vinay P. Namboodiri
BDL
277
13
0
11 Sep 2019
FDA: Feature Disruptive Attack
IEEE International Conference on Computer Vision (ICCV), 2019
Aditya Ganeshan
S. VivekB.
R. Venkatesh Babu
AAML
214
129
0
10 Sep 2019
Image Captioning with Very Scarce Supervised Data: Adversarial Semi-Supervised Learning Approach
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Dong-Jin Kim
Jinsoo Choi
Tae-Hyun Oh
In So Kweon
SSL, VLM
190
59
0
05 Sep 2019
Aesthetic Image Captioning From Weakly-Labelled Photographs
Koustav Ghosal
A. Rana
A. Smolic
174
28
0
29 Aug 2019
Towards Unsupervised Image Captioning with Shared Multimodal Embeddings
IEEE International Conference on Computer Vision (ICCV), 2019
Iro Laina
Christian Rupprecht
Nassir Navab
SSL
154
112
0
25 Aug 2019
Sequential Latent Spaces for Modeling the Intention During Diverse Image Captioning
IEEE International Conference on Computer Vision (ICCV), 2019
J. Aneja
Harsh Agrawal
Dhruv Batra
Alex Schwing
BDL, VLM
129
69
0
22 Aug 2019