ResearchTrend.AI
DenseCap: Fully Convolutional Localization Networks for Dense Captioning

24 November 2015
Justin Johnson
A. Karpathy
Li Fei-Fei
    VLM

Papers citing "DenseCap: Fully Convolutional Localization Networks for Dense Captioning"

50 / 452 papers shown
Learn to Predict Sets Using Feed-Forward Neural Networks
H. Rezatofighi
Tianyu Zhu
Roman Kaskman
F. Motlagh
Javen Qinfeng Shi
Anton Milan
Daniel Cremers
Laura Leal-Taixé
Ian Reid
SSL
46
15
0
30 Jan 2020
Uncertainty based Class Activation Maps for Visual Question Answering
Badri N. Patro
Mayank Lunayach
Vinay P. Namboodiri
FAtt
UQCV
11
1
0
23 Jan 2020
Deep Bayesian Network for Visual Question Generation
Badri N. Patro
V. Kurmi
Sandeep Kumar
Vinay P. Namboodiri
BDL
9
19
0
23 Jan 2020
Robust Explanations for Visual Question Answering
Badri N. Patro
Shivansh Patel
Vinay P. Namboodiri
OOD
AAML
4
20
0
23 Jan 2020
Spatio-Temporal Ranked-Attention Networks for Video Captioning
A. Cherian
Jue Wang
Chiori Hori
Tim K. Marks
AI4TS
15
19
0
17 Jan 2020
Contextual Sense Making by Fusing Scene Classification, Detections, and Events in Full Motion Video
Marc Bosch
Joseph Nassar
Ben Ortiz
Brendan Lammers
David Lindenbaum
J. Wahl
Robert Mangum
Margaret Smith
11
2
0
16 Jan 2020
CNN 101: Interactive Visual Learning for Convolutional Neural Networks
Zijie J. Wang
Robert Turko
Omar Shaikh
Haekyu Park
Nilaksh Das
Fred Hohman
Minsuk Kahng
Duen Horng Chau
SSL
HAI
FAtt
11
25
0
07 Jan 2020
Personalizing Fast-Forward Videos Based on Visual and Textual Features from Social Network
W. Ramos
M. Silva
Edson Roteia Araujo Junior
Alan C. Neves
Erickson R. Nascimento
14
6
0
29 Dec 2019
Vision and Language: from Visual Perception to Content Creation
Tao Mei
Wei Zhang
Ting Yao
VLM
8
8
0
26 Dec 2019
Deep Exemplar Networks for VQA and VQG
Badri N. Patro
Vinay P. Namboodiri
13
4
0
19 Dec 2019
Meshed-Memory Transformer for Image Captioning
Marcella Cornia
Matteo Stefanini
Lorenzo Baraldi
Rita Cucchiara
14
868
0
17 Dec 2019
Neural Network Surgery with Sets
Jonathan Raiman
Susan Zhang
Christy Dennison
11
4
0
13 Dec 2019
Multimodal Self-Supervised Learning for Medical Image Analysis
Aiham Taleb
C. Lippert
T. Klein
Moin Nabi
SSL
8
95
0
11 Dec 2019
Connecting Vision and Language with Localized Narratives
Jordi Pont-Tuset
J. Uijlings
Soravit Changpinyo
Radu Soricut
V. Ferrari
ObjD
22
240
0
06 Dec 2019
Siamese Natural Language Tracker: Tracking by Natural Language Descriptions with Siamese Trackers
Qi Feng
Vitaly Ablavsky
Qinxun Bai
Stan Sclaroff
21
17
0
04 Dec 2019
Convolutional STN for Weakly Supervised Object Localization
Akhil Meethal
M. Pedersoli
Soufiane Belharbi
Eric Granger
WSOL
14
12
0
03 Dec 2019
Orderless Recurrent Models for Multi-label Classification
V. O. Yazici
Abel Gonzalez-Garcia
Arnau Ramisa
Bartlomiej Twardowski
Joost van de Weijer
SSL
9
92
0
22 Nov 2019
Explanation vs Attention: A Two-Player Game to Obtain Attention for VQA
Badri N. Patro
Anupriy
Vinay P. Namboodiri
AAML
FAtt
37
26
0
19 Nov 2019
DualVD: An Adaptive Dual Encoding Model for Deep Visual Understanding in Visual Dialogue
X. Jiang
J. Yu
Zengchang Qin
Yingying Zhuang
Xingxing Zhang
Yue Hu
Qi Wu
15
70
0
17 Nov 2019
Multimodal Intelligence: Representation Learning, Information Fusion, and Applications
Chao Zhang
Zichao Yang
Xiaodong He
Li Deng
HAI
AI4TS
27
320
0
10 Nov 2019
Predicting the Politics of an Image Using Webly Supervised Data
Christopher Thomas
Adriana Kovashka
SSL
14
21
0
31 Oct 2019
Movienet: A Movie Multilayer Network Model using Visual and Textual Semantic Cues
Youssef Mourchid
B. Renoust
Olivier Roupin
Lê Văn
H. Cherifi
Mohammed El Hassouni
9
10
0
18 Oct 2019
Dynamic Attention Networks for Task Oriented Grounding
S. Dasgupta
Badri N. Patro
Vinay P. Namboodiri
16
1
0
14 Oct 2019
Granular Multimodal Attention Networks for Visual Dialog
Badri N. Patro
Shivansh Patel
Vinay P. Namboodiri
22
1
0
13 Oct 2019
SMArT: Training Shallow Memory-aware Transformers for Robotic Explainability
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
6
27
0
07 Oct 2019
3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera
Iro Armeni
Zhi-Yang He
JunYoung Gwak
Amir Zamir
Martin Fischer
Jitendra Malik
Silvio Savarese
3DV
3DPC
28
336
0
06 Oct 2019
A Hierarchical Approach for Visual Storytelling Using Image Description
Md Sultan al Nahian
Tasmia Tasrin
Sagar Gandhi
Ryan Gaines
Brent Harrison
7
11
0
26 Sep 2019
Inverse Visual Question Answering with Multi-Level Attentions
Yaser Alwatter
Yuhong Guo
BDL
14
1
0
17 Sep 2019
Probabilistic framework for solving Visual Dialog
Badri N. Patro
Anupriy
Vinay P. Namboodiri
BDL
22
13
0
11 Sep 2019
FDA: Feature Disruptive Attack
Aditya Ganeshan
S. VivekB.
R. Venkatesh Babu
AAML
6
100
0
10 Sep 2019
Image Captioning with Very Scarce Supervised Data: Adversarial Semi-Supervised Learning Approach
Dong-Jin Kim
Jinsoo Choi
Tae-Hyun Oh
In So Kweon
SSL
VLM
14
56
0
05 Sep 2019
Aesthetic Image Captioning From Weakly-Labelled Photographs
Koustav Ghosal
A. Rana
A. Smolic
17
25
0
29 Aug 2019
Towards Unsupervised Image Captioning with Shared Multimodal Embeddings
Iro Laina
Christian Rupprecht
Nassir Navab
SSL
15
103
0
25 Aug 2019
Sequential Latent Spaces for Modeling the Intention During Diverse Image Captioning
J. Aneja
Harsh Agrawal
Dhruv Batra
A. Schwing
BDL
VLM
18
66
0
22 Aug 2019
Anomaly Detection in Video Sequence with Appearance-Motion Correspondence
Trong-Nguyen Nguyen
J. Meunier
8
342
0
17 Aug 2019
U-CAM: Visual Explanation using Uncertainty based Class Activation Maps
Badri N. Patro
Mayank Lunayach
Shivansh Patel
Vinay P. Namboodiri
FAtt
UQCV
19
76
0
17 Aug 2019
Survey on Deep Neural Networks in Speech and Vision Systems
M. Alam
Manar D. Samad
Lasitha Vidyaratne
Alexander M. Glandon
Khan M. Iftekharuddin
3DV
VLM
AI4TS
26
205
0
16 Aug 2019
Image Captioning using Facial Expression and Attention
Omid Mohamad Nezami
Mark Dras
Stephen Wan
Cécile Paris
CVBM
17
8
0
08 Aug 2019
Addressing Data Bias Problems for Chest X-ray Image Report Generation
Philipp Harzig
Yan-Ying Chen
Francine Chen
Rainer Lienhart
MedIm
6
50
0
06 Aug 2019
Logic could be learned from images
Q. Guo
Y. Qian
Xinyan Liang
Yanhong She
Deyu Li
Jiye Liang
NAI
15
4
0
06 Aug 2019
Cascaded Revision Network for Novel Object Captioning
Qianyu Feng
Yu Wu
Hehe Fan
C. Yan
Yezhou Yang
18
35
0
06 Aug 2019
Prediction and Description of Near-Future Activities in Video
T. Mahmud
Mohammad Billah
Mahmudul Hasan
A. Roy-Chowdhury
10
16
0
02 Aug 2019
Curiosity-driven Reinforcement Learning for Diverse Visual Paragraph Generation
Yadan Luo
Zi Huang
Zheng-Wei Zhang
Ziwei Wang
Jingjing Li
Yang Yang
24
40
0
01 Aug 2019
ShapeCaptioner: Generative Caption Network for 3D Shapes by Learning a Mapping from Parts Detected in Multiple Views to Sentences
Zhizhong Han
Chao Chen
Yu-Shen Liu
Matthias Zwicker
3DPC
27
45
0
31 Jul 2019
Real-time Visual Object Tracking with Natural Language Description
Qi Feng
Vitaly Ablavsky
Qinxun Bai
Guorong Li
Stan Sclaroff
VLM
ObjD
VOT
8
49
0
26 Jul 2019
Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods
Aditya Mogadala
M. Kalimuthu
Dietrich Klakow
VLM
15
132
0
22 Jul 2019
Image Captioning with Integrated Bottom-Up and Multi-level Residual Top-Down Attention for Game Scene Understanding
Jian Zheng
S. Krishnamurthy
Ruxin Chen
Min-Hung Chen
Zhenhao Ge
Xiaohua Li
35
4
0
16 Jun 2019
Speeding up VP9 Intra Encoder with Hierarchical Deep Learning Based Partition Prediction
Somdyuti Paul
A. Norkin
A. Bovik
16
14
0
15 Jun 2019
Improving Visual Question Answering by Referring to Generated Paragraph Captions
Hyounghun Kim
Mohit Bansal
CoGe
11
20
0
14 Jun 2019
Image Captioning: Transforming Objects into Words
Simão Herdade
Armin Kappeler
K. Boakye
Joao Soares
ViT
8
462
0
14 Jun 2019