DenseCap: Fully Convolutional Localization Networks for Dense Captioning

24 November 2015
Justin Johnson
A. Karpathy
Li Fei-Fei
    VLM

Papers citing "DenseCap: Fully Convolutional Localization Networks for Dense Captioning"

50 / 468 papers shown
FlipDial: A Generative Model for Two-Way Visual Dialogue
Daniela Massiceti
N. Siddharth
P. Dokania
Juil Sock
MLLM
135
42
0
11 Feb 2018
Generating Triples with Adversarial Networks for Scene Graph Construction
James Fairbanks
Eric Heim
GAN
GNN
121
23
0
07 Feb 2018
E2E-MLT - an Unconstrained End-to-End Method for Multi-Language Scene Text
M. Busta
Yash J. Patel
Jiří Matas
117
100
0
30 Jan 2018
Image denoising and restoration with CNN-LSTM Encoder Decoder with Direct Attention
Kazi Nazmul Haque
M. Yousuf
R. Rana
3DV
88
22
0
16 Jan 2018
TieNet: Text-Image Embedding Network for Common Thorax Disease Classification and Reporting in Chest X-rays
Xiaosong Wang
Yifan Peng
Le Lu
Zhiyong Lu
Ronald M. Summers
MedIm
165
516
0
12 Jan 2018
Visual Text Correction
Amir Mazaheri
M. Shah
137
11
0
06 Jan 2018
Object Referring in Videos with Language and Human Gaze
A. Vasudevan
Dengxin Dai
Luc Van Gool
VOS
197
82
0
04 Jan 2018
Exploring Models and Data for Remote Sensing Image Caption Generation
IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2017
Xiaoqiang Lu
Binqiang Wang
Xiangtao Zheng
Xuelong Li
171
614
0
21 Dec 2017
Learning to Act Properly: Predicting and Explaining Affordances from Images
Ching-Yao Chuang
Jiaman Li
Antonio Torralba
Sanja Fidler
185
119
0
20 Dec 2017
Attribute CNNs for Word Spotting in Handwritten Documents
International Journal on Document Analysis and Recognition (IJDAR), 2017
Sebastian Sudholt
G. Fink
119
56
0
20 Dec 2017
Beyond the Pixel-Wise Loss for Topology-Aware Delineation
Agata Mosinska
Pablo Márquez-Neila
Mateusz Koziński
Pascal Fua
3DV
159
245
0
06 Dec 2017
Sequence Mining and Pattern Analysis in Drilling Reports with Deep Natural Language Processing
J. Hoffimann
Youli Mao
A. Wesley
Aimee Taylor
44
16
0
05 Dec 2017
Examining Cooperation in Visual Dialog Models
Mircea Mironenco
D. Kianfar
Ke M. Tran
Evangelos Kanoulas
E. Gavves
111
4
0
04 Dec 2017
Discriminative Learning of Open-Vocabulary Object Retrieval and Localization by Negative Phrase Augmentation
Ryota Hinami
Shin'ichi Satoh
ObjD
103
23
0
27 Nov 2017
Conditional Image-Text Embedding Networks
Bryan A. Plummer
Paige Kordas
M. Kiapour
Shuai Zheng
Robinson Piramuthu
Svetlana Lazebnik
312
124
0
22 Nov 2017
On the Automatic Generation of Medical Imaging Reports
Baoyu Jing
P. Xie
Eric Xing
MedIm
276
597
0
22 Nov 2017
Asking the Difficult Questions: Goal-Oriented Visual Question Generation via Intermediate Rewards
Junjie Zhang
Qi Wu
Chunhua Shen
Jian Zhang
Jianfeng Lu
Anton Van Den Hengel
LRM
145
29
0
21 Nov 2017
ADVISE: Symbolism and External Knowledge for Decoding Advertisements
Keren Ye
Adriana Kovashka
156
57
0
17 Nov 2017
Parallel Attention: A Unified Framework for Visual Object Discovery through Dialogs and Queries
Bohan Zhuang
Qi Wu
Chunhua Shen
Ian Reid
Anton Van Den Hengel
ObjD
151
142
0
17 Nov 2017
Image Captioning and Classification of Dangerous Situations
Octavio Arriaga
Paul G. Plöger
Matias Valdenegro-Toro
89
8
0
07 Nov 2017
BENCHIP: Benchmarking Intelligence Processors
Journal of Computer Science and Technology (JCST), 2017
Jinhua Tao
Zidong Du
Qi Guo
Huiying Lan
Lei Zhang
...
Allen Rush
Willian Chen
Shaoli Liu
Yunji Chen
Tianshi Chen
110
40
0
23 Oct 2017
Interactively Picking Real-World Objects with Unconstrained Spoken Language Instructions
Jun Hatori
Yuta Kikuchi
Sosuke Kobayashi
K. Takahashi
Yuta Tsuboi
Y. Unno
W. Ko
Jethro Tan
190
168
0
17 Oct 2017
iVQA: Inverse Visual Question Answering
Feng Liu
Tao Xiang
Timothy M. Hospedales
Wankou Yang
Changyin Sun
132
52
0
10 Oct 2017
What Does Explainable AI Really Mean? A New Conceptualization of Perspectives
Derek Doran
Sarah Schulz
Tarek R. Besold
XAI
149
468
0
02 Oct 2017
Semantic Segmentation from Limited Training Data
Anton Milan
Trung T. Pham
B. V. Kumar
D. Morrison
Adam W. Tow
...
Christopher F. Lehnert
G. Lin
Ian Reid
Peter Corke
Jurgen Leitner
161
52
0
22 Sep 2017
Visual Question Generation as Dual Task of Visual Question Answering
Yikang Li
Nan Duan
Bolei Zhou
Xiao Chu
Wanli Ouyang
Xiaogang Wang
205
172
0
21 Sep 2017
Learning Functional Causal Models with Generative Neural Networks
Hugo Jair Escalante
Sergio Escalera
Xavier Baro
Isabelle M Guyon
Umut Güçlü
Marcel van Gerven
CML
BDL
362
110
0
15 Sep 2017
Joint Learning of Set Cardinality and State Distribution
S. Hamid Rezatofighi
Anton Milan
Javen Qinfeng Shi
A. Dick
Ian Reid
SSL
BDL
196
16
0
13 Sep 2017
Deep Binaries: Encoding Semantic-Rich Cues for Efficient Textual-Visual Cross Retrieval
IEEE International Conference on Computer Vision (ICCV), 2017
Yuming Shen
Li Liu
Ling Shao
Jingkuan Song
133
51
0
08 Aug 2017
Scene Graph Generation from Objects, Phrases and Region Captions
Yikang Li
Wanli Ouyang
Bolei Zhou
Kun Wang
Xiaogang Wang
215
526
0
31 Jul 2017
Weakly-supervised learning of visual relations
Julia Peyre
Ivan Laptev
Cordelia Schmid
Josef Sivic
160
197
0
29 Jul 2017
Deep Interactive Region Segmentation and Captioning
Ali Sharifi Boroujerdi
M. Khanian
M. Breuß
131
9
0
26 Jul 2017
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
Peter Anderson
Xiaodong He
Chris Buehler
Damien Teney
Mark Johnson
Stephen Gould
Lei Zhang
AIMat
538
4,508
0
25 Jul 2017
cvpaper.challenge in 2016: Futuristic Computer Vision through 1,600 Papers Survey
Hirokatsu Kataoka
Soma Shirakabe
Yun He
S. Ueta
Teppei Suzuki
...
Ryousuke Takasawa
Masataka Fuchida
Yudai Miyashita
Kazushige Okayasu
Yuta Matsuzaki
213
1
0
20 Jul 2017
Video Question Answering via Attribute-Augmented Attention Network Learning
Yunan Ye
Zhou Zhao
Yimeng Li
Long Chen
Jun Xiao
Yueting Zhuang
124
114
0
20 Jul 2017
Grounding Spatio-Semantic Referring Expressions for Human-Robot Interaction
Mohit Shridhar
David Hsu
ObjD
149
21
0
18 Jul 2017
MDNet: A Semantically and Visually Interpretable Medical Image Diagnosis Network
Zizhao Zhang
Yuanpu Xie
Fuyong Xing
M. McGough
Ling Yang
MedIm
167
324
0
08 Jul 2017
Visually Grounded Word Embeddings and Richer Visual Features for Improving Multimodal Neural Machine Translation
Jean-Benoit Delbrouck
Stéphane Dupont
Omar Seddati
279
9
0
04 Jul 2017
Pedestrian Alignment Network for Large-scale Person Re-identification
Zhedong Zheng
Liang Zheng
Yi Yang
186
486
0
03 Jul 2017
Where to Play: Retrieval of Video Segments using Natural-Language Queries
Sangkuk Lee
Daesik Kim
Myunggi Lee
Jihye Hwang
Nojun Kwak
133
3
0
02 Jul 2017
Paying More Attention to Saliency: Image Captioning with Saliency and Context Attention
Marcella Cornia
Lorenzo Baraldi
Giuseppe Serra
Rita Cucchiara
156
90
0
26 Jun 2017
Using Artificial Tokens to Control Languages for Multilingual Image Caption Generation
Satoshi Tsutsui
David J. Crandall
171
21
0
20 Jun 2017
An Entropy-based Pruning Method for CNN Compression
Jian-Hao Luo
Jianxin Wu
87
195
0
19 Jun 2017
Who Will Share My Image? Predicting the Content Diffusion Path in Online Social Networks
Wenjian Hu
Krishna Kumar Singh
Fanyi Xiao
Jinyoung Han
Chen-Nee Chuah
Yong Jae Lee
GNN
DiffM
189
1
0
25 May 2017
Deep image representations using caption generators
Konda Reddy Mopuri
Vishal B. Athreya
R. Venkatesh Babu
VLM
SSL
113
1
0
25 May 2017
ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases
Xiaosong Wang
Yifan Peng
Le Lu
Zhiyong Lu
M. Bagheri
Ronald M. Summers
LM&MA
689
2,988
0
05 May 2017
Weakly-supervised Visual Grounding of Phrases with Linguistic Structures
Fanyi Xiao
Leonid Sigal
Yong Jae Lee
161
143
0
03 May 2017
Dense-Captioning Events in Videos
Ranjay Krishna
Kenji Hata
F. Ren
Li Fei-Fei
Juan Carlos Niebles
384
1,430
0
02 May 2017
AMTnet: Action-Micro-Tube Regression by End-to-end Trainable Deep Architecture
Suman Saha
Gurkirt Singh
Fabio Cuzzolin
191
72
0
17 Apr 2017
Video Fill In the Blank using LR/RL LSTMs with Spatial-Temporal Attentions
Amir Mazaheri
Dong Zhang
M. Shah
91
12
0
15 Apr 2017