1511.07571
DenseCap: Fully Convolutional Localization Networks for Dense Captioning
24 November 2015
Justin Johnson
A. Karpathy
Li Fei-Fei
VLM
Papers citing
"DenseCap: Fully Convolutional Localization Networks for Dense Captioning"
50 / 468 papers shown
Visual Goal-Step Inference using wikiHow
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Yue Yang
Artemis Panagopoulou
Qing Lyu
Li Zhang
Mark Yatskar
Chris Callison-Burch
233
50
0
12 Apr 2021
Multimodal Entity Linking for Tweets
European Conference on Information Retrieval (ECIR), 2020
Omar Adjali
Romaric Besançon
Olivier Ferret
Hervé Le Borgne
Brigitte Grau
144
56
0
07 Apr 2021
FixMyPose: Pose Correctional Captioning and Retrieval
AAAI Conference on Artificial Intelligence (AAAI), 2021
Hyounghun Kim
Abhaysinh Zala
Graham Burri
Mohit Bansal
142
20
0
04 Apr 2021
Say It All: Feedback for Improving Non-Visual Presentation Accessibility
International Conference on Human Factors in Computing Systems (CHI), 2021
Yi-Hao Peng
JiWoong Jang
Jeffrey P. Bigham
Amy Pavel
122
49
0
26 Mar 2021
3M: Multi-style image caption generation using Multi-modality features under Multi-UPDOWN model
The Florida AI Research Society (FLAIRS), 2021
Chengxi Li
Brent Harrison
177
6
0
20 Mar 2021
Knowledge driven Description Synthesis for Floor Plan Interpretation
International Journal on Document Analysis and Recognition (IJDAR), 2021
Shreya Goyal
Chiranjoy Chattopadhyay
Gaurav Bhatnagar
3DV
90
15
0
15 Mar 2021
Perspectives and Prospects on Transformer Architecture for Cross-Modal Tasks with Language and Vision
International Journal of Computer Vision (IJCV), 2021
Andrew Shin
Masato Ishii
T. Narihira
209
48
0
06 Mar 2021
Characterization and recognition of handwritten digits using Julia
Md Asifuzzaman Jishan
M. Alam
A. Islam
I. R. Mazumder
K. Mahmud
A. K. Azad
82
0
0
24 Feb 2021
VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning
Computer Vision and Pattern Recognition (CVPR), 2021
Jun Chen
Han Guo
Kai Yi
Boyang Albert Li
Mohamed Elhoseiny
VLM
384
272
0
20 Feb 2021
Composing Pick-and-Place Tasks By Grounding Language
International Symposium on Experimental Robotics (ISER), 2021
Oier Mees
Wolfram Burgard
LM&Ro
134
37
0
16 Feb 2021
Improved Bengali Image Captioning via deep convolutional neural network based encoder-decoder model
Mohammad Faiyaz Khan
S. M. S. Shifath
Md. Saiful Islam
VLM
110
23
0
14 Feb 2021
Unifying Vision-and-Language Tasks via Text Generation
International Conference on Machine Learning (ICML), 2021
Jaemin Cho
Jie Lei
Hao Tan
Mohit Bansal
MLLM
573
604
0
04 Feb 2021
TorchPRISM: Principal Image Sections Mapping, a novel method for Convolutional Neural Network features visualization
Tomasz Szandała
63
1
0
27 Jan 2021
CityFlow-NL: Tracking and Retrieval of Vehicles at City Scale by Natural Language Descriptions
Qi Feng
Vitaly Ablavsky
Stan Sclaroff
149
50
0
12 Jan 2021
Language-Mediated, Object-Centric Representation Learning
Findings (Findings), 2020
Ruocheng Wang
Jiayuan Mao
S. Gershman
Jiajun Wu
263
13
0
31 Dec 2020
Tensor Composition Net for Visual Relationship Prediction
British Machine Vision Conference (BMVC), 2020
Yuting Qiang
Yongxin Yang
Xueting Zhang
Yanwen Guo
Timothy M. Hospedales
ViT
CoGe
192
2
0
10 Dec 2020
Understanding Guided Image Captioning Performance across Domains
Conference on Computational Natural Language Learning (CoNLL), 2020
Edwin G. Ng
Bo Pang
P. Sharma
Radu Soricut
357
28
0
04 Dec 2020
Scan2Cap: Context-aware Dense Captioning in RGB-D Scans
Computer Vision and Pattern Recognition (CVPR), 2020
Dave Zhenyu Chen
A. Gholami
Matthias Nießner
Angel X. Chang
3DPC
286
226
0
03 Dec 2020
SuperOCR: A Conversion from Optical Character Recognition to Image Captioning
Baohua Sun
Michael Lin
Hao Sha
Lin Yang
111
5
0
21 Nov 2020
Watch and Learn: Mapping Language and Noisy Real-world Videos with Self-supervision
Yujie Zhong
Linhai Xie
Sen Wang
Lucia Specia
Yishu Miao
SSL
102
0
0
19 Nov 2020
iPerceive: Applying Common-Sense Reasoning to Multi-Modal Dense Video Captioning and Video Question Answering
Vasu Sharma
Gurneet Arora
Navpreet Kaloty
176
39
0
16 Nov 2020
MAGNeto: An Efficient Deep Learning Method for the Extractive Tags Summarization Problem
H. Phung
A. Vu
Tung D. Nguyen
Lam Thanh Do
Giang Nam Ngo
Trung Thanh Tran
ViT
112
0
0
09 Nov 2020
Diverse Image Captioning with Context-Object Split Latent Spaces
Neural Information Processing Systems (NeurIPS), 2020
Shweta Mahajan
Stefan Roth
170
46
0
02 Nov 2020
Boost Image Captioning with Knowledge Reasoning
Machine-mediated learning (ML), 2020
Feicheng Huang
Zhiwen Wang
Haiyang Wei
Canlong Zhang
Huifang Ma
103
27
0
02 Nov 2020
TextMage: The Automated Bangla Caption Generator Based On Deep Learning
Abrar Hasin Kamal
Md Asifuzzaman Jishan
N. Mansoor
VLM
147
21
0
15 Oct 2020
Diagnosing and Preventing Instabilities in Recurrent Video Processing
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020
T. Tanay
Aivar Sootla
Matteo Maggioni
P. Dokania
Juil Sock
A. Leonardis
Greg Slabaugh
318
7
0
10 Oct 2020
Dense Relational Image Captioning via Multi-task Triple-Stream Networks
Dong-Jin Kim
Tae-Hyun Oh
Jinsoo Choi
In So Kweon
343
39
0
08 Oct 2020
ALFWorld: Aligning Text and Embodied Environments for Interactive Learning
Mohit Shridhar
Xingdi Yuan
Marc-Alexandre Côté
Yonatan Bisk
Adam Trischler
Matthew J. Hausknecht
LM&Ro
LLMAG
403
618
0
08 Oct 2020
Rescribe: Authoring and Automatically Editing Audio Descriptions
Amy Pavel
G. Reyes
Jeffrey P. Bigham
110
79
0
07 Oct 2020
Spatial Attention as an Interface for Image Captioning Models
P. Sadler
127
0
0
29 Sep 2020
VIVO: Visual Vocabulary Pre-Training for Novel Object Captioning
Xiaowei Hu
Xi Yin
Kevin Qinghong Lin
Lijuan Wang
Guang Dai
Jianfeng Gao
Zicheng Liu
VLM
203
58
0
28 Sep 2020
Towards Unique and Informative Captioning of Images
European Conference on Computer Vision (ECCV), 2020
Zeyu Wang
Berthy Feng
Karthik Narasimhan
Olga Russakovsky
153
38
0
08 Sep 2020
CoNCRA: A Convolutional Neural Network Code Retrieval Approach
Brazilian Symposium on Software Engineering (SBES), 2020
Marcelo de Rezende Martins
M. Gerosa
141
13
0
03 Sep 2020
Cross-modal Knowledge Reasoning for Knowledge-based Visual Question Answering
Pattern Recognition (Pattern Recognit.), 2020
Jing Yu
Zihao Zhu
Yujing Wang
Weifeng Zhang
Yue Hu
Jianlong Tan
169
111
0
31 Aug 2020
Decoupled Variational Embedding for Signed Directed Networks
ACM Transactions on the Web (TWEB), 2020
Xu Chen
Jiangchao Yao
Maosen Li
Ya Zhang
Yanfeng Wang
135
5
0
28 Aug 2020
Matching Guided Distillation
Kaiyu Yue
Jiangfan Deng
Feng Zhou
183
58
0
23 Aug 2020
Weakly supervised cross-domain alignment with optimal transport
Siyang Yuan
Ke Bai
Liqun Chen
Yizhe Zhang
Chenyang Tao
Chunyuan Li
Guoyin Wang
Ricardo Henao
Lawrence Carin
OT
142
7
0
14 Aug 2020
KBGN: Knowledge-Bridge Graph Network for Adaptive Vision-Text Reasoning in Visual Dialogue
ACM Multimedia (ACM MM), 2020
X. Jiang
Siyi Du
Zengchang Qin
Yajing Sun
Jing Yu
244
39
0
11 Aug 2020
Textual Description for Mathematical Equations
IEEE International Conference on Document Analysis and Recognition (ICDAR), 2019
Ajoy Mondal
C. V. Jawahar
167
2
0
07 Aug 2020
Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards
European Conference on Computer Vision (ECCV), 2020
Xuewen Yang
Heming Zhang
Di Jin
Yingru Liu
Chi-Hao Wu
Jianchao Tan
Dongliang Xie
Jue Wang
Xin Wang
206
83
0
06 Aug 2020
Eigen-CAM: Class Activation Map using Principal Components
IEEE International Joint Conference on Neural Network (IJCNN), 2020
Mohammed Bany Muhammad
M. Yeasin
221
464
0
01 Aug 2020
Weakly supervised one-stage vision and language disease detection using large scale pneumonia and pneumothorax studies
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2020
Leo K. Tam
Xiaosong Wang
E. Turkbey
Kevin Lu
Yuhong Wen
Daguang Xu
191
15
0
31 Jul 2020
Comprehensive Image Captioning via Scene Graph Decomposition
European Conference on Computer Vision (ECCV), 2020
Yiwu Zhong
Liwei Wang
Jianshu Chen
Dong Yu
Yin Li
223
136
0
23 Jul 2020
Diverse and Styled Image Captioning Using SVD-Based Mixture of Recurrent Experts
Marzi Heidari
M. Ghatee
A. Nickabadi
Arash Pourhasan Nezhad
DiffM
MoE
124
1
0
07 Jul 2020
Mucko: Multi-Layer Cross-Modal Knowledge Reasoning for Fact-based Visual Question Answering
Zihao Zhu
Jing Yu
Yujing Wang
Yajing Sun
Yue Hu
Qi Wu
187
146
0
16 Jun 2020
iSeeBetter: Spatio-temporal video super-resolution using recurrent generative back-projection networks
Computational Visual Media (CVM), 2020
Vasu Sharma
John Britto
M. Mani Roja
SupR
207
26
0
13 Jun 2020
MetricUNet: Synergistic Image- and Voxel-Level Learning for Precise CT Prostate Segmentation via Online Sampling
Kelei He
C. Lian
Ehsan Adeli
Jing Huo
Yang Gao
Bing-Bin Zhang
Junfeng Zhang
Dinggang Shen
204
1
0
15 May 2020
Dense-Caption Matching and Frame-Selection Gating for Temporal Localization in VideoQA
Hyounghun Kim
Zineng Tang
Mohit Bansal
121
31
0
13 May 2020
Towards Embodied Scene Description
Sinan Tan
Huaping Liu
Di Guo
Xinyu Zhang
F. Sun
LM&Ro
118
10
0
30 Apr 2020
Show, Describe and Conclude: On Exploiting the Structure Information of Chest X-Ray Reports
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Baoyu Jing
Zeya Wang
Eric Xing
247
166
0
26 Apr 2020