ResearchTrend.AI
DenseCap: Fully Convolutional Localization Networks for Dense Captioning

24 November 2015
Justin Johnson
A. Karpathy
Li Fei-Fei
    VLM
arXiv: 1511.07571

Papers citing "DenseCap: Fully Convolutional Localization Networks for Dense Captioning"

50 / 452 papers shown
Title
Magnifying Networks for Images with Billions of Pixels
Neofytos Dimitriou
Ognjen Arandjelovic
16
2
0
12 Dec 2021
D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding
Dave Zhenyu Chen
Qirui Wu
Matthias Nießner
Angel X. Chang
19
29
0
02 Dec 2021
Object-Centric Unsupervised Image Captioning
Zihang Meng
David Yang
Xuefei Cao
Ashish Shah
Ser-Nam Lim
OCL
VLM
19
11
0
02 Dec 2021
ContIG: Self-supervised Multimodal Contrastive Learning for Medical Imaging with Genetics
Aiham Taleb
Matthias Kirchler
Remo Monti
C. Lippert
SSL
MedIm
28
54
0
26 Nov 2021
Talk-to-Resolve: Combining scene understanding and spatial dialogue to resolve granular task ambiguity for a collocated robot
Pradip Pramanick
Chayan Sarkar
Snehasis Banerjee
Brojeshwar Bhowmick
11
14
0
22 Nov 2021
ClevrTex: A Texture-Rich Benchmark for Unsupervised Multi-Object Segmentation
Laurynas Karazija
Iro Laina
Christian Rupprecht
3DV
VOS
27
83
0
19 Nov 2021
Single-Modal Entropy based Active Learning for Visual Question Answering
Dong-Jin Kim
Jae-Won Cho
Jinsoo Choi
Yunjae Jung
In So Kweon
25
12
0
21 Oct 2021
Integrating Visuospatial, Linguistic and Commonsense Structure into Story Visualization
A. Maharana
Mohit Bansal
14
57
0
21 Oct 2021
A Self-Explainable Stylish Image Captioning Framework via Multi-References
Chengxi Li
Brent Harrison
14
0
0
20 Oct 2021
AUTO-DISCERN: Autonomous Driving Using Common Sense Reasoning
Suraj Kothawade
Vinaya Khandelwal
Kinjal Basu
Huaduo Wang
Gopal Gupta
LRM
9
22
0
17 Oct 2021
Topic Scene Graph Generation by Attention Distillation from Caption
Wenbin Wang
R. Wang
X. Chen
DiffM
17
14
0
12 Oct 2021
Geometry-Entangled Visual Semantic Transformer for Image Captioning
Ling Cheng
Wei Wei
Feida Zhu
Yong-jin Liu
C. Miao
ViT
16
3
0
29 Sep 2021
CIDEr-R: Robust Consensus-based Image Description Evaluation
G. O. D. Santos
Esther Luna Colombini
Sandra Avila
40
30
0
28 Sep 2021
Survey: Transformer based Video-Language Pre-training
Ludan Ruan
Qin Jin
VLM
ViT
64
44
0
21 Sep 2021
Image Captioning for Effective Use of Language Models in Knowledge-Based Visual Question Answering
Ander Salaberria
Gorka Azkune
Oier López de Lacalle
Aitor Soroa Etxabe
Eneko Agirre
22
59
0
15 Sep 2021
RefineCap: Concept-Aware Refinement for Image Captioning
Yekun Chai
Shuo Jin
Junliang Xing
VLM
8
0
0
08 Sep 2021
Journalistic Guidelines Aware News Image Captioning
Xuewen Yang
Svebor Karaman
Joel R. Tetreault
Alex Jaimes
12
27
0
07 Sep 2021
Improving Object Detection and Attribute Recognition by Feature Entanglement Reduction
Zhao-Heng Zheng
Arka Sadhu
Ramkant Nevatia
11
2
0
25 Aug 2021
INVIGORATE: Interactive Visual Grounding and Grasping in Clutter
Hanbo Zhang
Yunfan Lu
Cunjun Yu
David Hsu
Xuguang Lan
Nanning Zheng
LM&Ro
18
63
0
25 Aug 2021
Caption Generation on Scenes with Seen and Unseen Object Categories
B. Demirel
R. G. Cinbis
VLM
15
1
0
13 Aug 2021
Neural Twins Talk & Alternative Calculations
Zanyar Zohourianshahzadi
Jugal Kalita
17
0
0
05 Aug 2021
Dual Graph Convolutional Networks with Transformer and Curriculum Learning for Image Captioning
Xinzhi Dong
Chengjiang Long
Wenju Xu
Chunxia Xiao
ViT
69
66
0
05 Aug 2021
ReFormer: The Relational Transformer for Image Captioning
Xuewen Yang
Yingru Liu
Xin Wang
ViT
12
54
0
29 Jul 2021
From Show to Tell: A Survey on Deep Learning-based Image Captioning
Matteo Stefanini
Marcella Cornia
Lorenzo Baraldi
S. Cascianelli
G. Fiameni
Rita Cucchiara
3DV
VLM
MLLM
53
254
0
14 Jul 2021
Leveraging Explainability for Comprehending Referring Expressions in the Real World
Fethiye Irmak Dogan
G. I. Melsión
Iolanda Leite
37
8
0
12 Jul 2021
Controlled Caption Generation for Images Through Adversarial Attacks
Nayyer Aafaq
Naveed Akhtar
Wei Liu
M. Shah
Ajmal Saeed Mian
AAML
28
9
0
07 Jul 2021
Morphological Classification of Galaxies in S-PLUS using an Ensemble of Convolutional Networks
N. M. Cardoso
G. B. O. Schwarz
L. O. Dias
C. R. Bom
L. Sodré
C. Mendes de Oliveira
17
0
0
05 Jul 2021
Pre-Trained Models: Past, Present and Future
Xu Han
Zhengyan Zhang
Ning Ding
Yuxian Gu
Xiao Liu
...
Jie Tang
Ji-Rong Wen
Jinhui Yuan
Wayne Xin Zhao
Jun Zhu
AIFin
MQ
AI4MH
27
811
0
14 Jun 2021
Check It Again: Progressive Visual Question Answering via Visual Entailment
Q. Si
Zheng Lin
Mingyu Zheng
Peng Fu
Weiping Wang
17
48
0
08 Jun 2021
Giving Commands to a Self-Driving Car: How to Deal with Uncertain Situations?
Thierry Deruyttere
Victor Milewski
Marie-Francine Moens
28
15
0
08 Jun 2021
An End-to-End Breast Tumour Classification Model Using Context-Based Patch Modelling- A BiLSTM Approach for Image Classification
S. Tripathi
S. Singh
H. Lee
8
43
0
05 Jun 2021
Connecting What to Say With Where to Look by Modeling Human Attention Traces
Zihang Meng
Licheng Yu
Ning Zhang
Tamara L. Berg
Babak Damavandi
Vikas Singh
Amy Bearman
26
25
0
12 May 2021
Analyzing Online Political Advertisements
Danae Sánchez Villegas
S. Mokaram
Nikolaos Aletras
8
11
0
09 May 2021
Towards Accurate Text-based Image Captioning with Content Diversity Exploration
Guanghui Xu
Shuaicheng Niu
Mingkui Tan
Yucheng Luo
Qing Du
Qi Wu
DiffM
17
56
0
23 Apr 2021
Visual Goal-Step Inference using wikiHow
Yue Yang
Artemis Panagopoulou
Qing Lyu
Li Zhang
Mark Yatskar
Chris Callison-Burch
29
41
0
12 Apr 2021
Multimodal Entity Linking for Tweets
Omar Adjali
Romaric Besançon
Olivier Ferret
Hervé Le Borgne
Brigitte Grau
11
48
0
07 Apr 2021
FixMyPose: Pose Correctional Captioning and Retrieval
Hyounghun Kim
Abhaysinh Zala
Graham Burri
Mohit Bansal
22
16
0
04 Apr 2021
Say It All: Feedback for Improving Non-Visual Presentation Accessibility
Yi-Hao Peng
JiWoong Jang
Jeffrey P. Bigham
Amy Pavel
11
33
0
26 Mar 2021
3M: Multi-style image caption generation using Multi-modality features under Multi-UPDOWN model
Chengxi Li
Brent Harrison
20
6
0
20 Mar 2021
Knowledge driven Description Synthesis for Floor Plan Interpretation
Shreya Goyal
Chiranjoy Chattopadhyay
Gaurav Bhatnagar
3DV
23
12
0
15 Mar 2021
Perspectives and Prospects on Transformer Architecture for Cross-Modal Tasks with Language and Vision
Andrew Shin
Masato Ishii
T. Narihira
33
36
0
06 Mar 2021
Characterization and recognition of handwritten digits using Julia
Md Asifuzzaman Jishan
M. Alam
A. Islam
I. R. Mazumder
K. Mahmud
A. K. Azad
17
0
0
24 Feb 2021
VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning
Jun Chen
Han Guo
Kai Yi
Boyang Albert Li
Mohamed Elhoseiny
VLM
29
218
0
20 Feb 2021
Composing Pick-and-Place Tasks By Grounding Language
Oier Mees
Wolfram Burgard
LM&Ro
11
37
0
16 Feb 2021
Improved Bengali Image Captioning via deep convolutional neural network based encoder-decoder model
Mohammad Faiyaz Khan
S. M. S. Shifath
Md. Saiful Islam
VLM
11
18
0
14 Feb 2021
Unifying Vision-and-Language Tasks via Text Generation
Jaemin Cho
Jie Lei
Hao Tan
Mohit Bansal
MLLM
249
525
0
04 Feb 2021
TorchPRISM: Principal Image Sections Mapping, a novel method for Convolutional Neural Network features visualization
Tomasz Szandała
13
1
0
27 Jan 2021
CityFlow-NL: Tracking and Retrieval of Vehicles at City Scale by Natural Language Descriptions
Qi Feng
Vitaly Ablavsky
Stan Sclaroff
14
45
0
12 Jan 2021
Language-Mediated, Object-Centric Representation Learning
Ruocheng Wang
Jiayuan Mao
S. Gershman
Jiajun Wu
8
12
0
31 Dec 2020
Tensor Composition Net for Visual Relationship Prediction
Yuting Qiang
Yongxin Yang
Xueting Zhang
Yanwen Guo
Timothy M. Hospedales
ViT
CoGe
14
2
0
10 Dec 2020