DenseCap: Fully Convolutional Localization Networks for Dense Captioning

24 November 2015

Li Fei-Fei

Papers citing "DenseCap: Fully Convolutional Localization Networks for Dense Captioning"

50 / 468 papers shown

FlipDial: A Generative Model for Two-Way Visual Dialogue

147

11 Feb 2018

Generating Triples with Adversarial Networks for Scene Graph Construction

James Fairbanks

Eric Heim

GAN GNN

133

07 Feb 2018

E2E-MLT - an Unconstrained End-to-End Method for Multi-Language Scene Text

M. Busta

Yash J. Patel

Jiří Matas

125

100

30 Jan 2018

Image denoising and restoration with CNN-LSTM Encoder Decoder with Direct Attention

16 Jan 2018

TieNet: Text-Image Embedding Network for Common Thorax Disease Classification and Reporting in Chest X-rays

Xiaosong Wang

192

520

12 Jan 2018

Visual Text Correction

Amir Mazaheri

M. Shah

137

06 Jan 2018

Object Referring in Videos with Language and Human Gaze

A. Vasudevan

Dengxin Dai

Luc Van Gool

VOS

199

04 Jan 2018

Exploring Models and Data for Remote Sensing Image Caption Generation

IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2017

Xiaoqiang Lu

Binqiang Wang

Xiangtao Zheng

Xuelong Li

210

626

21 Dec 2017

Learning to Act Properly: Predicting and Explaining Affordances from Images

Ching-Yao Chuang

Jiaman Li

Antonio Torralba

Sanja Fidler

191

120

20 Dec 2017

Attribute CNNs for Word Spotting in Handwritten Documents

International Journal on Document Analysis and Recognition (IJDAR), 2017

Sebastian Sudholt

G. Fink

127

20 Dec 2017

Beyond the Pixel-Wise Loss for Topology-Aware Delineation

173

249

06 Dec 2017

Sequence Mining and Pattern Analysis in Drilling Reports with Deep Natural Language Processing

05 Dec 2017

Examining Cooperation in Visual Dialog Models

123

04 Dec 2017

Discriminative Learning of Open-Vocabulary Object Retrieval and Localization by Negative Phrase Augmentation

Ryota Hinami

Shin'ichi Satoh

ObjD

123

27 Nov 2017

Conditional Image-Text Embedding Networks

357

124

22 Nov 2017

On the Automatic Generation of Medical Imaging Reports

292

608

22 Nov 2017

Asking the Difficult Questions: Goal-Oriented Visual Question Generation via Intermediate Rewards

Junjie Zhang

Qi Wu

Chunhua Shen

161

21 Nov 2017

ADVISE: Symbolism and External Knowledge for Decoding Advertisements

Keren Ye

Adriana Kovashka

166

17 Nov 2017

Parallel Attention: A Unified Framework for Visual Object Discovery through Dialogs and Queries

Bohan Zhuang

Qi Wu

Chunhua Shen

Ian Reid

Anton Van Den Hengel

ObjD

170

143

17 Nov 2017

Image Captioning and Classification of Dangerous Situations

Octavio Arriaga

Paul G. Plöger

Matias Valdenegro-Toro

07 Nov 2017

BENCHIP: Benchmarking Intelligence Processors

Journal of Computational Science and Technology (JCST), 2017

...

110

23 Oct 2017

Interactively Picking Real-World Objects with Unconstrained Spoken Language Instructions

214

168

17 Oct 2017

iVQA: Inverse Visual Question Answering

Feng Liu

Tao Xiang

Timothy M. Hospedales

Wankou Yang

Changyin Sun

150

10 Oct 2017

What Does Explainable AI Really Mean? A New Conceptualization of Perspectives

158

471

02 Oct 2017

Semantic Segmentation from Limited Training Data

...

Christopher F. Lehnert

177

22 Sep 2017

Visual Question Generation as Dual Task of Visual Question Answering

Wanli Ouyang

229

172

21 Sep 2017

Learning Functional Causal Models with Generative Neural Networks

392

111

15 Sep 2017

Joint Learning of Set Cardinality and State Distribution

223

13 Sep 2017

Deep Binaries: Encoding Semantic-Rich Cues for Efficient Textual-Visual Cross Retrieval

IEEE International Conference on Computer Vision (ICCV), 2017

Yuming Shen

Li Liu

Ling Shao

Jingkuan Song

145

08 Aug 2017

Scene Graph Generation from Objects, Phrases and Region Captions

Wanli Ouyang

259

530

31 Jul 2017

Weakly-supervised learning of visual relations

189

198

29 Jul 2017

Deep Interactive Region Segmentation and Captioning

Ali Sharifi Boroujerdi

M. Khanian

M. Breuß

136

26 Jul 2017

Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering

Lei Zhang

570

4,536

25 Jul 2017

cvpaper.challenge in 2016: Futuristic Computer Vision through 1,600 Papers Survey

...

239

20 Jul 2017

Video Question Answering via Attribute-Augmented Attention Network Learning

Zhou Zhao

128

115

20 Jul 2017

Grounding Spatio-Semantic Referring Expressions for Human-Robot Interaction

Mohit Shridhar

David Hsu

ObjD

171

18 Jul 2017

MDNet: A Semantically and Visually Interpretable Medical Image Diagnosis Network

175

326

08 Jul 2017

Visually Grounded Word Embeddings and Richer Visual Features for Improving Multimodal Neural Machine Translation

Jean-Benoit Delbrouck

Stéphane Dupont

Omar Seddati

284

04 Jul 2017

Pedestrian Alignment Network for Large-scale Person Re-identification

Zhedong Zheng

Liang Zheng

Yi Yang

195

487

03 Jul 2017

Where to Play: Retrieval of Video Segments using Natural-Language Queries

138

02 Jul 2017

Paying More Attention to Saliency: Image Captioning with Saliency and Context Attention

Marcella Cornia

Lorenzo Baraldi

Giuseppe Serra

Rita Cucchiara

173

26 Jun 2017

Using Artificial Tokens to Control Languages for Multilingual Image Caption Generation

Satoshi Tsutsui

David J. Crandall

179

20 Jun 2017

An Entropy-based Pruning Method for CNN Compression

Jian-Hao Luo

Jianxin Wu

197

19 Jun 2017

Who Will Share My Image? Predicting the Content Diffusion Path in Online Social Networks

197

25 May 2017

Deep image representations using caption generators

113

25 May 2017

ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases

Xiaosong Wang

788

3,046

05 May 2017

Weakly-supervised Visual Grounding of Phrases with Linguistic Structures

Fanyi Xiao

Leonid Sigal

Yong Jae Lee

169

143

03 May 2017

Dense-Captioning Events in Videos

Li Fei-Fei

400

1,441

02 May 2017

AMTnet: Action-Micro-Tube Regression by End-to-end Trainable Deep Architecture

Suman Saha

Gurkirt Singh

Fabio Cuzzolin

198

17 Apr 2017

Video Fill In the Blank using LR/RL LSTMs with Spatial-Temporal Attentions

Amir Mazaheri

Dong Zhang

M. Shah

15 Apr 2017