Aligning where to see and what to tell: image caption with region-based attention and scene factorization

20 June 2015

Papers citing "Aligning where to see and what to tell: image caption with region-based attention and scene factorization"

35 / 35 papers shown

Embodied Active Defense: Leveraging Recurrent Feedback to Counter Adversarial Patches Lingxuan Wu Xiao Yang Yinpeng Dong Liuwei Xie Hang Su Jun Zhu AAML 35 2 0 31 Mar 2024
From Image to Language: A Critical Analysis of Visual Question Answering (VQA) Approaches, Challenges, and Opportunities Md Farhan Ishmam Md Sakib Hossain Shovon M. F. Mridha Nilanjan Dey 35 36 0 01 Nov 2023
Multi-modal reward for visual relationships-based image captioning Ali Abedi Hossein Karshenas Peyman Adibi 22 2 0 19 Mar 2023
Unifying Relational Sentence Generation and Retrieval for Medical Image Report Composition Fuyu Wang Xiaodan Liang Lin Xu Liang Lin MedIm 24 25 0 09 Jan 2021
Image Captioning with Compositional Neural Module Networks Junjiao Tian Jean Oh 9 11 0 10 Jul 2020
Gaussian Smoothen Semantic Features (GSSF) -- Exploring the Linguistic Aspects of Visual Captioning in Indian Languages (Bengali) Using MSCOCO Framework C. Sur 6 7 0 16 Feb 2020
MRRC: Multiple Role Representation Crossover Interpretation for Image Captioning With R-CNN Feature Distribution Composition (FDC) C. Sur 17 16 0 15 Feb 2020
aiTPR: Attribute Interaction-Tensor Product Representation for Image Caption C. Sur 10 8 0 27 Jan 2020
CRUR: Coupled-Recurrent Unit for Unification, Conceptualization and Context Capture for Language Representation -- A Generalization of Bi Directional LSTM C. Sur BDL 7 6 0 22 Nov 2019
Aesthetic Image Captioning From Weakly-Labelled Photographs Koustav Ghosal A. Rana A. Smolic 17 25 0 29 Aug 2019
Image Captioning using Facial Expression and Attention Omid Mohamad Nezami Mark Dras Stephen Wan Cécile Paris CVBM 17 8 0 08 Aug 2019
Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods Aditya Mogadala M. Kalimuthu Dietrich Klakow VLM 15 132 0 22 Jul 2019
Image Captioning with Integrated Bottom-Up and Multi-level Residual Top-Down Attention for Game Scene Understanding Jian Zheng S. Krishnamurthy Ruxin Chen Min-Hung Chen Zhenhao Ge Xiaohua Li 30 4 0 16 Jun 2019
Variational Hetero-Encoder Randomized GANs for Joint Image-Text Modeling Hao Zhang Bo Chen Long Tian Zhengjue Wang Mingyuan Zhou DRL 14 6 0 18 May 2019
VrR-VG: Refocusing Visually-Relevant Relationships Yuanzhi Liang Yalong Bai Wei Zhang Xueming Qian Li Zhu Tao Mei 3DH 14 8 0 01 Feb 2019
A Comprehensive Survey of Deep Learning for Image Captioning Md. Zakir Hossain Ferdous Sohel M. Shiratuddin Hamid Laga VLM 3DV 11 758 0 06 Oct 2018
A Survey of the Usages of Deep Learning in Natural Language Processing Dan Otter Julian R. Medina Jugal Kalita VLM 17 11 0 27 Jul 2018
Agile Amulet: Real-Time Salient Object Detection with Contextual Attention Pingping Zhang Luyao Wang D. Wang Huchuan Lu Chunhua Shen ObjD 21 21 0 20 Feb 2018
Describing Natural Images Containing Novel Objects with Knowledge Guided Assitance Aditya Mogadala Umanga Bista Lexing Xie Achim Rettinger 20 7 0 17 Oct 2017
Hierarchical Multi-scale Attention Networks for Action Recognition Shiyang Yan Jeremy S. Smith Wenjin Lu Bailing Zhang 16 37 0 25 Aug 2017
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering Peter Anderson Xiaodong He Chris Buehler Damien Teney Mark Johnson Stephen Gould Lei Zhang AIMat 27 4,177 0 25 Jul 2017
Image Captioning with Object Detection and Localization Zhongliang Yang Yujin Zhang S. Rehman Yongfeng Huang ObjD VLM 12 47 0 08 Jun 2017
Deep Reinforcement Learning-based Image Captioning with Embedding Reward Zhou Ren Xiaoyu Wang Ning Zhang Xutao Lv Li-Jia Li 18 324 0 12 Apr 2017
AMC: Attention guided Multi-modal Correlation Learning for Image Search Kan Chen Trung Bui Chen Fang Zhaowen Wang Ram Nevatia 27 38 0 03 Apr 2017
Areas of Attention for Image Captioning M. Pedersoli Thomas Lucas Cordelia Schmid Jakob Verbeek 25 205 0 03 Dec 2016
Attention-based Memory Selection Recurrent Network for Language Modeling Da-Rong Liu Shun-Po Chuang Hung-yi Lee RALM KELM 27 5 0 26 Nov 2016
Semantic Compositional Networks for Visual Captioning Zhe Gan Chuang Gan Xiaodong He Yunchen Pu Kenneth Tran Jianfeng Gao Lawrence Carin Li Deng CoGe 28 425 0 23 Nov 2016
Dense Captioning with Joint Inference and Visual Context L. Yang K. Tang Jianchao Yang Li-Jia Li VLM 19 169 0 21 Nov 2016
A Semi-supervised Framework for Image Captioning Wenhu Chen Aurélien Lucchi Thomas Hofmann 21 9 0 16 Nov 2016
Video Summarization with Long Short-term Memory Ke Zhang Wei-Lun Chao Fei Sha Kristen Grauman 19 682 0 26 May 2016
Image Captioning and Visual Question Answering Based on Attributes and External Knowledge Qi Wu Chunhua Shen A. Hengel Peng Wang A. Dick 11 360 0 09 Mar 2016
Survey on the attention based RNN model and its applications in computer vision Feng Wang David Tax AI4TS AIMat 11 113 0 25 Jan 2016
ABC-CNN: An Attention Based Convolutional Neural Network for Visual Question Answering Kan Chen Jiang Wang Liang-Chieh Chen Haoyuan Gao W. Xu Ram Nevatia 14 286 0 18 Nov 2015
Grounding of Textual Phrases in Images by Reconstruction Anna Rohrbach Marcus Rohrbach Ronghang Hu Trevor Darrell Bernt Schiele 9 493 0 12 Nov 2015
What value do explicit high level concepts have in vision to language problems? Qi Wu Chunhua Shen Lingqiao Liu A. Dick A. Hengel 22 443 0 03 Jun 2015