Aligning where to see and what to tell: image caption with region-based attention and scene factorization

20 June 2015
Junqi Jin
Kun Fu
Runpeng Cui
Fei Sha
Changshui Zhang
arXiv: 1506.06272

Papers citing "Aligning where to see and what to tell: image caption with region-based attention and scene factorization"

35 / 35 papers shown
Title
Embodied Active Defense: Leveraging Recurrent Feedback to Counter Adversarial Patches
Lingxuan Wu
Xiao Yang
Yinpeng Dong
Liuwei Xie
Hang Su
Jun Zhu
AAML
35
2
0
31 Mar 2024
From Image to Language: A Critical Analysis of Visual Question Answering (VQA) Approaches, Challenges, and Opportunities
Md Farhan Ishmam
Md Sakib Hossain Shovon
M. F. Mridha
Nilanjan Dey
35
36
0
01 Nov 2023
Multi-modal reward for visual relationships-based image captioning
Ali Abedi
Hossein Karshenas
Peyman Adibi
22
2
0
19 Mar 2023
Unifying Relational Sentence Generation and Retrieval for Medical Image Report Composition
Fuyu Wang
Xiaodan Liang
Lin Xu
Liang Lin
MedIm
24
25
0
09 Jan 2021
Image Captioning with Compositional Neural Module Networks
Junjiao Tian
Jean Oh
9
11
0
10 Jul 2020
Gaussian Smoothen Semantic Features (GSSF) -- Exploring the Linguistic Aspects of Visual Captioning in Indian Languages (Bengali) Using MSCOCO Framework
C. Sur
6
7
0
16 Feb 2020
MRRC: Multiple Role Representation Crossover Interpretation for Image Captioning With R-CNN Feature Distribution Composition (FDC)
C. Sur
17
16
0
15 Feb 2020
aiTPR: Attribute Interaction-Tensor Product Representation for Image Caption
C. Sur
10
8
0
27 Jan 2020
CRUR: Coupled-Recurrent Unit for Unification, Conceptualization and Context Capture for Language Representation -- A Generalization of Bi Directional LSTM
C. Sur
BDL
7
6
0
22 Nov 2019
Aesthetic Image Captioning From Weakly-Labelled Photographs
Koustav Ghosal
A. Rana
A. Smolic
17
25
0
29 Aug 2019
Image Captioning using Facial Expression and Attention
Omid Mohamad Nezami
Mark Dras
Stephen Wan
Cécile Paris
CVBM
17
8
0
08 Aug 2019
Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods
Aditya Mogadala
M. Kalimuthu
Dietrich Klakow
VLM
15
132
0
22 Jul 2019
Image Captioning with Integrated Bottom-Up and Multi-level Residual Top-Down Attention for Game Scene Understanding
Jian Zheng
S. Krishnamurthy
Ruxin Chen
Min-Hung Chen
Zhenhao Ge
Xiaohua Li
30
4
0
16 Jun 2019
Variational Hetero-Encoder Randomized GANs for Joint Image-Text Modeling
Hao Zhang
Bo Chen
Long Tian
Zhengjue Wang
Mingyuan Zhou
DRL
14
6
0
18 May 2019
VrR-VG: Refocusing Visually-Relevant Relationships
Yuanzhi Liang
Yalong Bai
Wei Zhang
Xueming Qian
Li Zhu
Tao Mei
3DH
14
8
0
01 Feb 2019
A Comprehensive Survey of Deep Learning for Image Captioning
Md. Zakir Hossain
Ferdous Sohel
M. Shiratuddin
Hamid Laga
VLM
3DV
11
758
0
06 Oct 2018
A Survey of the Usages of Deep Learning in Natural Language Processing
Dan Otter
Julian R. Medina
Jugal Kalita
VLM
17
11
0
27 Jul 2018
Agile Amulet: Real-Time Salient Object Detection with Contextual Attention
Pingping Zhang
Luyao Wang
D. Wang
Huchuan Lu
Chunhua Shen
ObjD
21
21
0
20 Feb 2018
Describing Natural Images Containing Novel Objects with Knowledge Guided Assitance
Aditya Mogadala
Umanga Bista
Lexing Xie
Achim Rettinger
20
7
0
17 Oct 2017
Hierarchical Multi-scale Attention Networks for Action Recognition
Shiyang Yan
Jeremy S. Smith
Wenjin Lu
Bailing Zhang
16
37
0
25 Aug 2017
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
Peter Anderson
Xiaodong He
Chris Buehler
Damien Teney
Mark Johnson
Stephen Gould
Lei Zhang
AIMat
27
4,177
0
25 Jul 2017
Image Captioning with Object Detection and Localization
Zhongliang Yang
Yujin Zhang
S. Rehman
Yongfeng Huang
ObjD
VLM
12
47
0
08 Jun 2017
Deep Reinforcement Learning-based Image Captioning with Embedding Reward
Zhou Ren
Xiaoyu Wang
Ning Zhang
Xutao Lv
Li-Jia Li
18
324
0
12 Apr 2017
AMC: Attention guided Multi-modal Correlation Learning for Image Search
Kan Chen
Trung Bui
Chen Fang
Zhaowen Wang
Ram Nevatia
27
38
0
03 Apr 2017
Areas of Attention for Image Captioning
M. Pedersoli
Thomas Lucas
Cordelia Schmid
Jakob Verbeek
25
205
0
03 Dec 2016
Attention-based Memory Selection Recurrent Network for Language Modeling
Da-Rong Liu
Shun-Po Chuang
Hung-yi Lee
RALM
KELM
27
5
0
26 Nov 2016
Semantic Compositional Networks for Visual Captioning
Zhe Gan
Chuang Gan
Xiaodong He
Yunchen Pu
Kenneth Tran
Jianfeng Gao
Lawrence Carin
Li Deng
CoGe
28
425
0
23 Nov 2016
Dense Captioning with Joint Inference and Visual Context
L. Yang
K. Tang
Jianchao Yang
Li-Jia Li
VLM
19
169
0
21 Nov 2016
A Semi-supervised Framework for Image Captioning
Wenhu Chen
Aurélien Lucchi
Thomas Hofmann
21
9
0
16 Nov 2016
Video Summarization with Long Short-term Memory
Ke Zhang
Wei-Lun Chao
Fei Sha
Kristen Grauman
19
682
0
26 May 2016
Image Captioning and Visual Question Answering Based on Attributes and External Knowledge
Qi Wu
Chunhua Shen
A. Hengel
Peng Wang
A. Dick
11
360
0
09 Mar 2016
Survey on the attention based RNN model and its applications in computer vision
Feng Wang
David Tax
AI4TS
AIMat
11
113
0
25 Jan 2016
ABC-CNN: An Attention Based Convolutional Neural Network for Visual Question Answering
Kan Chen
Jiang Wang
Liang-Chieh Chen
Haoyuan Gao
W. Xu
Ram Nevatia
14
286
0
18 Nov 2015
Grounding of Textual Phrases in Images by Reconstruction
Anna Rohrbach
Marcus Rohrbach
Ronghang Hu
Trevor Darrell
Bernt Schiele
9
493
0
12 Nov 2015
What value do explicit high level concepts have in vision to language problems?
Qi Wu
Chunhua Shen
Lingqiao Liu
A. Dick
A. Hengel
22
443
0
03 Jun 2015