A Comprehensive Survey of Deep Learning for Image Captioning

6 October 2018
Md. Zakir Hossain
Ferdous Sohel
M. Shiratuddin
Hamid Laga
    VLM
    3DV
arXiv: 1810.04020

Papers citing "A Comprehensive Survey of Deep Learning for Image Captioning"

50 / 228 papers shown
Do DALL-E and Flamingo Understand Each Other?
Hang Li
Jindong Gu
Rajat Koner
Sahand Sharifzadeh
Volker Tresp
MLLM
16
12
0
23 Dec 2022
Towards Generating Diverse Audio Captions via Adversarial Training
Xinhao Mei
Xubo Liu
Jianyuan Sun
Mark D. Plumbley
Wenwu Wang
DiffM
33
2
0
05 Dec 2022
PLA: Language-Driven Open-Vocabulary 3D Scene Understanding
Runyu Ding
Jihan Yang
Chuhui Xue
Wenqing Zhang
Song Bai
Xiaojuan Qi
VLM
15
146
0
29 Nov 2022
Deep representation learning: Fundamentals, Perspectives, Applications, and Open Challenges
K. T. Baghaei
Amirreza Payandeh
Pooya Fayyazsanavi
Shahram Rahimi
Zhiqian Chen
Somayeh Bakhtiari Ramezani
FaML
AI4TS
30
6
0
27 Nov 2022
Aesthetically Relevant Image Captioning
Zhipeng Zhong
Fei Zhou
Guoping Qiu
31
9
0
25 Nov 2022
Feedback is Needed for Retakes: An Explainable Poor Image Notification Framework for the Visually Impaired
Kazuya Ohata
Shunsuke Kitada
Hitoshi Iyatomi
14
0
0
17 Nov 2022
CapEnrich: Enriching Caption Semantics for Web Images via Cross-modal Pre-trained Knowledge
Linli Yao
Wei-Neng Chen
Qin Jin
VLM
22
10
0
17 Nov 2022
Vis2Mus: Exploring Multimodal Representation Mapping for Controllable Music Generation
Runbang Zhang
Yixiao Zhang
Kai Shao
Ying Shan
Gus Xia
21
4
0
10 Nov 2022
CLSE: Corpus of Linguistically Significant Entities
A. Chuklin
Justin Zhao
Mihir Kale
13
1
0
04 Nov 2022
Physical Adversarial Attack meets Computer Vision: A Decade Survey
Hui Wei
Hao Tang
Xuemei Jia
Zhixiang Wang
Han-Bing Yu
Zhubo Li
Shin'ichi Satoh
Luc Van Gool
Zheng Wang
AAML
27
43
0
30 Sep 2022
M^4I: Multi-modal Models Membership Inference
Pingyi Hu
Zihan Wang
Ruoxi Sun
Hu Wang
Minhui Xue
37
26
0
15 Sep 2022
Cross Modal Compression: Towards Human-comprehensible Semantic Compression
Jiguo Li
Chuanmin Jia
Xinfeng Zhang
Siwei Ma
Wen Gao
9
18
0
06 Sep 2022
Facial Expression Recognition and Image Description Generation in Vietnamese
Khang Nhut Lam
Kim Thi-Thanh Nguyen
Loc Huu Nguy
Jugal Kalita
3DH
CVBM
15
1
0
12 Aug 2022
A Comprehensive Survey of Natural Language Generation Advances from the Perspective of Digital Deception
Keenan I. Jones
Enes Altuncu
V. N. Franqueira
Yi-Chia Wang
Shujun Li
DeLMO
34
3
0
11 Aug 2022
End-to-end deep learning for directly estimating grape yield from ground-based imagery
A. Olenskyj
B. Sams
Zhenghao Fei
Vishal Singh
P. Raja
G. Bornhorst
J. M. Earles
26
28
0
04 Aug 2022
Visual Recognition by Request
Chufeng Tang
Lingxi Xie
Xiaopeng Zhang
Xiaolin Hu
Qi Tian
VLM
16
15
0
28 Jul 2022
Controllable Data Generation by Deep Learning: A Review
Shiyu Wang
Yuanqi Du
Xiaojie Guo
Bo Pan
Zhaohui Qin
Liang Zhao
29
28
0
19 Jul 2022
Relational Future Captioning Model for Explaining Likely Collisions in Daily Tasks
Motonari Kambara
K. Sugiura
17
6
0
19 Jul 2022
Exploring Adversarial Examples and Adversarial Robustness of Convolutional Neural Networks by Mutual Information
Jiebao Zhang
Wenhua Qian
Ren-qi Nie
Jinde Cao
Dan Xu
GAN
AAML
17
0
0
12 Jul 2022
Vision-and-Language Pretraining
Thong Nguyen
Cong-Duy Nguyen
Xiaobao Wu
See-Kiong Ng
A. Luu
VLM
CLIP
19
2
0
05 Jul 2022
Gender Artifacts in Visual Datasets
Nicole Meister
Dora Zhao
Angelina Wang
V. V. Ramaswamy
Ruth C. Fong
Olga Russakovsky
21
28
0
18 Jun 2022
Image Captioning based on Feature Refinement and Reflective Decoding
G. Alabduljabbar
Hafida Benhidour
Said Kerrache
3DV
14
3
0
16 Jun 2022
Video-based Human-Object Interaction Detection from Tubelet Tokens
Danyang Tu
Wei Sun
Xiongkuo Min
Guangtao Zhai
Wei Shen
ViT
13
15
0
04 Jun 2022
A Generative Adversarial Network-based Selective Ensemble Characteristic-to-Expression Synthesis (SE-CTES) Approach and Its Applications in Healthcare
Yuxuan Li
Ying-Jia Lin
Chenang Liu
23
0
0
29 May 2022
Prompt-based Learning for Unpaired Image Captioning
Peipei Zhu
Xiao Wang
Lin Zhu
Zhenglong Sun
Weishi Zheng
Yaowei Wang
C. L. P. Chen
VLM
21
31
0
26 May 2022
Beyond Greedy Search: Tracking by Multi-Agent Reinforcement Learning-based Beam Search
Xiao Wang
Zhe Chen
Bo Jiang
Jin Tang
B. Luo
Dacheng Tao
37
18
0
19 May 2022
Efficient Gesture Recognition for the Assistance of Visually Impaired People using Multi-Head Neural Networks
Samer Alashhab
Antonio Javier Gallego
Miguel Ángel Lozano
19
16
0
14 May 2022
Translation between Molecules and Natural Language
Carl N. Edwards
T. Lai
Kevin Ros
Garrett Honke
Kyunghyun Cho
Heng Ji
25
155
0
25 Apr 2022
Visual Attention Methods in Deep Learning: An In-Depth Survey
Mohammed Hassanin
Saeed Anwar
Ibrahim Radwan
F. Khan
Ajmal Saeed Mian
19
145
0
16 Apr 2022
Guiding Attention using Partial-Order Relationships for Image Captioning
Murad Popattia
Muhammad Rafi
Rizwan Qureshi
Shah Nawaz
19
4
0
15 Apr 2022
Image Captioning In the Transformer Age
Yangliu Xu
Li Li
Haiyang Xu
Songfang Huang
Fei Huang
Jianfei Cai
ViT
14
5
0
15 Apr 2022
Vision Transformers in Medical Computer Vision -- A Contemplative Retrospection
Arshi Parvaiz
Muhammad Anwaar Khalid
Rukhsana Zafar
Huma Ameer
M. Ali
M. Fraz
MedIm
11
59
0
29 Mar 2022
Interactive Robotic Grasping with Attribute-Guided Disambiguation
Yang Yang
Xibai Lou
Changhyun Choi
11
30
0
15 Mar 2022
Unpaired Image Captioning by Image-level Weakly-Supervised Visual Concept Recognition
Peipei Zhu
Xiao Wang
Yong Luo
Zhenglong Sun
Wei-Shi Zheng
Yaowei Wang
C. L. P. Chen
22
12
0
07 Mar 2022
A Review of Emerging Research Directions in Abstract Visual Reasoning
Mikołaj Małkiński
Jacek Mańdziuk
23
38
0
21 Feb 2022
ACORT: A Compact Object Relation Transformer for Parameter Efficient Image Captioning
J. Tan
Y. Tan
C. Chan
Joon Huang Chuah
VLM
ViT
16
15
0
11 Feb 2022
Deep Learning Approaches on Image Captioning: A Review
Taraneh Ghandi
H. Pourreza
H. Mahyar
VLM
8
89
0
31 Jan 2022
A Frustratingly Simple Approach for End-to-End Image Captioning
Ziyang Luo
Yadong Xi
Rongsheng Zhang
Jing Ma
VLM
MLLM
22
16
0
30 Jan 2022
Automatic Audio Captioning using Attention weighted Event based Embeddings
Swapnil Bhosale
Rupayan Chakraborty
Sunil Kumar Kopparapu
26
0
0
28 Jan 2022
Beyond Simple Meta-Learning: Multi-Purpose Models for Multi-Domain, Active and Continual Few-Shot Learning
Peyman Bateni
Jarred Barber
Raghav Goyal
Vaden Masrani
Jan Willem van de Meent
Leonid Sigal
Frank D. Wood
BDL
VLM
42
9
0
13 Jan 2022
Technical Language Supervision for Intelligent Fault Diagnosis in Process Industry
Karl Lowenmark
C. Taal
S. Schnabel
Marcus Liwicki
Fredrik Sandin
13
7
0
11 Dec 2021
Multimodal Fake News Detection
Santiago Alonso-Bartolome
Isabel Segura-Bedmar
17
60
0
09 Dec 2021
Neural Attention for Image Captioning: Review of Outstanding Methods
Zanyar Zohourianshahzadi
Jugal Kalita
VLM
19
45
0
29 Nov 2021
Multi-Glimpse Network: A Robust and Efficient Classification Architecture based on Recurrent Downsampled Attention
S. Tan
Runpei Dong
Kaisheng Ma
22
2
0
03 Nov 2021
Deep Learning in Human Activity Recognition with Wearable Sensors: A Review on Advances
Shibo Zhang
Yaxuan Li
Shen Zhang
Farzad Shahabi
S. Xia
Yuanbei Deng
N. Alshurafa
BDL
20
295
0
31 Oct 2021
End-to-End Supermask Pruning: Learning to Prune Image Captioning Models
J. Tan
C. Chan
Joon Huang Chuah
VLM
49
16
0
07 Oct 2021
Learning Structural Representations for Recipe Generation and Food Retrieval
Hao Wang
Guosheng Lin
S. Hoi
C. Miao
16
28
0
04 Oct 2021
Similar Scenes arouse Similar Emotions: Parallel Data Augmentation for Stylized Image Captioning
Guodun Li
Yuchen Zhai
Zehao Lin
Yin Zhang
43
21
0
26 Aug 2021
INVIGORATE: Interactive Visual Grounding and Grasping in Clutter
Hanbo Zhang
Yunfan Lu
Cunjun Yu
David Hsu
Xuguang Lan
Nanning Zheng
LM&Ro
18
63
0
25 Aug 2021
Explainable Reinforcement Learning for Broad-XAI: A Conceptual Framework and Survey
Richard Dazeley
Peter Vamplew
Francisco Cruz
24
59
0
20 Aug 2021