ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1906.05963
  4. Cited By
Image Captioning: Transforming Objects into Words

Image Captioning: Transforming Objects into Words

14 June 2019
Simão Herdade
Armin Kappeler
K. Boakye
Joao Soares
    ViT
ArXivPDFHTML

Papers citing "Image Captioning: Transforming Objects into Words"

50 / 161 papers shown
Title
ChatCAD: Interactive Computer-Aided Diagnosis on Medical Image using
  Large Language Models
ChatCAD: Interactive Computer-Aided Diagnosis on Medical Image using Large Language Models
Sheng Wang
Zihao Zhao
Xi Ouyang
Qian Wang
Dinggang Shen
LM&MA
MedIm
24
139
0
14 Feb 2023
Multi-modal Machine Learning in Engineering Design: A Review and Future
  Directions
Multi-modal Machine Learning in Engineering Design: A Review and Future Directions
Binyang Song
Ruilin Zhou
Faez Ahmed
AI4CE
35
40
0
14 Feb 2023
Towards Local Visual Modeling for Image Captioning
Towards Local Visual Modeling for Image Captioning
Yiwei Ma
Jiayi Ji
Xiaoshuai Sun
Yiyi Zhou
R. Ji
ViT
19
71
0
13 Feb 2023
An Image captioning algorithm based on the Hybrid Deep Learning
  Technique (CNN+GRU)
An Image captioning algorithm based on the Hybrid Deep Learning Technique (CNN+GRU)
Rana Adnan Ahmad
Muhammad Azhar
Hina Sattar
21
10
0
06 Jan 2023
Adaptively Clustering Neighbor Elements for Image-Text Generation
Adaptively Clustering Neighbor Elements for Image-Text Generation
Zihua Wang
Xu Yang
Hanwang Zhang
Haiyang Xu
Mingshi Yan
Feisi Huang
Yu Zhang
VLM
80
0
0
05 Jan 2023
Noise-aware Learning from Web-crawled Image-Text Data for Image
  Captioning
Noise-aware Learning from Web-crawled Image-Text Data for Image Captioning
Woohyun Kang
Jonghwan Mun
Sungjun Lee
Byungseok Roh
VLM
6
18
0
27 Dec 2022
Semantic-Conditional Diffusion Networks for Image Captioning
Semantic-Conditional Diffusion Networks for Image Captioning
Jianjie Luo
Yehao Li
Yingwei Pan
Ting Yao
Jianlin Feng
Hongyang Chao
Tao Mei
DiffM
22
62
0
06 Dec 2022
Unified Discrete Diffusion for Simultaneous Vision-Language Generation
Unified Discrete Diffusion for Simultaneous Vision-Language Generation
Minghui Hu
Chuanxia Zheng
Heliang Zheng
Tat-Jen Cham
Chaoyue Wang
Zuopeng Yang
Dacheng Tao
Ponnuthurai Nagaratnam Suganthan
DiffM
18
23
0
27 Nov 2022
CLID: Controlled-Length Image Descriptions with Limited Data
CLID: Controlled-Length Image Descriptions with Limited Data
Elad Hirsch
A. Tal
VLM
3DV
14
4
0
27 Nov 2022
Exploring Discrete Diffusion Models for Image Captioning
Exploring Discrete Diffusion Models for Image Captioning
Zixin Zhu
Yixuan Wei
Jianfeng Wang
Zhe Gan
Zheng-Wei Zhang
Le Wang
G. Hua
Lijuan Wang
Zicheng Liu
Han Hu
DiffM
VLM
23
17
0
21 Nov 2022
Detect Only What You Specify : Object Detection with Linguistic Target
Detect Only What You Specify : Object Detection with Linguistic Target
Moyuru Yamada
ObjD
VLM
15
0
0
18 Nov 2022
VieCap4H-VLSP 2021: ObjectAoA-Enhancing performance of Object Relation
  Transformer with Attention on Attention for Vietnamese image captioning
VieCap4H-VLSP 2021: ObjectAoA-Enhancing performance of Object Relation Transformer with Attention on Attention for Vietnamese image captioning
Nghia Hieu Nguyen
Duong T.D. Vo
Minh-Quan Ha
ViT
25
1
0
10 Nov 2022
OSIC: A New One-Stage Image Captioner Coined
OSIC: A New One-Stage Image Captioner Coined
Bo Wang
Zhao Zhang
Ming Zhao
Xiaojie Jin
Mingliang Xu
Meng Wang
VLM
23
3
0
04 Nov 2022
Text-Only Training for Image Captioning using Noise-Injected CLIP
Text-Only Training for Image Captioning using Noise-Injected CLIP
David Nukrai
Ron Mokady
Amir Globerson
VLM
CLIP
52
94
0
01 Nov 2022
Visual Spatial Description: Controlled Spatial-Oriented Image-to-Text
  Generation
Visual Spatial Description: Controlled Spatial-Oriented Image-to-Text Generation
Yu Zhao
Jianguo Wei
Zhichao Lin
Yueheng Sun
Meishan Zhang
M. Zhang
17
16
0
20 Oct 2022
Prophet Attention: Predicting Attention with Future Attention for Image
  Captioning
Prophet Attention: Predicting Attention with Future Attention for Image Captioning
Fenglin Liu
Xuancheng Ren
Xian Wu
Wei Fan
Yuexian Zou
Xu Sun
24
46
0
19 Oct 2022
Learning to Collocate Visual-Linguistic Neural Modules for Image
  Captioning
Learning to Collocate Visual-Linguistic Neural Modules for Image Captioning
Xu Yang
Hanwang Zhang
Chongyang Gao
Jianfei Cai
MLLM
37
10
0
04 Oct 2022
PromptCast: A New Prompt-based Learning Paradigm for Time Series
  Forecasting
PromptCast: A New Prompt-based Learning Paradigm for Time Series Forecasting
Hao Xue
Flora D.Salim
AI4TS
16
137
0
20 Sep 2022
Belief Revision based Caption Re-ranker with Visual Semantic Information
Belief Revision based Caption Re-ranker with Visual Semantic Information
Ahmed Sabir
Francesc Moreno-Noguer
Pranava Madhyastha
Lluís Padró
BDL
14
2
0
16 Sep 2022
vieCap4H-VLSP 2021: Vietnamese Image Captioning for Healthcare Domain
  using Swin Transformer and Attention-based LSTM
vieCap4H-VLSP 2021: Vietnamese Image Captioning for Healthcare Domain using Swin Transformer and Attention-based LSTM
THANH VAN NGUYEN
Long H. Nguyen
Nhat Truong Pham
Liu Tai Nguyen
Van Huong Do
Hai Nguyen
Ngoc Duy Nguyen
VLM
ViT
13
1
0
03 Sep 2022
Exploiting Multiple Sequence Lengths in Fast End to End Training for
  Image Captioning
Exploiting Multiple Sequence Lengths in Fast End to End Training for Image Captioning
J. Hu
Roberto Cavicchioli
Alessandro Capotondi
23
21
0
13 Aug 2022
Retrieval-Augmented Transformer for Image Captioning
Retrieval-Augmented Transformer for Image Captioning
Sara Sarto
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
19
57
0
26 Jul 2022
A Guide to Image and Video based Small Object Detection using Deep
  Learning : Case Study of Maritime Surveillance
A Guide to Image and Video based Small Object Detection using Deep Learning : Case Study of Maritime Surveillance
Aref Miri Rekavandi
Lian Xu
F. Boussaïd
A. Seghouane
Stephen Hoefs
Bennamoun
ObjD
18
17
0
26 Jul 2022
Efficient Modeling of Future Context for Image Captioning
Efficient Modeling of Future Context for Image Captioning
Zhengcong Fei
Junshi Huang
Xiaoming Wei
Xiaolin K. Wei
29
14
0
22 Jul 2022
GRIT: Faster and Better Image captioning Transformer Using Dual Visual
  Features
GRIT: Faster and Better Image captioning Transformer Using Dual Visual Features
Van-Quang Nguyen
Masanori Suganuma
Takayuki Okatani
ViT
25
106
0
20 Jul 2022
A Baseline for Detecting Out-of-Distribution Examples in Image
  Captioning
A Baseline for Detecting Out-of-Distribution Examples in Image Captioning
Gabi Shalev
Gal-Lev Shalev
Joseph Keshet
OODD
19
7
0
12 Jul 2022
Exploring the sequence length bottleneck in the Transformer for Image
  Captioning
Exploring the sequence length bottleneck in the Transformer for Image Captioning
Jiapeng Hu
Roberto Cavicchioli
Alessandro Capotondi
ViT
33
3
0
07 Jul 2022
ZoDIAC: Zoneout Dropout Injection Attention Calculation
ZoDIAC: Zoneout Dropout Injection Attention Calculation
Zanyar Zohourianshahzadi
Jugal Kalita
26
0
0
28 Jun 2022
Comprehending and Ordering Semantics for Image Captioning
Comprehending and Ordering Semantics for Image Captioning
Yehao Li
Yingwei Pan
Ting Yao
Tao Mei
13
87
0
14 Jun 2022
Multimodal Learning with Transformers: A Survey
Multimodal Learning with Transformers: A Survey
P. Xu
Xiatian Zhu
David A. Clifton
ViT
47
525
0
13 Jun 2022
Modeling Image Composition for Complex Scene Generation
Modeling Image Composition for Complex Scene Generation
Zuopeng Yang
Daqing Liu
Chaoyue Wang
J. Yang
Dacheng Tao
ViT
34
50
0
02 Jun 2022
Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual
  Context for Image Captioning
Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for Image Captioning
Chia-Wen Kuo
Z. Kira
16
52
0
09 May 2022
Controllable Image Captioning
Luka Maxwell
28
0
0
28 Apr 2022
Translation between Molecules and Natural Language
Translation between Molecules and Natural Language
Carl N. Edwards
T. Lai
Kevin Ros
Garrett Honke
Kyunghyun Cho
Heng Ji
25
155
0
25 Apr 2022
Spatiality-guided Transformer for 3D Dense Captioning on Point Clouds
Spatiality-guided Transformer for 3D Dense Captioning on Point Clouds
Heng Wang
Chaoyi Zhang
Jianhui Yu
Weidong (Tom) Cai
3DPC
17
38
0
22 Apr 2022
Situational Perception Guided Image Matting
Situational Perception Guided Image Matting
Bo Xu
Jiake Xie
Han Huang
Zi-Jun Li
Cheng Lu
Yong Tang
Yandong Guo
17
3
0
20 Apr 2022
Towards Lightweight Transformer via Group-wise Transformation for
  Vision-and-Language Tasks
Towards Lightweight Transformer via Group-wise Transformation for Vision-and-Language Tasks
Gen Luo
Yiyi Zhou
Xiaoshuai Sun
Yan Wang
Liujuan Cao
Yongjian Wu
Feiyue Huang
Rongrong Ji
ViT
12
43
0
16 Apr 2022
Guiding Attention using Partial-Order Relationships for Image Captioning
Guiding Attention using Partial-Order Relationships for Image Captioning
Murad Popattia
Muhammad Rafi
Rizwan Qureshi
Shah Nawaz
19
4
0
15 Apr 2022
End-to-End Transformer Based Model for Image Captioning
End-to-End Transformer Based Model for Image Captioning
Yiyu Wang
Jungang Xu
Yingfei Sun
VLM
ViT
26
117
0
29 Mar 2022
Semantic Distillation Guided Salient Object Detection
Semantic Distillation Guided Salient Object Detection
Bo Xu
Guanze Liu
Han Huang
Cheng Lu
Yandong Guo
8
2
0
08 Mar 2022
Vision-Language Intelligence: Tasks, Representation Learning, and Large
  Models
Vision-Language Intelligence: Tasks, Representation Learning, and Large Models
Feng Li
Hao Zhang
Yi-Fan Zhang
S. Liu
Jian Guo
L. Ni
Pengchuan Zhang
Lei Zhang
AI4TS
VLM
24
36
0
03 Mar 2022
Enhancing Satellite Imagery using Deep Learning for the Sensor To
  Shooter Timeline
Enhancing Satellite Imagery using Deep Learning for the Sensor To Shooter Timeline
Matthew Ciolino
Dom Hambrick
David A. Noever
103
0
0
28 Feb 2022
CaMEL: Mean Teacher Learning for Image Captioning
CaMEL: Mean Teacher Learning for Image Captioning
Manuele Barraco
Matteo Stefanini
Marcella Cornia
S. Cascianelli
Lorenzo Baraldi
Rita Cucchiara
ViT
VLM
25
27
0
21 Feb 2022
XFBoost: Improving Text Generation with Controllable Decoders
XFBoost: Improving Text Generation with Controllable Decoders
Xiangyu Peng
Michael Sollami
17
1
0
16 Feb 2022
ACORT: A Compact Object Relation Transformer for Parameter Efficient
  Image Captioning
ACORT: A Compact Object Relation Transformer for Parameter Efficient Image Captioning
J. Tan
Y. Tan
C. Chan
Joon Huang Chuah
VLM
ViT
18
15
0
11 Feb 2022
Describing image focused in cognitive and visual details for visually
  impaired people: An approach to generating inclusive paragraphs
Describing image focused in cognitive and visual details for visually impaired people: An approach to generating inclusive paragraphs
Daniel Louzada Fernandes
Marcos Henrique Fonseca Ribeiro
F. Cerqueira
Michel Melo Silva
12
6
0
10 Feb 2022
Deep Learning Approaches on Image Captioning: A Review
Deep Learning Approaches on Image Captioning: A Review
Taraneh Ghandi
H. Pourreza
H. Mahyar
VLM
8
89
0
31 Jan 2022
ERNIE-ViLG: Unified Generative Pre-training for Bidirectional
  Vision-Language Generation
ERNIE-ViLG: Unified Generative Pre-training for Bidirectional Vision-Language Generation
Han Zhang
Weichong Yin
Yewei Fang
Lanxin Li
Boqiang Duan
Zhihua Wu
Yu Sun
Hao Tian
Hua-Hong Wu
Haifeng Wang
27
58
0
31 Dec 2021
TransZero++: Cross Attribute-Guided Transformer for Zero-Shot Learning
TransZero++: Cross Attribute-Guided Transformer for Zero-Shot Learning
Shiming Chen
Zi-Quan Hong
Wenjin Hou
Guosen Xie
Yibing Song
Jian-jun Zhao
Xinge You
Shuicheng Yan
Ling Shao
ViT
17
44
0
16 Dec 2021
TransZero: Attribute-guided Transformer for Zero-Shot Learning
TransZero: Attribute-guided Transformer for Zero-Shot Learning
Shiming Chen
Ziming Hong
Yang Liu
Guosen Xie
Baigui Sun
Hao Li
Qinmu Peng
Kelvin Lu
Xinge You
ViT
37
131
0
03 Dec 2021
Previous
1234
Next