2101.10804
CPTR: Full Transformer Network for Image Captioning
26 January 2021
Wei Liu
Sihan Chen
Longteng Guo
Xinxin Zhu
Jing Liu
ViT
Papers citing "CPTR: Full Transformer Network for Image Captioning"
50 / 50 papers shown
SplineFormer: An Explainable Transformer-Based Approach for Autonomous Endovascular Navigation
Tudor Jianu
Shayan Doust
Mengyun Li
Baoru Huang
Tuong Khanh Long Do
...
Karl Bates
Tung D. Ta
S. Fichera
Pierre Berthet-Rayne
Anh Nguyen
MedIm
121
1
0
08 Jan 2025
ViTOC: Vision Transformer and Object-aware Captioner
Feiyang Huang
311
2
0
09 Nov 2024
CoVLM: Leveraging Consensus from Vision-Language Models for Semi-supervised Multi-modal Fake News Detection
Asian Conference on Computer Vision (ACCV), 2024
Devank
Jayateja Kalla
Soma Biswas
134
2
0
06 Oct 2024
PaveCap: The First Multimodal Framework for Comprehensive Pavement Condition Assessment with Dense Captioning and PCI Estimation
Blessing Agyei Kyem
Eugene Kofi Okrah Denteh
Joshua Kofi Asamoah
Armstrong Aboah
76
6
0
07 Aug 2024
Figuring out Figures: Using Textual References to Caption Scientific Figures
Stanley Cao
Kevin Liu
172
0
0
25 Jun 2024
Towards Retrieval-Augmented Architectures for Image Captioning
Sara Sarto
Marcella Cornia
Lorenzo Baraldi
Alessandro Nicolosi
Rita Cucchiara
VLM
178
17
0
21 May 2024
Generative Multi-modal Models are Good Class-Incremental Learners
Xusheng Cao
Haori Lu
Linlan Huang
Xialei Liu
Ming-Ming Cheng
CLL
221
23
0
27 Mar 2024
Image Captioning in news report scenario
Tianrui Liu
Qi Cai
Changxin Xu
Bo Hong
Jize Xiong
Yuxin Qiao
Tsungwei Yang
262
17
0
24 Mar 2024
CarbonNet: How Computer Vision Plays a Role in Climate Change? Application: Learning Geomechanics from Subsurface Geometry of CCS to Mitigate Global Warming
Journal of Robotics and Automation Research (JRAR), 2024
Wei Chen
Yun Li
Yuan Tian
AI4CE
160
0
0
09 Mar 2024
Rule-driven News Captioning
Ning Xu
Tingting Zhang
Hongshuo Tian
An-An Liu
176
1
0
08 Mar 2024
Radiology Report Generation Using Transformers Conditioned with Non-imaging Data
Nurbanu Aksoy
Nishant Ravikumar
Alejandro F Frangi
ViT
MedIm
57
14
0
18 Nov 2023
Beyond Images: An Integrative Multi-modal Approach to Chest X-Ray Report Generation
Nurbanu Aksoy
Serge Sharoff
Selçuk Başer
Nishant Ravikumar
Alejandro F Frangi
MedIm
121
6
0
18 Nov 2023
Vote2Cap-DETR++: Decoupling Localization and Describing for End-to-End 3D Dense Captioning
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Sijin Chen
Erik Cambria
Mingsheng Li
Xin Chen
Peng Guo
Yinjie Lei
Gang Yu
Taihao Li
Tao Chen
252
39
0
06 Sep 2023
GrowCLIP: Data-aware Automatic Model Growing for Large-scale Contrastive Language-Image Pre-training
IEEE International Conference on Computer Vision (ICCV), 2023
Xi Deng
Han Shi
Runhu Huang
Changlin Li
Hang Xu
Jianhua Han
James T. Kwok
Shen Zhao
Wei Zhang
Xiaodan Liang
CLIP
VLM
170
3
0
22 Aug 2023
A Comprehensive Analysis of Real-World Image Captioning and Scene Identification
Sai Suprabhanu Nallapaneni
Subrahmanyam Konakanchi
154
2
0
05 Aug 2023
PRIOR: Prototype Representation Joint Learning from Medical Images and Reports
IEEE International Conference on Computer Vision (ICCV), 2023
Pujin Cheng
Li Lin
Junyan Lyu
Yijin Huang
Tong Lu
Xiaoying Tang
MedIm
332
77
0
24 Jul 2023
DetCLIPv2: Scalable Open-Vocabulary Object Detection Pre-training via Word-Region Alignment
Computer Vision and Pattern Recognition (CVPR), 2023
Lewei Yao
Jianhua Han
Xiaodan Liang
Danqian Xu
Wei Zhang
Zhenguo Li
Hang Xu
VLM
ObjD
CLIP
255
99
0
10 Apr 2023
SEM-POS: Grammatically and Semantically Correct Video Captioning
Asmar Nadeem
A. Hilton
R. Dawes
Graham A. Thomas
A. Mustafa
153
10
0
26 Mar 2023
A Complete Survey on Generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 All You Need?
Chaoning Zhang
Chenshuang Zhang
Sheng Zheng
Yu Qiao
Chenghao Li
...
Lik-Hang Lee
Yang Yang
Heng Tao Shen
In So Kweon
Choong Seon Hong
267
190
0
21 Mar 2023
Retrieval-augmented Image Captioning
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
R. Ramos
Desmond Elliott
Bruno Martins
VLM
127
41
0
16 Feb 2023
Adjacent-Level Feature Cross-Fusion With 3-D CNN for Remote Sensing Image Change Detection
IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2023
Y. Ye
Mengmeng Wang
Liang Zhou
Guangyang Lei
Jianwei Fan
Yao Qin
3DPC
101
56
0
10 Feb 2023
Embodied Agents for Efficient Exploration and Smart Scene Description
IEEE International Conference on Robotics and Automation (ICRA), 2023
Roberto Bigazzi
Marcella Cornia
S. Cascianelli
Lorenzo Baraldi
Rita Cucchiara
LM&Ro
137
9
0
17 Jan 2023
End-to-End 3D Dense Captioning with Vote2Cap-DETR
Computer Vision and Pattern Recognition (CVPR), 2023
Sijin Chen
Erik Cambria
Xin Chen
Yinjie Lei
Tao Chen
Gang Yu
ViT
182
81
0
06 Jan 2023
Exploring Efficient Few-shot Adaptation for Vision Transformers
C. Xu
Siqian Yang
Yabiao Wang
Zhanxiong Wang
Yanwei Fu
Xiangyang Xue
155
23
0
06 Jan 2023
Adaptively Clustering Neighbor Elements for Image-Text Generation
Zihua Wang
Xu Yang
Hanwang Zhang
Haiyang Xu
Mingshi Yan
Feisi Huang
Yu Zhang
VLM
401
0
0
05 Jan 2023
Using Human Perception to Regularize Transfer Learning
Justin Dulay
Walter J. Scheirer
156
8
0
15 Nov 2022
Retrieval-Augmented Transformer for Image Captioning
International Conference on Content-Based Multimedia Indexing (CBMI), 2022
Sara Sarto
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
140
68
0
26 Jul 2022
BOSS: Bottom-up Cross-modal Semantic Composition with Hybrid Counterfactual Training for Robust Content-based Image Retrieval
Wenqiao Zhang
Jiannan Guo
Meng Li
Haochen Shi
Shengyu Zhang
Juncheng Li
Siliang Tang
Yueting Zhuang
181
6
0
09 Jul 2022
Are metrics measuring what they should? An evaluation of image captioning task metrics
Signal Processing: Image Communication (SPIC), 2022
Othón González-Chávez
Guillermo Ruiz
Daniela Moctezuma
Tania A. Ramirez-delreal
163
9
0
04 Jul 2022
Automatic Generation of Product-Image Sequence in E-commerce
Knowledge Discovery and Data Mining (KDD), 2022
Xiaochuan Fan
Chi Zhang
Yong-Jie Yang
Yue Shang
Xueying Zhang
Zhen He
Yun Xiao
Bo Long
Lingfei Wu
110
7
0
26 Jun 2022
SYMBA: Symbolic Computation of Squared Amplitudes in High Energy Physics with Machine Learning
Abdulhakim Alnuqaydan
S. Gleyzer
Harrison B. Prosper
311
20
0
17 Jun 2022
Dual Swin-Transformer based Mutual Interactive Network for RGB-D Salient Object Detection
Neurocomputing (Neurocomputing), 2022
Chao Zeng
Sam Kwong
ViT
158
30
0
07 Jun 2022
Causal Transformer for Estimating Counterfactual Outcomes
International Conference on Machine Learning (ICML), 2022
Valentyn Melnychuk
Dennis Frauen
Stefan Feuerriegel
CML
219
127
0
14 Apr 2022
Vision-Language Intelligence: Tasks, Representation Learning, and Large Models
Feng Li
Hao Zhang
Yi-Fan Zhang
Shixuan Liu
Jian Guo
L. Ni
Pengchuan Zhang
Lei Zhang
AI4TS
VLM
153
40
0
03 Mar 2022
CaMEL: Mean Teacher Learning for Image Captioning
International Conference on Pattern Recognition (ICPR), 2022
Manuele Barraco
Matteo Stefanini
Marcella Cornia
S. Cascianelli
Lorenzo Baraldi
Rita Cucchiara
ViT
VLM
152
37
0
21 Feb 2022
Describing image focused in cognitive and visual details for visually impaired people: An approach to generating inclusive paragraphs
VISIGRAPP (VISIGRAPP), 2022
Daniel Louzada Fernandes
Marcos Henrique Fonseca Ribeiro
F. Cerqueira
Michel Melo Silva
101
7
0
10 Feb 2022
Deep Learning Approaches on Image Captioning: A Review
ACM Computing Surveys (ACM CSUR), 2022
Taraneh Ghandi
H. Pourreza
H. Mahyar
VLM
282
138
0
31 Jan 2022
RelTR: Relation Transformer for Scene Graph Generation
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Yuren Cong
M. Yang
Bodo Rosenhahn
ViT
381
172
0
27 Jan 2022
ClipCap: CLIP Prefix for Image Captioning
Ron Mokady
Amir Hertz
Amit H. Bermano
CLIP
VLM
189
781
0
18 Nov 2021
Bangla Image Caption Generation through CNN-Transformer based Encoder-Decoder Network
Yuansan Liu
MD Abdullah Al Nasim
Sourav Saha
Faria Afrin
Raisa Mallik
Sathishkumar Samiappan
ViT
115
16
0
24 Oct 2021
Partially-Supervised Novel Object Captioning Leveraging Context from Paired Data
Shashank Bujimalla
Mahesh Subedar
Omesh Tickoo
166
1
0
10 Sep 2021
Rendezvous: Attention Mechanisms for the Recognition of Surgical Action Triplets in Endoscopic Videos
C. Nwoye
Tong Yu
Cristians Gonzalez
B. Seeliger
Pietro Mascagni
Didier Mutter
J. Marescaux
N. Padoy
219
186
0
07 Sep 2021
Audio Captioning Transformer
Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2021
Xinhao Mei
Xubo Liu
Qiushi Huang
Mark D. Plumbley
Wenwu Wang
ViT
149
88
0
21 Jul 2021
From Show to Tell: A Survey on Deep Learning-based Image Captioning
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Matteo Stefanini
Marcella Cornia
Lorenzo Baraldi
S. Cascianelli
G. Fiameni
Rita Cucchiara
3DV
VLM
MLLM
267
340
0
14 Jul 2021
Recent Advances and Trends in Multimodal Deep Learning: A Review
Jabeen Summaira
Xi Li
Amin Muhammad Shoib
Songyuan Li
Abdul Jabbar
HAI
307
68
0
24 May 2021
Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Meng-Hao Guo
Zheng-Ning Liu
Tai-Jiang Mu
Shimin Hu
165
615
0
05 May 2021
Multiscale Vision Transformers
IEEE International Conference on Computer Vision (ICCV), 2021
Haoqi Fan
Bo Xiong
K. Mangalam
Yanghao Li
Zhicheng Yan
Jitendra Malik
Christoph Feichtenhofer
ViT
419
1,468
0
22 Apr 2021
ImageNet-21K Pretraining for the Masses
T. Ridnik
Emanuel Ben-Baruch
Asaf Noy
Lihi Zelnik-Manor
SSeg
VLM
CLIP
637
834
0
22 Apr 2021
A Survey on Multimodal Disinformation Detection
International Conference on Computational Linguistics (COLING), 2021
Firoj Alam
S. Cresci
Tanmoy Chakraborty
Fabrizio Silvestri
Dimiter Dimitrov
Giovanni Da San Martino
Shaden Shaar
Hamed Firooz
Preslav Nakov
219
115
0
13 Mar 2021
Remote Sensing Image Change Detection with Transformers
IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2021
Hao Chen
Zipeng Qi
Zhenwei Shi
ViT
262
1,266
0
27 Feb 2021