2203.15350
End-to-End Transformer Based Model for Image Captioning
29 March 2022
Yiyu Wang
Jungang Xu
Yingfei Sun
VLM
ViT
Papers citing "End-to-End Transformer Based Model for Image Captioning"
31 / 31 papers shown
Title
ViTOC: Vision Transformer and Object-aware Captioner
Feiyang Huang
30
0
0
09 Nov 2024
EVC-MF: End-to-end Video Captioning Network with Multi-scale Features
Tian-Zi Niu
Zhen-Duo Chen
Xin Luo
Xin-Shun Xu
26
0
0
22 Oct 2024
Shifted Window Fourier Transform And Retention For Image Captioning
J. Hu
Roberto Cavicchioli
Alessandro Capotondi
VLM
29
0
0
25 Aug 2024
Surveying the Landscape of Image Captioning Evaluation: A Comprehensive Taxonomy, Trends and Metrics Analysis
Uri Berger
Gabriel Stanovsky
Omri Abend
Lea Frermann
27
0
0
09 Aug 2024
HICEScore: A Hierarchical Metric for Image Captioning Evaluation
Zequn Zeng
Jianqiao Sun
Hao Zhang
Tiansheng Wen
Yudi Su
Yan Xie
Zhengjue Wang
Boli Chen
44
3
0
26 Jul 2024
Channel-Partitioned Windowed Attention And Frequency Learning for Single Image Super-Resolution
Dinh Phu Tran
Dao Duy Hung
Daeyoung Kim
SupR
35
0
0
23 Jul 2024
Graph Transformers: A Survey
Ahsan Shehzad
Feng Xia
Shagufta Abid
Ciyuan Peng
Shuo Yu
Dongyu Zhang
Karin Verspoor
AI4CE
29
9
0
13 Jul 2024
LEMoN: Label Error Detection using Multimodal Neighbors
Haoran Zhang
Aparna Balagopalan
Nassim Oufattole
Hyewon Jeong
Yan Wu
Jiacheng Zhu
Marzyeh Ghassemi
44
0
0
10 Jul 2024
Text Data-Centric Image Captioning with Interactive Prompts
Yiyu Wang
Hao Luo
Jungang Xu
Yingfei Sun
Fan Wang
VLM
30
0
0
28 Mar 2024
Transformer based Multitask Learning for Image Captioning and Object Detection
Debolena Basak
P. K. Srijith
M. Desarkar
16
1
0
10 Mar 2024
GestaltMML: Enhancing Rare Genetic Disease Diagnosis through Multimodal Machine Learning Combining Facial Images and Clinical Texts
Da Wu
Jing Yang
Cong Liu
Tzung-Chien Hsieh
E. Marchi
...
Wendy K. Chung
G. Lyon
Ian D. Krantz
J. Kalish
Kai Wang
34
2
0
23 Dec 2023
Cycle-Consistency Learning for Captioning and Grounding
Ning Wang
Jiajun Deng
Mingbo Jia
ObjD
23
7
0
23 Dec 2023
Bootstrapping Interactive Image-Text Alignment for Remote Sensing Image Captioning
Cong Yang
Zuchao Li
Lefei Zhang
29
23
0
02 Dec 2023
What a Whole Slide Image Can Tell? Subtype-guided Masked Transformer for Pathological Image Captioning
Wenkang Qin
Rui Xu
Peixiang Huang
Xiaomin Wu
Heyu Zhang
Lin Luo
9
7
0
31 Oct 2023
PRIOR: Prototype Representation Joint Learning from Medical Images and Reports
Pujin Cheng
Li Lin
Junyan Lyu
Yijin Huang
Wenhan Luo
Xiaoying Tang
MedIm
26
45
0
24 Jul 2023
Embedded Heterogeneous Attention Transformer for Cross-lingual Image Captioning
Zijie Song
Zhenzhen Hu
Yuanen Zhou
Ye Zhao
Richang Hong
Meng Wang
19
2
0
19 Jul 2023
Image Captioning with Multi-Context Synthetic Data
Feipeng Ma
Y. Zhou
Fengyun Rao
Yueyi Zhang
Xiaoyan Sun
DiffM
25
7
0
29 May 2023
A request for clarity over the End of Sequence token in the Self-Critical Sequence Training
J. Hu
Roberto Cavicchioli
Alessandro Capotondi
24
6
0
20 May 2023
Transforming Visual Scene Graphs to Image Captions
Xu Yang
Jiawei Peng
Zihua Wang
Haiyang Xu
Qinghao Ye
Chenliang Li
Mingshi Yan
Feisi Huang
Zhangzikang Li
Yu Zhang
39
19
0
03 May 2023
Tuning computer vision models with task rewards
André Susano Pinto
Alexander Kolesnikov
Yuge Shi
Lucas Beyer
Xiaohua Zhai
VLM
25
40
0
16 Feb 2023
Multi-modal Machine Learning in Engineering Design: A Review and Future Directions
Binyang Song
Ruilin Zhou
Faez Ahmed
AI4CE
35
40
0
14 Feb 2023
Adaptively Clustering Neighbor Elements for Image-Text Generation
Zihua Wang
Xu Yang
Hanwang Zhang
Haiyang Xu
Mingshi Yan
Feisi Huang
Yu Zhang
VLM
80
0
0
05 Jan 2023
Efficient Image Captioning for Edge Devices
Ning Wang
Jiangrong Xie
Hangzai Luo
Qinglin Cheng
Jihao Wu
Mingbo Jia
Linlin Li
VLM
CLIP
23
20
0
18 Dec 2022
Controllable Image Captioning via Prompting
Ning Wang
Jiahao Xie
Jihao Wu
Mingbo Jia
Linlin Li
14
23
0
04 Dec 2022
Soft Alignment Objectives for Robust Adaptation of Language Generation
Michal Štefánik
Marek Kadlčík
Petr Sojka
22
2
0
29 Nov 2022
Progressive Tree-Structured Prototype Network for End-to-End Image Captioning
Pengpeng Zeng
Jinkuan Zhu
Jingkuan Song
Lianli Gao
VLM
22
27
0
17 Nov 2022
OSIC: A New One-Stage Image Captioner Coined
Bo Wang
Zhao Zhang
Ming Zhao
Xiaojie Jin
Mingliang Xu
Meng Wang
VLM
23
3
0
04 Nov 2022
Exploiting Multiple Sequence Lengths in Fast End to End Training for Image Captioning
J. Hu
Roberto Cavicchioli
Alessandro Capotondi
21
21
0
13 Aug 2022
Deep Learning Approaches on Image Captioning: A Review
Taraneh Ghandi
H. Pourreza
H. Mahyar
VLM
8
89
0
31 Jan 2022
Boosting Entity-aware Image Captioning with Multi-modal Knowledge Graph
Wentian Zhao
Yao Hu
Heda Wang
Xinxiao Wu
Jiebo Luo
18
47
0
26 Jul 2021
Improving Image Captioning by Leveraging Intra- and Inter-layer Global Representation in Transformer Network
Jiayi Ji
Yunpeng Luo
Xiaoshuai Sun
Fuhai Chen
Gen Luo
Yongjian Wu
Yue Gao
Rongrong Ji
ViT
41
170
0
13 Dec 2020