2205.04363
Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for Image Captioning
9 May 2022
Chia-Wen Kuo
Z. Kira
Papers citing
"Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for Image Captioning"
9 / 9 papers shown
Title
Group-based Distinctive Image Captioning with Memory Difference Encoding and Attention
Jiuniu Wang
Wenjia Xu
Qingzhong Wang
Antoni B. Chan
38
0
0
03 Apr 2025
MeaCap: Memory-Augmented Zero-shot Image Captioning
Zequn Zeng
Yan Xie
Hao Zhang
Chiyu Chen
Zhengjue Wang
Boli Chen
VLM
25
14
0
06 Mar 2024
HAAV: Hierarchical Aggregation of Augmented Views for Image Captioning
Chia-Wen Kuo
Z. Kira
27
21
0
25 May 2023
A request for clarity over the End of Sequence token in the Self-Critical Sequence Training
J. Hu
Roberto Cavicchioli
Alessandro Capotondi
24
6
0
20 May 2023
CLIP-GCD: Simple Language Guided Generalized Category Discovery
Rabah Ouldnoughi
Chia-Wen Kuo
Z. Kira
VLM
13
14
0
17 May 2023
KERM: Knowledge Enhanced Reasoning for Vision-and-Language Navigation
Xiangyang Li
Zihan Wang
Jiahao Yang
Yaowei Wang
Shuqiang Jiang
LM&Ro
13
37
0
28 Mar 2023
Improving Image Captioning by Leveraging Intra- and Inter-layer Global Representation in Transformer Network
Jiayi Ji
Yunpeng Luo
Xiaoshuai Sun
Fuhai Chen
Gen Luo
Yongjian Wu
Yue Gao
Rongrong Ji
ViT
41
170
0
13 Dec 2020
Improved Baselines with Momentum Contrastive Learning
Xinlei Chen
Haoqi Fan
Ross B. Girshick
Kaiming He
SSL
252
3,369
0
09 Mar 2020
Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning
Jiasen Lu
Caiming Xiong
Devi Parikh
R. Socher
85
1,442
0
06 Dec 2016