2308.13218
MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning
25 August 2023
Bang-ju Yang
Fenglin Liu
X. Wu
Yaowei Wang
Xu Sun
Yuexian Zou
VLM
CLIP
Papers citing "MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning"
11 / 11 papers shown
Title
The Devil is in the Distributions: Explicit Modeling of Scene Content is Key in Zero-Shot Video Captioning
Mingkai Tian
Guorong Li
Yuankai Qi
Amin Beheshti
J. Shi
Anton van den Hengel
Qingming Huang
VGen
32
0
0
31 Mar 2025
Audio Description Generation in the Era of LLMs and VLMs: A Review of Transferable Generative AI Technologies
Yingqiang Gao
Lukas Fischer
Alexa Lintner
Sarah Ebling
12
0
0
11 Oct 2024
ArcSin: Adaptive ranged cosine Similarity injected noise for Language-Driven Visual Tasks
Yang Liu
Xiaomin Yu
Gongyu Zhang
Christos Bergeles
Prokar Dasgupta
Alejandro Granados
Sebastien Ourselin
27
0
0
27 Feb 2024
Embracing Language Inclusivity and Diversity in CLIP through Continual Language Learning
Bang-ju Yang
Yong Dai
Xuxin Cheng
Yaowei Li
Asif Raza
Yuexian Zou
VLM
21
4
0
30 Jan 2024
Mitigating Open-Vocabulary Caption Hallucinations
Assaf Ben-Kish
Moran Yanuka
Morris Alper
Raja Giryes
Hadar Averbuch-Elor
MLLM
VLM
6
6
0
06 Dec 2023
ML-LMCL: Mutual Learning and Large-Margin Contrastive Learning for Improving ASR Robustness in Spoken Language Understanding
Xuxin Cheng
Bowen Cao
Qichen Ye
Zhihong Zhu
Hongxiang Li
Yuexian Zou
8
25
0
19 Nov 2023
ReadMe++: Benchmarking Multilingual Language Models for Multi-Domain Readability Assessment
Tarek Naous
Michael Joseph Ryan
Anton Lavrouk
Mohit Chandra
Wei-ping Xu
16
3
0
23 May 2023
DeCap: Decoding CLIP Latents for Zero-Shot Captioning via Text-Only Training
Wei Li
Linchao Zhu
Longyin Wen
Yi Yang
VLM
40
81
0
06 Mar 2023
Text-Only Training for Image Captioning using Noise-Injected CLIP
David Nukrai
Ron Mokady
Amir Globerson
VLM
CLIP
35
69
0
01 Nov 2022
CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval
Huaishao Luo
Lei Ji
Ming Zhong
Yang Chen
Wen Lei
Nan Duan
Tianrui Li
CLIP
VLM
298
771
0
18 Apr 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
2,875
0
11 Feb 2021