2308.16689
Cited By
ViLTA: Enhancing Vision-Language Pre-training through Textual Augmentation
31 August 2023
Weihan Wang
Z. Yang
Bin Xu
Juanzi Li
Yankui Sun
VLM
Papers citing "ViLTA: Enhancing Vision-Language Pre-training through Textual Augmentation"
4 / 4 papers shown
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
385
4,010
0
28 Jan 2022
KD-VLP: Improving End-to-End Vision-and-Language Pretraining with Object Knowledge Distillation
Yongfei Liu
Chenfei Wu
Shao-Yen Tseng
Vasudev Lal
Xuming He
Nan Duan
CLIP
VLM
44
28
0
22 Sep 2021
How Much Can CLIP Benefit Vision-and-Language Tasks?
Sheng Shen
Liunian Harold Li
Hao Tan
Mohit Bansal
Anna Rohrbach
Kai-Wei Chang
Z. Yao
Kurt Keutzer
CLIP
VLM
MLLM
185
403
0
13 Jul 2021
Unifying Vision-and-Language Tasks via Text Generation
Jaemin Cho
Jie Lei
Hao Tan
Mohit Bansal
MLLM
249
518
0
04 Feb 2021