2410.02740
Cited By
Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models
3 October 2024
Zhengfeng Lai
Vasileios Saveris
Chen Chen
Hong-You Chen
Haotian Zhang
Bowen Zhang
Juan Lao Tebar
Wenze Hu
Zhe Gan
Peter Grasch
Meng Cao
Yinfei Yang
VLM
Papers citing "Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models"
1 / 1 papers shown
MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning
Haotian Zhang
Mingfei Gao
Zhe Gan
Philipp Dufter
Nina Wenzel
...
Haoxuan You
Zirui Wang
Afshin Dehghan
Peter Grasch
Yinfei Yang
VLM
MLLM
36
32
1
30 Sep 2024