Dynamic Self-adaptive Multiscale Distillation from Pre-trained Multimodal Large Model for Efficient Cross-modal Representation Learning

16 April 2024

Papers citing "Dynamic Self-adaptive Multiscale Distillation from Pre-trained Multimodal Large Model for Efficient Cross-modal Representation Learning"

6 / 6 papers shown

Title
UP-Person: Unified Parameter-Efficient Transfer Learning for Text-based Person Retrieval Yating Liu Yaowei Li Xiangyuan Lan Wenming Yang Zimo Liu Q. Liao 28 0 0 14 Apr 2025
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation Junnan Li Dongxu Li Caiming Xiong S. Hoi MLLM BDL VLM CLIP 390 4,125 0 28 Jan 2022
With a Little Help from My Friends: Nearest-Neighbor Contrastive Learning of Visual Representations Debidatta Dwibedi Y. Aytar Jonathan Tompson P. Sermanet Andrew Zisserman SSL 185 452 0 29 Apr 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision Chao Jia Yinfei Yang Ye Xia Yi-Ting Chen Zarana Parekh Hieu H. Pham Quoc V. Le Yun-hsuan Sung Zhen Li Tom Duerig VLM CLIP 295 3,693 0 11 Feb 2021
Similarity Reasoning and Filtration for Image-Text Matching Haiwen Diao Ying Zhang Lingyun Ma Huchuan Lu 206 331 0 05 Jan 2021
ImageNet Large Scale Visual Recognition Challenge Olga Russakovsky Jia Deng Hao Su J. Krause S. Satheesh ... A. Karpathy A. Khosla Michael S. Bernstein Alexander C. Berg Li Fei-Fei VLM ObjD 296 39,194 0 01 Sep 2014