Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2407.19409
Cited By
LLAVADI: What Matters For Multimodal Large Language Models Distillation
28 July 2024
Shilin Xu
Xiangtai Li
Haobo Yuan
Lu Qi
Yunhai Tong
Ming-Hsuan Yang
Re-assign community
ArXiv
PDF
HTML
Papers citing
"LLAVADI: What Matters For Multimodal Large Language Models Distillation"
8 / 8 papers shown
Title
DivPrune: Diversity-based Visual Token Pruning for Large Multimodal Models
Saeed Ranjbar Alvar
Gursimran Singh
Mohammad Akbari
Yong Zhang
VLM
68
0
0
04 Mar 2025
DistiLLM: Towards Streamlined Distillation for Large Language Models
Jongwoo Ko
Sungnyun Kim
Tianyi Chen
SeYoung Yun
41
25
0
06 Feb 2024
Generalizable Entity Grounding via Assistance of Large Language Model
Lu Qi
Yi-Wen Chen
Lehan Yang
Tiancheng Shen
Xiangtai Li
Weidong Guo
Yu-Syuan Xu
Ming-Hsuan Yang
VLM
48
9
0
04 Feb 2024
LLaVA-Phi: Efficient Multi-Modal Assistant with Small Language Model
Yichen Zhu
Minjie Zhu
Ning Liu
Zhicai Ou
Xiaofeng Mou
Jian Tang
63
89
0
04 Jan 2024
MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning
Jun Chen
Deyao Zhu
Xiaoqian Shen
Xiang Li
Zechun Liu
Pengchuan Zhang
Raghuraman Krishnamoorthi
Vikas Chandra
Yunyang Xiong
Mohamed Elhoseiny
MLLM
152
280
0
14 Oct 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
A. Kalyan
ELM
ReLM
LRM
198
1,089
0
20 Sep 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
380
4,010
0
28 Jan 2022
1