arXiv: 2405.05803
Boosting Multimodal Large Language Models with Visual Tokens Withdrawal for Rapid Inference
28 January 2025
Zhihang Lin
Mingbao Lin
Luxi Lin
Rongrong Ji
Papers citing
"Boosting Multimodal Large Language Models with Visual Tokens Withdrawal for Rapid Inference"
14 / 14 papers shown
Title
PACT: Pruning and Clustering-Based Token Reduction for Faster Visual Language Models
M. Dhouib
Davide Buscaldi
Sonia Vanier
A. Shabou
VLM
34
0
0
11 Apr 2025
TopV: Compatible Token Pruning with Inference Time Optimization for Fast and Low-Memory Multimodal Vision Language Model
Cheng Yang
Yang Sui
Jinqi Xiao
Lingyi Huang
Yu Gong
...
Jinghua Yan
Y. Bai
P. Sadayappan
Xia Hu
Bo Yuan
VLM
49
0
0
24 Mar 2025
Lifting the Veil on Visual Information Flow in MLLMs: Unlocking Pathways to Faster Inference
Hao Yin
Guangzong Si
Zilei Wang
43
0
0
17 Mar 2025
Multi-Cue Adaptive Visual Token Pruning for Large Vision-Language Models
Bozhi Luan
Wengang Zhou
Hao Feng
Zhe Wang
Xiaosong Li
H. Li
VLM
61
0
0
11 Mar 2025
DivPrune: Diversity-based Visual Token Pruning for Large Multimodal Models
Saeed Ranjbar Alvar
Gursimran Singh
Mohammad Akbari
Yong Zhang
VLM
68
0
0
04 Mar 2025
Beyond Token Compression: A Training-Free Reduction Framework for Efficient Visual Processing in MLLMs
Hongliang Li
Jiaxin Zhang
Wenhui Liao
Dezhi Peng
Kai Ding
Lianwen Jin
OffRL
MQ
68
0
0
31 Jan 2025
FocusLLaVA: A Coarse-to-Fine Approach for Efficient and Effective Visual Token Compression
Yuke Zhu
Chi Xie
Shuang Liang
Bo Zheng
Sheng Guo
59
1
0
21 Nov 2024
AVG-LLaVA: A Large Multimodal Model with Adaptive Visual Granularity
Zhibin Lan
Liqiang Niu
Fandong Meng
Wenbo Li
Jie Zhou
Jinsong Su
VLM
22
2
0
20 Sep 2024
LLAVADI: What Matters For Multimodal Large Language Models Distillation
Shilin Xu
Xiangtai Li
Haobo Yuan
Lu Qi
Yunhai Tong
Ming-Hsuan Yang
21
0
0
28 Jul 2024
DocKylin: A Large Multimodal Model for Visual Document Understanding with Efficient Visual Slimming
Jiaxin Zhang
Wentao Yang
Songxuan Lai
Zecheng Xie
Lianwen Jin
24
15
0
27 Jun 2024
Efficient Multimodal Large Language Models: A Survey
Yizhang Jin
Jian Li
Yexin Liu
Tianjun Gu
Kai Wu
...
Xin Tan
Zhenye Gan
Yabiao Wang
Chengjie Wang
Lizhuang Ma
LRM
39
44
0
17 May 2024
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
A. Kalyan
ELM
ReLM
LRM
198
1,089
0
20 Sep 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022