HoliTom: Holistic Token Merging for Fast Video Large Language Models

27 May 2025
Kele Shao
Keda Tao
Can Qin
Haoxuan You
Yang Sui
Huan Wang
    VLM
ArXiv (abs) | PDF | HTML | HuggingFace (19 upvotes) | GitHub (48★)

Papers citing "HoliTom: Holistic Token Merging for Fast Video Large Language Models"

19 / 19 papers shown
UniComp: Rethinking Video Compression Through Informational Uniqueness
Chao Yuan
Shimin Chen
Minliang Lin
Limeng Qiao
Guanglu Wan
Lin Ma
168
0
0
03 Dec 2025
Accelerating Streaming Video Large Language Models via Hierarchical Token Compression
Yiyu Wang
Xuyang Liu
Xiyan Gui
Xinying Lin
B. Yang
Chenfei Liao
Tailai Chen
Linfeng Zhang
63
0
0
30 Nov 2025
Unboxing the Black Box: Mechanistic Interpretability for Algorithmic Understanding of Neural Networks
Bianka Kowalska
Halina Kwaśnicka
179
0
0
24 Nov 2025
OmniZip: Audio-Guided Dynamic Token Compression for Fast Omnimodal Large Language Models
Keda Tao
Kele Shao
Bohan Yu
Weiqiang Wang
Jian Liu
Huan Wang
VLM
255
2
0
18 Nov 2025
StreamingTOM: Streaming Token Compression for Efficient Video Understanding
Xueyi Chen
Keda Tao
Kele Shao
Huan Wang
199
3
0
21 Oct 2025
MANI-Pure: Magnitude-Adaptive Noise Injection for Adversarial Purification
Xiaoyi Huang
Junwei Wu
Kejia Zhang
Carl Yang
Zhiming Luo
AAML
189
0
0
29 Sep 2025
Revisiting MLLM Token Technology through the Lens of Classical Visual Coding
Jinming Liu
Junyan Lin
Yuntao Wei
Kele Shao
Keda Tao
Jianguo Huang
Xudong Yang
Zhibo Chen
Huan Wang
Xin Jin
MLLM
141
3
0
19 Aug 2025
TARS: MinMax Token-Adaptive Preference Strategy for MLLM Hallucination Reduction
Kejia Zhang
Keda Tao
Zhiming Luo
Chang Liu
Jiasheng Tang
Huan Wang
LRM
287
0
0
29 Jul 2025
When Tokens Talk Too Much: A Survey of Multimodal Long-Context Token Compression across Images, Videos, and Audios
Kele Shao
Keda Tao
Kejia Zhang
Sicheng Feng
Mu Cai
Yuzhang Shang
Haoxuan You
Can Qin
Yang Sui
Huan Wang
521
12
0
27 Jul 2025
DyMU: Dynamic Merging and Virtual Unmerging for Efficient VLMs
Zehao Wang
Senthil Purushwalkam
Caiming Xiong
Siyang Song
Chenhui Xu
Ran Xu
382
5
0
23 Apr 2025
TopV: Compatible Token Pruning with Inference Time Optimization for Fast and Low-Memory Multimodal Vision Language Model
Computer Vision and Pattern Recognition (CVPR), 2025
Cheng Yang
Yang Sui
Jinqi Xiao
Lingyi Huang
Yu Gong
...
Jinghua Yan
Y. Bai
P. Sadayappan
Helen Zhou
Bo Yuan
VLM
468
18
0
24 Mar 2025
Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models
Keda Tao
Haoxuan You
Yang Sui
Can Qin
Haoyu Wang
VLM, MQ
380
8
0
20 Mar 2025
FastVID: Dynamic Density Pruning for Fast Video Large Language Models
Leqi Shen
Guoqiang Gong
Tao He
Yifeng Zhang
Pengzhang Liu
Sicheng Zhao
Guiguang Ding
VLM
410
14
0
14 Mar 2025
Qwen2.5-VL Technical Report
S. Bai
Keqin Chen
Xuejing Liu
Jialin Wang
Wenbin Ge
...
Zesen Cheng
Hang Zhang
Zhibo Yang
Haiyang Xu
Junyang Lin
VLM
720
2,913
0
20 Feb 2025
Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction
Computer Vision and Pattern Recognition (CVPR), 2025
Rui Qian
Shuangrui Ding
Xiaoyi Dong
Pan Zhang
Yuhang Zang
Yuhang Cao
Dahua Lin
Jiaqi Wang
258
31
0
06 Jan 2025
FOLDER: Accelerating Multi-modal Large Language Models with Enhanced Performance
Haicheng Wang
Zhemeng Yu
Gabriele Spadaro
Chen Ju
Victor Quétu
Enzo Tartaglione
VLM
970
15
0
05 Jan 2025
PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction
Long Xing
Qidong Huang
Xiaoyi Dong
Jiajie Lu
Pan Zhang
...
Yuhang Cao
Bin Wang
Jiaqi Wang
Feng Wu
Dahua Lin
VLM
337
133
0
22 Oct 2024
TempMe: Video Temporal Token Merging for Efficient Text-Video Retrieval
International Conference on Learning Representations (ICLR), 2024
Leqi Shen
Tianxiang Hao
Tao He
Sicheng Zhao
Pengzhang Liu
Yongjun Bao
Guiguang Ding
451
32
0
02 Sep 2024
Towards Adversarial Robustness via Debiased High-Confidence Logit Alignment
Kejia Zhang
Juanjuan Weng
Shaozi Li
AAML
366
1
0
12 Aug 2024