arXiv:2406.08024
Fewer Tokens and Fewer Videos: Extending Video Understanding Abilities in Large Vision-Language Models
12 June 2024
Shimin Chen
Yitian Yuan
Shaoxiang Chen
Zequn Jie
Lin Ma
VLM
Papers citing
"Fewer Tokens and Fewer Videos: Extending Video Understanding Abilities in Large Vision-Language Models"
6 / 6 papers shown
UniComp: Rethinking Video Compression Through Informational Uniqueness
Chao Yuan
Shimin Chen
Minliang Lin
Limeng Qiao
Guanglu Wan
Lin Ma
209
1
0
03 Dec 2025
Free-MoRef: Instantly Multiplexing Context Perception Capabilities of Video-MLLMs within Single Inference
Kuo Wang
Quanlong Zheng
Junlin Xie
Yanhao Zhang
Jinguo Luo
Haonan Lu
Fan Zhou
Guanbin Li
VLM
130
1
0
04 Aug 2025
Beyond Intermediate States: Explaining Visual Redundancy through Language
Dingchen Yang
Bowen Cao
Anran Zhang
Weibo Gu
Winston Hu
Guang Chen
VLM
310
4
0
26 Mar 2025
Beyond Training: Dynamic Token Merging for Zero-Shot Video Understanding
Yiming Zhang
Zhuokai Zhao
Zhaorun Chen
Zenghui Ding
Xianjun Yang
Yining Sun
1.1K
16
0
21 Nov 2024
Video Understanding with Large Language Models: A Survey
Yunlong Tang
Jing Bi
Siting Xu
Luchuan Song
Susan Liang
...
Feng Zheng
Jianguo Zhang
Jiebo Luo
Chenliang Xu
VLM
923
225
0
29 Dec 2023
Valley: Video Assistant with Large Language model Enhanced abilitY
Ruipu Luo
Ziwang Zhao
Min Yang
Junwei Dong
Da Li
Pengcheng Lu
Tao Wang
Linmei Hu
Ming-Hui Qiu
MLLM
729
265
0
12 Jun 2023