2410.04415
Cited By
Geometric Analysis of Reasoning Trajectories: A Phase Space Approach to Understanding Valid and Invalid Multi-Hop Reasoning in LLMs
6 October 2024
Javier Marin
LRM
Papers citing "Geometric Analysis of Reasoning Trajectories: A Phase Space Approach to Understanding Valid and Invalid Multi-Hop Reasoning in LLMs"
30 / 30 papers shown
Title
VCM: Vision Concept Modeling Based on Implicit Contrastive Learning with Vision-Language Instruction Fine-Tuning
Run Luo
Renke Shan
Longze Chen
Z. Liu
Lu Wang
Min Yang
Xiaobo Xia
MLLM
VLM
89
0
0
28 Apr 2025
Token Sequence Compression for Efficient Multimodal Computing
Yasmine Omri
Parth Shroff
Thierry Tambe
44
0
0
24 Apr 2025
DyMU: Dynamic Merging and Virtual Unmerging for Efficient VLMs
Z. Wang
Senthil Purushwalkam
Caiming Xiong
S.
Heng Ji
R. Xu
25
0
0
23 Apr 2025
LVLM_CSP: Accelerating Large Vision Language Models via Clustering, Scattering, and Pruning for Reasoning Segmentation
Hanning Chen
Yang Ni
Wenjun Huang
Hyunwoo Oh
Yezi Liu
Tamoghno Das
Mohsen Imani
VLM
LRM
31
0
0
15 Apr 2025
Window Token Concatenation for Efficient Visual Large Language Models
Yifan Li
Wentao Bao
Botao Ye
Zhen Tan
Tianlong Chen
Huan Liu
Yu Kong
VLM
29
0
0
05 Apr 2025
TimeSearch: Hierarchical Video Search with Spotlight and Reflection for Human-like Long Video Understanding
Junwen Pan
Rui Zhang
Xin Wan
Yuan Zhang
Ming Lu
Qi She
VLM
36
1
0
02 Apr 2025
QG-VTC: Question-Guided Visual Token Compression in MLLMs for Efficient VQA
Shuai Li
Jian Xu
Xiao-Hui Li
Chao Deng
Lin-Lin Huang
MQ
36
0
0
01 Apr 2025
On-device Sora: Enabling Training-Free Diffusion-based Text-to-Video Generation for Mobile Devices
Bosung Kim
Kyuhwan Lee
Isu Jeong
Jungmin Cheon
Yeojin Lee
Seulki Lee
VGen
37
1
0
31 Mar 2025
InternVL-X: Advancing and Accelerating InternVL Series with Efficient Visual Token Compression
Dongchen Lu
Yuyao Sun
Zilu Zhang
Leping Huang
Jianliang Zeng
Mao Shu
Huo Cao
34
0
0
27 Mar 2025
MoLe-VLA: Dynamic Layer-skipping Vision Language Action Model via Mixture-of-Layers for Efficient Robot Manipulation
Rongyu Zhang
Menghang Dong
Yuan Zhang
Liang Heng
Xiaowei Chi
Gaole Dai
Li Du
Dan Wang
Yuan Du
MoE
76
0
0
26 Mar 2025
Beyond Intermediate States: Explaining Visual Redundancy through Language
Dingchen Yang
Bowen Cao
Anran Zhang
Weibo Gu
Winston Hu
Guang Chen
VLM
68
0
0
26 Mar 2025
Scaling Vision Pre-Training to 4K Resolution
Baifeng Shi
Boyi Li
Han Cai
Y. Lu
Sifei Liu
...
Jan Kautz
Song Han
Trevor Darrell
Pavlo Molchanov
Hongxu Yin
CLIP
41
0
0
25 Mar 2025
Growing a Twig to Accelerate Large Vision-Language Models
Zhenwei Shao
Mingyang Wang
Zhou Yu
Wenwen Pan
Yan Yang
Tao Wei
H. Zhang
Ning Mao
Wei Chen
Jun Yu
VLM
53
1
0
18 Mar 2025
AdaReTaKe: Adaptive Redundancy Reduction to Perceive Longer for Video-language Understanding
Xiao Wang
Qingyi Si
Jianlong Wu
Shiyu Zhu
Li Cao
Liqiang Nie
VLM
73
2
0
16 Mar 2025
FastVID: Dynamic Density Pruning for Fast Video Large Language Models
Leqi Shen
Guoqiang Gong
Tao He
Yifeng Zhang
Pengzhang Liu
Sicheng Zhao
Guiguang Ding
VLM
63
0
0
14 Mar 2025
Similarity-Aware Token Pruning: Your VLM but Faster
Ahmadreza Jeddi
Negin Baghbanzadeh
Elham Dolatabadi
Babak Taati
3DV
VLM
50
1
0
14 Mar 2025
TokenCarve: Information-Preserving Visual Token Compression in Multimodal Large Language Models
Xudong Tan
Peng Ye
Chongjun Tu
Jianjian Cao
Yaoxin Yang
Lin Zhang
Dongzhan Zhou
Tao Chen
VLM
46
0
0
13 Mar 2025
VideoScan: Enabling Efficient Streaming Video Understanding via Frame-level Semantic Carriers
Ruanjun Li
Yuedong Tan
Yuanming Shi
Jiawei Shao
VLM
62
0
0
12 Mar 2025
When Large Vision-Language Model Meets Large Remote Sensing Imagery: Coarse-to-Fine Text-Guided Token Pruning
Junwei Luo
Yingying Zhang
X. J. Yang
Kang Wu
Qi Zhu
Lei Liang
Jingdong Chen
Yansheng Li
52
0
0
10 Mar 2025
Accelerating Vision-Language-Action Model Integrated with Action Chunking via Parallel Decoding
Wenxuan Song
Jiayi Chen
Pengxiang Ding
H. Zhao
Wei Zhao
Zhide Zhong
Zongyuan Ge
Jun Ma
Haoang Li
35
2
0
04 Mar 2025
FCoT-VL: Advancing Text-oriented Large Vision-Language Models with Efficient Visual Token Compression
Jianjian Li
Junquan Fan
Feng Tang
Gang Huang
Shitao Zhu
Songlin Liu
Nian Xie
Wulong Liu
Yong Liao
VLM
41
0
0
22 Feb 2025
Token Pruning in Multimodal Large Language Models: Are We Solving the Right Problem?
Zichen Wen
Yifeng Gao
Weijia Li
Conghui He
Linfeng Zhang
LRM
44
0
0
17 Feb 2025
FOLDER: Accelerating Multi-modal Large Language Models with Enhanced Performance
Haicheng Wang
Zhemeng Yu
Gabriele Spadaro
Chen Ju
Victor Quétu
Enzo Tartaglione
VLM
49
3
0
05 Jan 2025
ReTaKe: Reducing Temporal and Knowledge Redundancy for Long Video Understanding
Xiao Wang
Qingyi Si
Jianlong Wu
Shiyu Zhu
Li Cao
Liqiang Nie
VLM
54
6
0
29 Dec 2024
A Stitch in Time Saves Nine: Small VLM is a Precise Guidance for Accelerating Large VLMs
Wangbo Zhao
Yizeng Han
Jiasheng Tang
Z. Li
Yibing Song
K. Wang
Zhangyang Wang
Yang You
66
0
0
04 Dec 2024
AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning
Yiwu Zhong
Zhuoming Liu
Yin Li
Liwei Wang
73
1
0
04 Dec 2024
Beyond Text-Visual Attention: Exploiting Visual Cues for Effective Token Pruning in VLMs
Qizhe Zhang
Aosong Cheng
Ming Lu
Zhiyong Zhuo
Minqi Wang
Jiajun Cao
Shaobo Guo
Qi She
Shanghang Zhang
VLM
73
11
0
02 Dec 2024
ATP-LLaVA: Adaptive Token Pruning for Large Vision Language Models
Xubing Ye
Yukang Gan
Yixiao Ge
Xiao Zhang
Yansong Tang
90
5
0
30 Nov 2024
PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction
Long Xing
Qidong Huang
Xiaoyi Dong
Jiajie Lu
Pan Zhang
...
Yuhang Cao
Conghui He
Jiaqi Wang
Feng Wu
Dahua Lin
VLM
33
25
0
22 Oct 2024
Unveiling the Tapestry of Consistency in Large Vision-Language Models
Yuan Zhang
Fei Xiao
Tao Huang
Chun-Kai Fan
Hongyuan Dong
Jiawen Li
Jiacong Wang
Kuan Cheng
Shanghang Zhang
Haoyuan Guo
35
0
0
23 May 2024