2412.00927
Cited By
VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation
1 December 2024
Weiming Ren
Huan Yang
Jie Min
Cong Wei
Lei Ma
Papers citing "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation"
3 / 3 papers shown
VideoEval-Pro: Robust and Realistic Long Video Understanding Evaluation
Wentao Ma
Weiming Ren
Yiming Jia
Zhuofeng Li
Ping Nie
Ge Zhang
Wenhu Chen
129
3
0
20 May 2025
SmolVLM: Redefining small and efficient multimodal models
Andres Marafioti
Orr Zohar
Miquel Farré
Merve Noyan
Elie Bakouch
...
Hugo Larcher
Mathieu Morlon
Lewis Tunstall
Leandro von Werra
Thomas Wolf
VLM
136
39
0
07 Apr 2025
Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers
Weiming Ren
Wentao Ma
Huan Yang
Cong Wei
Ge Zhang
Lei Ma
Mamba
146
7
0
14 Mar 2025