Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2303.14526
Cited By
Selective Structured State-Spaces for Long-Form Video Understanding
25 March 2023
Jue Wang
Wenjie Zhu
Pichao Wang
Xiang Yu
Linda Liu
Mohamed Omar
Raffay Hamid
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Selective Structured State-Spaces for Long-Form Video Understanding"
19 / 19 papers shown
Title
HierarQ: Task-Aware Hierarchical Q-Former for Enhanced Video Understanding
Shehreen Azad
Vibhav Vineet
Y. S. Rawat
VLM
84
1
0
11 Mar 2025
A Reverse Mamba Attention Network for Pathological Liver Segmentation
Jun Zeng
Debesh Jha
Ertugrul Aktas
Elif Keles
A. Medetalibeyoğlu
...
Robert Lewandowski
Daniela Ladner
Amir Borhani
Gorkem Durak
Ulas Bagci
Mamba
49
1
0
23 Feb 2025
GLAM: Global-Local Variation Awareness in Mamba-based World Model
Qian He
Wenqi Liang
Chunhui Hao
Gan Sun
Jiandong Tian
41
0
0
21 Jan 2025
MambaHSI: Spatial-Spectral Mamba for Hyperspectral Image Classification
Yapeng Li
Yong Luo
L. Zhang
Zengmao Wang
Bo Du
Mamba
58
57
0
10 Jan 2025
VMamba: Visual State Space Model
Yue Liu
Yunjie Tian
Yuzhong Zhao
Hongtian Yu
Lingxi Xie
Yaowei Wang
Qixiang Ye
Jianbin Jiao
Yunfan Liu
Mamba
106
607
0
31 Dec 2024
FACTS: A Factored State-Space Framework For World Modelling
Li Nanbo
Firas Laakom
Yucheng Xu
Wenyi Wang
Jürgen Schmidhuber
AI4TS
77
0
0
28 Oct 2024
HASN: Hybrid Attention Separable Network for Efficient Image Super-resolution
Weifeng Cao
Xiaoyan Lei
Jun Shi
Wanyong Liang
Jie Liu
Zongfei Bai
SupR
24
0
0
13 Oct 2024
Enhancing Long Video Understanding via Hierarchical Event-Based Memory
Dingxin Cheng
Mingda Li
Jingyu Liu
Yongxin Guo
Bin Jiang
Qingbin Liu
Xi Chen
Bo Zhao
22
4
0
10 Sep 2024
Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models
Aviv Bick
Kevin Y. Li
Eric P. Xing
J. Zico Kolter
Albert Gu
Mamba
43
24
0
19 Aug 2024
Learning Video Context as Interleaved Multimodal Sequences
S. Shao
Pengchuan Zhang
Y. Li
Xide Xia
A. Meso
Ziteng Gao
Jinheng Xie
N. Holliman
Mike Zheng Shou
41
5
0
31 Jul 2024
MambaLRP: Explaining Selective State Space Sequence Models
F. Jafari
G. Montavon
Klaus-Robert Müller
Oliver Eberle
Mamba
54
9
0
11 Jun 2024
LCM: Locally Constrained Compact Point Cloud Model for Masked Point Modeling
Yaohua Zha
Naiqi Li
Yanzi Wang
Tao Dai
Hang Guo
Bin Chen
Zhi Wang
Zhihao Ouyang
Shu-Tao Xia
Mamba
42
8
0
27 May 2024
Koala: Key frame-conditioned long video-LLM
Reuben Tan
Ximeng Sun
Ping Hu
Jui-hsien Wang
Hanieh Deilamsalehy
Bryan A. Plummer
Bryan C. Russell
Kate Saenko
38
35
0
05 Apr 2024
Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures
Yuchen Duan
Weiyun Wang
Zhe Chen
Xizhou Zhu
Lewei Lu
Tong Lu
Yu Qiao
Hongsheng Li
Jifeng Dai
Wenhai Wang
ViT
38
44
0
04 Mar 2024
Focus Your Attention (with Adaptive IIR Filters)
Shahar Lutati
Itamar Zimerman
Lior Wolf
27
9
0
24 May 2023
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,412
0
11 Nov 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
278
1,978
0
09 Feb 2021
Video Transformer Network
Daniel Neimark
Omri Bar
Maya Zohar
Dotan Asselmann
ViT
193
419
0
01 Feb 2021
AdaFrame: Adaptive Frame Selection for Fast Video Recognition
Zuxuan Wu
Caiming Xiong
Chih-Yao Ma
R. Socher
L. Davis
116
194
0
29 Nov 2018
1