ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2210.06583
  4. Cited By
S4ND: Modeling Images and Videos as Multidimensional Signals Using State
  Spaces

S4ND: Modeling Images and Videos as Multidimensional Signals Using State Spaces

12 October 2022
Eric N. D. Nguyen
Karan Goel
Albert Gu
Gordon W. Downs
Preey Shah
Tri Dao
S. Baccus
Christopher Ré
    VLM
ArXivPDFHTML

Papers citing "S4ND: Modeling Images and Videos as Multidimensional Signals Using State Spaces"

7 / 7 papers shown
Title
Vision-LSTM: xLSTM as Generic Vision Backbone
Vision-LSTM: xLSTM as Generic Vision Backbone
Benedikt Alkin
M. Beck
Korbinian Poppel
Sepp Hochreiter
Johannes Brandstetter
VLM
56
36
0
24 Feb 2025
GG-SSMs: Graph-Generating State Space Models
GG-SSMs: Graph-Generating State Space Models
Nikola Zubić
Davide Scaramuzza
Mamba
86
1
0
17 Dec 2024
Understanding the differences in Foundation Models: Attention, State
  Space Models, and Recurrent Neural Networks
Understanding the differences in Foundation Models: Attention, State Space Models, and Recurrent Neural Networks
Jerome Sieber
Carmen Amo Alonso
A. Didier
M. Zeilinger
Antonio Orvieto
AAML
42
7
0
24 May 2024
How to Train Your HiPPO: State Space Models with Generalized Orthogonal
  Basis Projections
How to Train Your HiPPO: State Space Models with Generalized Orthogonal Basis Projections
Albert Gu
Isys Johnson
Aman Timalsina
Atri Rudra
Christopher Ré
Mamba
93
88
0
24 Jun 2022
FlexConv: Continuous Kernel Convolutions with Differentiable Kernel
  Sizes
FlexConv: Continuous Kernel Convolutions with Differentiable Kernel Sizes
David W. Romero
Robert-Jan Bruintjes
Jakub M. Tomczak
Erik J. Bekkers
Mark Hoogendoorn
J. C. V. Gemert
74
81
0
15 Oct 2021
VATT: Transformers for Multimodal Self-Supervised Learning from Raw
  Video, Audio and Text
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari
Liangzhe Yuan
Rui Qian
Wei-Hong Chuang
Shih-Fu Chang
Yin Cui
Boqing Gong
ViT
231
573
0
22 Apr 2021
ImageNet Large Scale Visual Recognition Challenge
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
279
39,083
0
01 Sep 2014
1