2403.13600
VL-Mamba: Exploring State Space Models for Multimodal Learning
20 March 2024
Yanyuan Qiao
Zheng Yu
Longteng Guo
Sihan Chen
Zijia Zhao
Mingzhen Sun
Qi Wu
Jing Liu
Mamba
Papers citing "VL-Mamba: Exploring State Space Models for Multimodal Learning"
19 / 19 papers shown
From S4 to Mamba: A Comprehensive Survey on Structured State Space Models
Shriyank Somvanshi
Md Monzurul Islam
Mahmuda Sultana Mimi
Sazzad Bin Bashar Polock
Gaurab Chhetri
Subasish Das
Mamba
AI4TS
37
0
0
22 Mar 2025
BIMBA: Selective-Scan Compression for Long-Range Video Question Answering
Md. Mohaiminul Islam
Tushar Nagarajan
Huiyu Wang
Gedas Bertasius
Lorenzo Torresani
55
0
0
12 Mar 2025
DivPrune: Diversity-based Visual Token Pruning for Large Multimodal Models
Saeed Ranjbar Alvar
Gursimran Singh
Mohammad Akbari
Yong Zhang
VLM
68
0
0
04 Mar 2025
Linear Attention Modeling for Learned Image Compression
Donghui Feng
Zhengxue Cheng
Shen Wang
Ronghua Wu
Hongwei Hu
Guo Lu
Li-Na Song
60
1
0
09 Feb 2025
Spatial-Mamba: Effective Visual State Space Models via Structure-aware State Fusion
Chaodong Xiao
Minghan Li
Zhengqiang Zhang
Deyu Meng
Lei Zhang
Mamba
45
4
0
19 Oct 2024
UmambaTSF: A U-shaped Multi-Scale Long-Term Time Series Forecasting Method Using Mamba
Li Wu
Wenbin Pei
Jiulong Jiao
Qiang Zhang
Mamba
AI4TS
12
2
0
15 Oct 2024
QMambaBSR: Burst Image Super-Resolution with Query State Space Model
Xin Di
Long Peng
Peizhe Xia
Wenbo Li
Renjing Pei
Yang Cao
Yang Wang
Zheng-Jun Zha
44
6
0
16 Aug 2024
Decision Mamba: A Multi-Grained State Space Model with Self-Evolution Regularization for Offline RL
Qi Lv
Xiang Deng
Gongwei Chen
Michael Yu Wang
Liqiang Nie
55
6
0
08 Jun 2024
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models
Byung-Kwan Lee
Chae Won Kim
Beomchan Park
Yonghyun Ro
MLLM
LRM
22
17
0
24 May 2024
MambaAD: Exploring State Space Models for Multi-class Unsupervised Anomaly Detection
Haoyang He
Yuhu Bai
Jiangning Zhang
Qingdong He
Hongxu Chen
Zhenye Gan
Chengjie Wang
Xiangtai Li
Guanzhong Tian
Lei Xie
Mamba
47
32
0
09 Apr 2024
VM-UNet: Vision Mamba UNet for Medical Image Segmentation
Jiacheng Ruan
Suncheng Xiang
Mamba
61
241
0
04 Feb 2024
U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation
Jun Ma
Feifei Li
Bo Wang
Mamba
74
314
0
09 Jan 2024
LLaVA-Phi: Efficient Multi-Modal Assistant with Small Language Model
Yichen Zhu
Minjie Zhu
Ning Liu
Zhicai Ou
Xiaofeng Mou
Jian Tang
63
89
0
04 Jan 2024
MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning
Jun Chen
Deyao Zhu
Xiaoqian Shen
Xiang Li
Zechun Liu
Pengchuan Zhang
Raghuraman Krishnamoorthi
Vikas Chandra
Yunyang Xiong
Mohamed Elhoseiny
MLLM
152
280
0
14 Oct 2023
mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality
Qinghao Ye
Haiyang Xu
Guohai Xu
Jiabo Ye
Ming Yan
...
Junfeng Tian
Qiang Qi
Ji Zhang
Feiyan Huang
Jingren Zhou
VLM
MLLM
198
883
0
27 Apr 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
A. Kalyan
ELM
ReLM
LRM
198
1,089
0
20 Sep 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
380
4,010
0
28 Jan 2022
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
236
1,508
0
31 Dec 2020