2212.08136
Efficient Long Sequence Modeling via State Space Augmented Transformer
15 December 2022
Simiao Zuo
Xiaodong Liu
Jian Jiao
Denis Xavier Charles
Eren Manavoglu
Tuo Zhao
Jianfeng Gao
Papers citing
"Efficient Long Sequence Modeling via State Space Augmented Transformer"
36 / 36 papers shown
Title
Mixture of Sparse Attention: Content-Based Learnable Sparse Attention via Expert-Choice Routing
Piotr Piekos
Róbert Csordás
Jürgen Schmidhuber
MoE
VLM
94
1
0
01 May 2025
From S4 to Mamba: A Comprehensive Survey on Structured State Space Models
Shriyank Somvanshi
Md Monzurul Islam
Mahmuda Sultana Mimi
Sazzad Bin Bashar Polock
Gaurab Chhetri
Subasish Das
Mamba
AI4TS
40
0
0
22 Mar 2025
A Little Goes a Long Way: Efficient Long Context Training and Inference with Partial Contexts
Suyu Ge
Xihui Lin
Yunan Zhang
Jiawei Han
Hao Peng
31
4
0
02 Oct 2024
Reparameterized Multi-Resolution Convolutions for Long Sequence Modelling
Harry Jake Cunningham
Giorgio Giannone
Mingtian Zhang
M. Deisenroth
28
0
0
18 Aug 2024
Venturing into Uncharted Waters: The Navigation Compass from Transformer to Mamba
Yuchen Zou
Yineng Chen
Zuchao Li
Lefei Zhang
Hai Zhao
45
1
0
24 Jun 2024
Slot State Space Models
Jindong Jiang
Fei Deng
Gautam Singh
Minseung Lee
Sungjin Ahn
39
4
0
18 Jun 2024
Short-Long Convolutions Help Hardware-Efficient Linear Attention to Focus on Long Sequences
Zicheng Liu
Siyuan Li
Li Wang
Zedong Wang
Yunfan Liu
Stan Z. Li
33
7
0
12 Jun 2024
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
Liliang Ren
Yang Liu
Yadong Lu
Yelong Shen
Chen Liang
Weizhu Chen
Mamba
64
55
0
11 Jun 2024
A Survey of Transformer Enabled Time Series Synthesis
Alexander Sommers
Logan Cummins
Sudip Mittal
Shahram Rahimi
Maria Seale
Joseph Jaboure
Thomas Arnold
AI4TS
33
2
0
04 Jun 2024
Dimba: Transformer-Mamba Diffusion Models
Zhengcong Fei
Mingyuan Fan
Changqian Yu
Debang Li
Youqiang Zhang
Junshi Huang
Mamba
54
16
0
03 Jun 2024
Encoding and Controlling Global Semantics for Long-form Video Question Answering
Thong Nguyen
Zhiyuan Hu
Xiaobao Wu
Cong-Duy Nguyen
See-Kiong Ng
A. Luu
35
2
0
30 May 2024
SMR: State Memory Replay for Long Sequence Modeling
Biqing Qi
Junqi Gao
Kaiyan Zhang
Dong Li
Jianxing Liu
Ligang Wu
Bowen Zhou
21
5
0
27 May 2024
MambaLLIE: Implicit Retinex-Aware Low Light Enhancement with Global-then-Local State Space
Jiangwei Weng
Zhiqiang Yan
Ying Tai
J. Qian
Jian Yang
Jun Li
Mamba
24
10
0
25 May 2024
Matten: Video Generation with Mamba-Attention
Yu Gao
Jiancheng Huang
Xiaopeng Sun
Zequn Jie
Yujie Zhong
Lin Ma
64
12
0
05 May 2024
Mamba-360: Survey of State Space Models as Transformer Alternative for Long Sequence Modelling: Methods, Applications, and Challenges
Badri N. Patro
Vijay Srinivas Agneeswaran
Mamba
30
38
0
24 Apr 2024
LongVQ: Long Sequence Modeling with Vector Quantization on Structured Memory
Zicheng Liu
Li Wang
Siyuan Li
Zedong Wang
Haitao Lin
Stan Z. Li
VLM
27
4
0
17 Apr 2024
State Space Model for New-Generation Network Alternative to Transformers: A Survey
Xiao Wang
Shiao Wang
Yuhe Ding
Yuehang Li
Wentao Wu
...
Bowei Jiang
Chenglong Li
Yaowei Wang
Yonghong Tian
Jin Tang
Mamba
33
49
0
15 Apr 2024
Jamba: A Hybrid Transformer-Mamba Language Model
Opher Lieber
Barak Lenz
Hofit Bata
Gal Cohen
Jhonathan Osin
...
Nir Ratner
N. Rozen
Erez Shwartz
Mor Zusman
Y. Shoham
21
207
0
28 Mar 2024
Mastering Memory Tasks with World Models
Mohammad Reza Samsami
Artem Zholus
Janarthanan Rajendran
Sarath Chandar
CLL
OffRL
27
21
0
07 Mar 2024
Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks
Jongho Park
Jaeseung Park
Zheyang Xiong
Nayoung Lee
Jaewoong Cho
Samet Oymak
Kangwook Lee
Dimitris Papailiopoulos
19
69
0
06 Feb 2024
LOCOST: State-Space Models for Long Document Abstractive Summarization
Florian Le Bronnec
Song Duong
Mathieu Ravaut
Alexandre Allauzen
Nancy F. Chen
Vincent Guigue
Alberto Lumbreras
Laure Soulier
Patrick Gallinari
40
7
0
31 Jan 2024
Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey
Yunpeng Huang
Jingwei Xu
Junyu Lai
Zixu Jiang
Taolue Chen
...
Xiaoxing Ma
Lijuan Yang
Zhou Xin
Shupeng Li
Penghao Zhao
LLMAG
KELM
28
54
0
21 Nov 2023
Fovea Transformer: Efficient Long-Context Modeling with Structured Fine-to-Coarse Attention
Ziwei He
Jian Yuan
Le Zhou
Jingwen Leng
Bo Jiang
17
0
0
13 Nov 2023
Efficient Long-Range Transformers: You Need to Attend More, but Not Necessarily at Every Layer
Qingru Zhang
Dhananjay Ram
Cole Hawkins
Sheng Zha
Tuo Zhao
27
15
0
19 Oct 2023
Hieros: Hierarchical Imagination on Structured State Space Sequence World Models
Paul Mattes
Rainer Schlosser
R. Herbrich
16
4
0
08 Oct 2023
Never Train from Scratch: Fair Comparison of Long-Sequence Models Requires Data-Driven Priors
Ido Amos
Jonathan Berant
Ankit Gupta
20
24
0
04 Oct 2023
Attention Is Not All You Need Anymore
Zhe Chen
19
3
0
15 Aug 2023
Facing Off World Model Backbones: RNNs, Transformers, and S4
Fei Deng
Junyeong Park
Sungjin Ahn
25
24
0
05 Jul 2023
2-D SSM: A General Spatial Layer for Visual Transformers
Ethan Baron
Itamar Zimerman
Lior Wolf
23
14
0
11 Jun 2023
DiffECG: A Versatile Probabilistic Diffusion Model for ECG Signals Synthesis
Nour Neifar
A. Ben-Hamadou
Afef Mdhaffar
M. Jmaiel
DiffM
20
4
0
02 Jun 2023
Focus Your Attention (with Adaptive IIR Filters)
Shahar Lutati
Itamar Zimerman
Lior Wolf
27
9
0
24 May 2023
A Survey on Long Text Modeling with Transformers
Zican Dong
Tianyi Tang
Lunyi Li
Wayne Xin Zhao
VLM
19
52
0
28 Feb 2023
Transformer Quality in Linear Time
Weizhe Hua
Zihang Dai
Hanxiao Liu
Quoc V. Le
71
220
0
21 Feb 2022
Big Bird: Transformers for Longer Sequences
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
249
2,009
0
28 Jul 2020
Efficient Content-Based Sparse Attention with Routing Transformers
Aurko Roy
M. Saffar
Ashish Vaswani
David Grangier
MoE
238
578
0
12 Mar 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,943
0
20 Apr 2018