Gated Slot Attention for Efficient Linear-Time Sequence Modeling (arXiv:2409.07146)
11 September 2024
Yu Zhang, Songlin Yang, Ruijie Zhu, Yue Zhang, Leyang Cui, Yiqiao Wang, B. Wang, Freda Shi, Bailin Wang, Wei Bi, P. Zhou, Guohong Fu
Papers citing "Gated Slot Attention for Efficient Linear-Time Sequence Modeling" (8 papers)
Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism
Aviv Bick, Eric P. Xing, Albert Gu
RALM
22 Apr 2025
Linear-MoE: Linear Sequence Modeling Meets Mixture-of-Experts
Weigao Sun, Disen Lan, Tong Zhu, Xiaoye Qu, Yu-Xi Cheng
MoE
07 Mar 2025
Liger: Linearizing Large Language Models to Gated Recurrent Structures
Disen Lan, Weigao Sun, Jiaxi Hu, Jusen Du, Yu-Xi Cheng
03 Mar 2025
MoM: Linear Sequence Modeling with Mixture-of-Memories
Jusen Du, Weigao Sun, Disen Lan, Jiaxi Hu, Yu-Xi Cheng
KELM
19 Feb 2025
Scaling Diffusion Language Models via Adaptation from Autoregressive Models
Shansan Gong, Shivam Agarwal, Yizhe Zhang, Jiacheng Ye, Lin Zheng, ..., Peilin Zhao, W. Bi, Jiawei Han, Hao Peng, Lingpeng Kong
AI4CE
23 Oct 2024
Repeat After Me: Transformers are Better than State Space Models at Copying
Samy Jelassi, David Brandfonbrener, Sham Kakade, Eran Malach
01 Feb 2024
Fine-Tuning Pre-trained Transformers into Decaying Fast Weights
H. H. Mao
09 Oct 2022
Transformer Quality in Linear Time
Weizhe Hua, Zihang Dai, Hanxiao Liu, Quoc V. Le
21 Feb 2022