LoLA: Low-Rank Linear Attention With Sparse Caching
arXiv: 2505.23666 · 29 May 2025
Luke McDermott, Robert W. Heath Jr., Rahul Parhi
Papers citing "LoLA: Low-Rank Linear Attention With Sparse Caching" (6 papers)
Native Hybrid Attention for Efficient Sequence Modeling
Jusen Du, Jiaxi Hu, Tao Zhang, Weigao Sun, Yu Cheng
08 Oct 2025
Mitigating Diffusion Model Hallucinations with Dynamic Guidance
Kostas Triaridis, Alexandros Graikos, Aggelina Chatziagapi, Grigorios G. Chrysos, Dimitris Samaras
06 Oct 2025
The Sparse Frontier: Sparse Attention Trade-offs in Transformer LLMs
Piotr Nawrot, Robert Li, Renjie Huang, Sebastian Ruder, Kelly Marchisio, Edoardo Ponti
24 Apr 2025
Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models
Neural Information Processing Systems (NeurIPS), 2024
Aviv Bick, Kevin Y. Li, Eric P. Xing, J. Zico Kolter, Albert Gu
19 Aug 2024
Learning to (Learn at Test Time): RNNs with Expressive Hidden States
Yu Sun, Xinhao Li, Karan Dalal, Jiarui Xu, Arjun Vikram, ..., Xinlei Chen, Xiaolong Wang, Sanmi Koyejo, Tatsunori Hashimoto, Carlos Guestrin
05 Jul 2024
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
Liliang Ren, Yang Liu, Yadong Lu, Haoran Pan, Chen Liang, Weizhu Chen
11 Jun 2024