Memformer: A Memory-Augmented Transformer for Sequence Modeling
Qingyang Wu, Zhenzhong Lan, Kun Qian, Jing Gu, A. Geramifard, Zhou Yu
14 October 2020 (arXiv:2010.06891)
Papers citing "Memformer: A Memory-Augmented Transformer for Sequence Modeling" (41 papers shown):
Reactive Transformer (RxT) -- Stateful Real-Time Processing for Event-Driven Reactive Language Models. Adam Filipek. 03 Oct 2025.
Vision encoders should be image size agnostic and task driven. Nedyalko Prisadnikov, Danda Pani Paudel, Yuqian Fu, Luc Van Gool. 22 Aug 2025.
Memory-Augmented Transformers: A Systematic Review from Neuroscience Principles to Enhanced Model Architectures. Parsa Omidi, Xingshuai Huang, Axel Laborieux, Bahareh Nikpour, Tianyu Shi, A. Eshaghi. 14 Aug 2025.
Goal-Based Vision-Language Driving. Santosh Patapati, Trisanth Srinivasan. 30 Jul 2025.
Sequential-NIAH: A Needle-In-A-Haystack Benchmark for Extracting Sequential Needles from Long Contexts. Yifei Yu, Qian Zhang, Lingfeng Qiao, Di Yin, Fang Li, Jie Wang, Zheyu Chen, Suncong Zheng, Xiaolong Liang, Xingwu Sun. 07 Apr 2025.
LCIRC: A Recurrent Compression Approach for Efficient Long-form Context and Query Dependent Modeling in LLMs. Sumin An, Junyoung Sung, Wonpyo Park, Chanjun Park, Paul Hongsuck Seo. North American Chapter of the Association for Computational Linguistics (NAACL), 2025. 10 Feb 2025.
Episodic memory in AI agents poses risks that should be studied and mitigated. Chad DeChant. 20 Jan 2025.
Learning to Reason Iteratively and Parallelly for Complex Visual Reasoning Scenarios. Shantanu Jaiswal, Debaditya Roy, Basura Fernando, Cheston Tan. Neural Information Processing Systems (NeurIPS), 2024. 20 Nov 2024.
ACER: Automatic Language Model Context Extension via Retrieval. Luyu Gao, Yunyi Zhang, Jamie Callan. 11 Oct 2024.
Token Turing Machines are Efficient Vision Models. Purvish Jajal, Nick Eliopoulos, Benjamin Shiue-Hal Chou, George K. Thiravathukal, James C. Davis, Yung-Hsiang Lu. IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024. 11 Sep 2024.
You Only Use Reactive Attention Slice For Long Context Retrieval. Yun Joon Soh, Hanxian Huang, Yuandong Tian, Jishen Zhao. 03 Sep 2024.
MambaEVT: Event Stream based Visual Object Tracking using State Space Model. Xiao Wang, Chao Wang, Shiao Wang, Xixi Wang, Zhicheng Zhao, Lin Zhu, Bo Jiang. 20 Aug 2024.
Retrieval-Enhanced Machine Learning: Synthesis and Opportunities. To Eun Kim, Alireza Salemi, Andrew Drozdov, Fernando Diaz, Hamed Zamani. 17 Jul 2024.
BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack. Yuri Kuratov, Aydar Bulatov, Petr Anokhin, Ivan Rodkin, Dmitry Sorokin, Artyom Sorokin, Andrey Kravchenko. Neural Information Processing Systems (NeurIPS), 2024. 14 Jun 2024.
Multi-Modal Retrieval For Large Language Model Based Speech Recognition. J. Kolehmainen, Aditya Gourav, Prashanth Gurunath Shivakumar, Yile Gu, Ankur Gandhe, Ariya Rastrow, Grant P. Strimel, I. Bulyko. Annual Meeting of the Association for Computational Linguistics (ACL), 2024. 13 Jun 2024.
LeaPformer: Enabling Linear Transformers for Autoregressive and Simultaneous Tasks via Learned Proportions. Victor Agostinelli, Sanghyun Hong, Lizhong Chen. International Conference on Machine Learning (ICML), 2024. 18 May 2024.
The CAP Principle for LLM Serving: A Survey of Long-Context Large Language Model Serving. Pai Zeng, Zhenyu Ning, Jieru Zhao, Weihao Cui, Mengwei Xu, Liwei Guo, Xusheng Chen, Yizhou Shan. 18 May 2024.
MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory. Ali Modarressi, Abdullatif Köksal, Ayyoob Imani, Mohsen Fayyaz, Hinrich Schütze. 17 Apr 2024.
On Difficulties of Attention Factorization through Shared Memory. Uladzislau Yorsh, Martin Holevna, Ondrej Bojar, David Herel. 31 Mar 2024.
NovelQA: Benchmarking Question Answering on Documents Exceeding 200K Tokens. Cunxiang Wang, Ruoxi Ning, Boqi Pan, Tonghui Wu, Qipeng Guo, ..., Guangsheng Bao, Xiangkun Hu, Zheng Zhang, Qian Wang, Yue Zhang. 18 Mar 2024.
LLM Inference Unveiled: Survey and Roofline Model Insights. Zhihang Yuan, Yuzhang Shang, Yang Zhou, Zhen Dong, Zhe Zhou, ..., Yong Jae Lee, Yan Yan, Beidi Chen, Guangyu Sun, Kurt Keutzer. 26 Feb 2024.
In Search of Needles in a 11M Haystack: Recurrent Memory Finds What LLMs Miss. Yuri Kuratov, Aydar Bulatov, Petr Anokhin, Dmitry Sorokin, Artyom Sorokin, Andrey Kravchenko. 16 Feb 2024.
Sound Source Separation Using Latent Variational Block-Wise Disentanglement. Karim Helwani, M. Togami, Paris Smaragdis, Michael M. Goodwin. 08 Feb 2024.
MEMORYLLM: Towards Self-Updatable Large Language Models. Yu Wang, Yifan Gao, Xiusi Chen, Haoming Jiang, Shiyang Li, ..., Zheng Li, Xian Li, Bing Yin, Jingbo Shang, Julian McAuley. 07 Feb 2024.
Investigating Recurrent Transformers with Dynamic Halt. Jishnu Ray Chowdhury, Cornelia Caragea. 01 Feb 2024.
The What, Why, and How of Context Length Extension Techniques in Large Language Models -- A Detailed Survey. Saurav Pawar, S.M. Towhidul Islam Tonmoy, S. M. M. Zaman, Vinija Jain, Vasu Sharma, Amitava Das. 15 Jan 2024.
Attendre: Wait To Attend By Retrieval With Evicted Queries in Memory-Based Transformers for Long Context Processing. Zi Yang, Nan Hua. 10 Jan 2024.
Uncertainty Guided Global Memory Improves Multi-Hop Question Answering. Alsu Sagirova, Andrey Kravchenko. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023. 29 Nov 2023.
Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey. Yunpeng Huang, Jingwei Xu, Junyu Lai, Zixu Jiang, Taolue Chen, ..., Xiaoxing Ma, Lijuan Yang, Zhou Xin, Shupeng Li, Penghao Zhao. 21 Nov 2023.
From Interpolation to Extrapolation: Complete Length Generalization for Arithmetic Transformers. Shaoxiong Duan, Yining Shi, Wei Xu. 18 Oct 2023.
A Framework for Inference Inspired by Human Memory Mechanisms. Xiangyu Zeng, Jie Lin, Piao Hu, Ruizheng Huang, Zhicheng Zhang. International Conference on Learning Representations (ICLR), 2023. 01 Oct 2023.
Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models. Qingyue Wang, Y. Fu, Yanan Cao, Zhiliang Tian, Dacheng Tao. 29 Aug 2023.
A Memory-Augmented Multi-Task Collaborative Framework for Unsupervised Traffic Accident Detection in Driving Videos. Rongqin Liang, Yuanman Li, Yingxin Yi, Jiantao Zhou, Xia Li. 27 Jul 2023.
Extending Context Window of Large Language Models via Positional Interpolation. Shouyuan Chen, Sherman Wong, Liangjian Chen, Yuandong Tian. 27 Jun 2023.
Diable: Efficient Dialogue State Tracking as Operations on Tables. Pietro Lesci, Yoshinari Fujinuma, Momchil Hardalov, Chao Shang, Yassine Benajiba, Lluís Marquez. Annual Meeting of the Association for Computational Linguistics (ACL), 2023. 26 May 2023.
Landmark Attention: Random-Access Infinite Context Length for Transformers. Amirkeivan Mohtashami, Martin Jaggi. Neural Information Processing Systems (NeurIPS), 2023. 25 May 2023.
Memory Efficient Neural Processes via Constant Memory Attention Block. Leo Feng, Frederick Tung, Hossein Hajimirsadeghi, Yoshua Bengio, Mohamed Osama Ahmed. International Conference on Machine Learning (ICML), 2023. 23 May 2023.
HIINT: Historical, Intra- and Inter- personal Dynamics Modeling with Cross-person Memory Transformer. Y. Kim, Dong Won Lee, Paul Pu Liang, Sharifa Alghowinem, C. Breazeal, Hae Won Park. International Conference on Multimodal Interaction (ICMI), 2023. 21 May 2023.
A Lexical-aware Non-autoregressive Transformer-based ASR Model. Chong Lin, Kuan-Yu Chen. Interspeech, 2023. 18 May 2023.
Scaling Transformer to 1M tokens and beyond with RMT. Aydar Bulatov, Yuri Kuratov, Yermek Kapushev, Andrey Kravchenko. 19 Apr 2023.
Improving Autoregressive NLP Tasks via Modular Linearized Attention. Victor Agostinelli, Lizhong Chen. 17 Apr 2023.