Memorizing Transformers

International Conference on Learning Representations (ICLR), 2022

16 March 2022

Papers citing "Memorizing Transformers"

50 / 157 papers shown

Learning Plug-and-play Memory for Guiding Video Diffusion Models

343

24 Nov 2025

BudgetMem: Learning Selective Memory Policies for Cost-Efficient Long-Context Processing in Language Models

Chandra Vamsi Krishna Alla

Harish Naidu Gaddam

Manohar Kommi

RALM

351

07 Nov 2025

Nirvana: A Specialized Generalist Model With Task-Aware Memory Mechanism

115

30 Oct 2025

Kimi Linear: An Expressive, Efficient Attention Architecture

...

180

30 Oct 2025

From Masks to Worlds: A Hitchhiker's Guide to World Models

245

23 Oct 2025

NeSyPr: Neurosymbolic Proceduralization For Efficient Embodied Reasoning

190

22 Oct 2025

Taming a Retrieval Framework to Read Images in Humanlike Manner for Augmenting Generation of MLLMs

151

12 Oct 2025

Learning What to Remember: Adaptive Probabilistic Memory Retention for Memory-Efficient Language Models

S M Rafiuddin

Muntaha Nujat Khan

RALM KELM

187

09 Oct 2025

Artificial Hippocampus Networks for Efficient Long-Context Modeling

214

08 Oct 2025

Pretraining with hierarchical memories: separating long-tail and common knowledge

314

29 Sep 2025

SimulRAG: Simulator-based RAG for Grounding LLMs in Long-form Scientific QA

146

29 Sep 2025

A Survey of Long-Document Retrieval in the PLM and LLM Era

260

09 Sep 2025

Semantic Anchoring in Agentic Memory: Leveraging Linguistic Structures for Persistent Conversational Context

Maitreyi Chatterjee

Devansh Agarwal

RALM KELM

191

18 Aug 2025

Memory-Augmented Transformers: A Systematic Review from Neuroscience Principles to Enhanced Model Architectures

263

14 Aug 2025

Cognitive Workspace: Active Memory Management for LLMs -- An Empirical Study of Functional Infinite Context

Tao An

RALM

140

08 Aug 2025

FA-INR: Adaptive Implicit Neural Representations for Interpretable Exploration of Simulation Ensembles

299

07 Jun 2025

Select, Read, and Write: A Multi-Agent Framework of Full-Text-based Related Work Generation

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

338

26 May 2025

Vaiage: A Multi-Agent Solution to Personalized Travel Planning

186

16 May 2025

Cocktail: Chunk-Adaptive Mixed-Precision Quantization for Long-Context LLM Inference

Design, Automation and Test in Europe (DATE), 2025

473

30 Mar 2025

MALT Diffusion: Memory-Augmented Latent Transformers for Any-Length Video Generation

797

18 Feb 2025

Associative Recurrent Memory Transformer

402

17 Feb 2025

Vision-centric Token Compression in Large Language Model

783

02 Feb 2025

Memorizing SAM: 3D Medical Segment Anything Model with Memorizing Transformer

285

18 Dec 2024

Emotional RAG: Enhancing Role-Playing Agents through Emotional Retrieval

879

30 Oct 2024

HART: Efficient Visual Generation with Hybrid Autoregressive Transformer

International Conference on Learning Representations (ICLR), 2024

Enze Xie

Han Cai

587

220

14 Oct 2024

MELODI: Exploring Memory Compression for Long Contexts

International Conference on Learning Representations (ICLR), 2024

241

04 Oct 2024

Beyond Prompts: Dynamic Conversational Benchmarking of Large Language Models

Neural Information Processing Systems (NeurIPS), 2024

David Castillo-Bolado

Joseph Davidson

Finlay Gray

Marek Rosa

321

30 Sep 2024

PecSched: Preemptive and Efficient Cluster Scheduling for LLM Inference

Zeyu Zhang

Haiying Shen

VLM

424

23 Sep 2024

Towards LifeSpan Cognitive Systems

Yu Wang

...

Wei Wang

Heng Ji

Julian McAuley

KELM CLL

1.1K

20 Sep 2024

Schrodinger's Memory: Large Language Models

Wei Wang

Qing Li

398

16 Sep 2024

Introducing Gating and Context into Temporal Action Detection

Francois Bremond

345

06 Sep 2024

QEDCartographer: Automating Formal Verification Using Reward-Free Reinforcement Learning

International Conference on Software Engineering (ICSE), 2024

Yuriy Brun

716

17 Aug 2024

Towards flexible perception with visual memory

604

15 Aug 2024

Human-inspired Episodic Memory for Infinite Context LLMs

493

12 Jul 2024

$\text{Memory}^3$: Language Modeling with Explicit Memory

Zhiyu Li

...

Weinan E

298

01 Jul 2024

BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack

Neural Information Processing Systems (NeurIPS), 2024

Artyom Sorokin

RALM ALM LRM ReLM ELM

331

188

14 Jun 2024

Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling

Yadong Lu

Weizhu Chen

554

140

11 Jun 2024

Memorization in deep learning: A survey

Jiaheng Wei

Yanjun Zhang

Leo Yu Zhang

Yang Xiang

387

06 Jun 2024

Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning

Mike Zheng Shou

330

04 Jun 2024

Extended Mind Transformers

Phoebe Klett

Thomas Ahle

RALM

166

04 Jun 2024

Reverse Image Retrieval Cues Parametric Memory in Multimodal LLMs

Jialiang Xu

Michael Moor

J. Leskovec

260

29 May 2024

XL3M: A Training-free Framework for LLM Length Extension Based on Segment-wise Inference

405

28 May 2024

SelfCP: Compressing Over-Limit Prompt via the Frozen Large Language Model Itself

Jun Gao

Ziqiang Cao

Wenjie Li

340

27 May 2024

Reducing Transformer Key-Value Cache Size with Cross-Layer Attention

Jonathan Ragan-Kelley

336

109

21 May 2024

MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory

718

17 Apr 2024

Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs

250

16 Apr 2024

kNN-CLIP: Retrieval Enables Training-Free Segmentation on Continually Expanding Large Vocabularies

Shuyang Sun

Christian Schroeder de Witt

Juil Sock

VLM CLL

365

15 Apr 2024

TransformerFAM: Feedback attention is working memory

493

14 Apr 2024

Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention

Tsendsuren Munkhdalai

Manaal Faruqui

Siddharth Gopal

LRM LLMAG CLL

369

188

10 Apr 2024

Streaming Dense Video Captioning

292

01 Apr 2024