Memorizing Transformers

International Conference on Learning Representations (ICLR), 2022
16 March 2022
Yuhuai Wu
M. Rabe
DeLesley S. Hutchins
Christian Szegedy
    RALM

Papers citing "Memorizing Transformers"

50 / 157 papers shown
Learning Plug-and-play Memory for Guiding Video Diffusion Models
Selena Song
Ziming Xu
Zijun Zhang
Kun Zhou
Jiaxian Guo
Lianhui Qin
Biwei Huang
VGen
343
1
0
24 Nov 2025
BudgetMem: Learning Selective Memory Policies for Cost-Efficient Long-Context Processing in Language Models
Chandra Vamsi Krishna Alla
Harish Naidu Gaddam
Manohar Kommi
RALM
351
0
0
07 Nov 2025
Nirvana: A Specialized Generalist Model With Task-Aware Memory Mechanism
Yuhua Jiang
Shuang Cheng
Yihao Liu
Ermo Hua
Che Jiang
Weigao Sun
Yu Cheng
Feifei Gao
Biqing Qi
Bowen Zhou
CLL, KELM, MoE
115
0
0
30 Oct 2025
Kimi Linear: An Expressive, Efficient Attention Architecture
Kimi Team
Yu Zhang
Zongyu Lin
Xingcheng Yao
J. Hu
...
Guokun Lai
Yuxin Wu
Xinyu Zhou
Zhilin Yang
Yulun Du
180
49
0
30 Oct 2025
From Masks to Worlds: A Hitchhiker's Guide to World Models
Jinbin Bai
Yu Lei
H. Wu
Yuchen Zhu
Shufan Li
Yi Xin
Xiangtai Li
Molei Tao
Aditya Grover
Ming-Hsuan Yang
VGen, SyDa
245
4
0
23 Oct 2025
NeSyPr: Neurosymbolic Proceduralization For Efficient Embodied Reasoning
Wonje Choi
Jooyoung Kim
Honguk Woo
LRM
190
1
0
22 Oct 2025
Taming a Retrieval Framework to Read Images in Humanlike Manner for Augmenting Generation of MLLMs
SuYang Xi
Chenxi Yang
Hong Ding
Yiqing Ni
Catherine C. Liu
Yunhao Liu
Chengqi Zhang
LRM
151
1
0
12 Oct 2025
Learning What to Remember: Adaptive Probabilistic Memory Retention for Memory-Efficient Language Models
S M Rafiuddin
Muntaha Nujat Khan
RALM, KELM
187
1
0
09 Oct 2025
Artificial Hippocampus Networks for Efficient Long-Context Modeling
Yunhao Fang
Weihao Yu
Shu Zhong
Qinghao Ye
Xuehan Xiong
Lai Wei
214
5
0
08 Oct 2025
Pretraining with hierarchical memories: separating long-tail and common knowledge
Hadi Pouransari
David Grangier
C Thomas
Michael Kirchhof
Oncel Tuzel
KELM, MoE, RALM, LRM
314
5
0
29 Sep 2025
SimulRAG: Simulator-based RAG for Grounding LLMs in Long-form Scientific QA
Haozhou Xu
D. Wu
Matteo Chinazzi
Ruijia Niu
Rose Yu
Yi-An Ma
RALM
146
1
0
29 Sep 2025
A Survey of Long-Document Retrieval in the PLM and LLM Era
Minghan Li
Miyang Luo
Tianrui Lv
Yishuai Zhang
Siqi Zhao
Ercong Nie
Guodong Zhou
RALM, KELM
260
6
0
09 Sep 2025
Semantic Anchoring in Agentic Memory: Leveraging Linguistic Structures for Persistent Conversational Context
Maitreyi Chatterjee
Devansh Agarwal
RALM, KELM
191
1
0
18 Aug 2025
Memory-Augmented Transformers: A Systematic Review from Neuroscience Principles to Enhanced Model Architectures
Parsa Omidi
Xingshuai Huang
Axel Laborieux
Bahareh Nikpour
Tianyu Shi
A. Eshaghi
263
0
0
14 Aug 2025
Cognitive Workspace: Active Memory Management for LLMs -- An Empirical Study of Functional Infinite Context
Tao An
RALM
140
2
0
08 Aug 2025
FA-INR: Adaptive Implicit Neural Representations for Interpretable Exploration of Simulation Ensembles
Ziwei Li
Yuhan Duan
Tianyu Xiong
Yi-Tang Chen
Wei-Lun Chao
H. Shen
AI4CE
299
1
0
07 Jun 2025
Select, Read, and Write: A Multi-Agent Framework of Full-Text-based Related Work Generation
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Xiaochuan Liu
Ruihua Song
Xiting Wang
Xu Chen
338
3
0
26 May 2025
Vaiage: A Multi-Agent Solution to Personalized Travel Planning
Binwen Liu
Jiexi Ge
Jiamin Wang
LLMAG
186
5
0
16 May 2025
Cocktail: Chunk-Adaptive Mixed-Precision Quantization for Long-Context LLM Inference
Design, Automation and Test in Europe (DATE), 2025
Wei Tao
Bin Zhang
Xiaoyang Qu
Jiguang Wan
Jianzong Wang
473
4
0
30 Mar 2025
MALT Diffusion: Memory-Augmented Latent Transformers for Any-Length Video Generation
Sihyun Yu
Meera Hahn
Dan Kondratyuk
Jinwoo Shin
Agrim Gupta
José Lezama
Irfan Essa
David A. Ross
Jonathan Huang
DiffM, VGen
797
10
0
18 Feb 2025
Associative Recurrent Memory Transformer
Ivan Rodkin
Yuri Kuratov
Aydar Bulatov
Andrey Kravchenko
402
13
0
17 Feb 2025
Vision-centric Token Compression in Large Language Model
Ling Xing
Alex Jinpeng Wang
Rui Yan
Xiangbo Shu
Jinhui Tang
VLM
783
12
0
02 Feb 2025
Memorizing SAM: 3D Medical Segment Anything Model with Memorizing Transformer
Xinyuan Shao
Yiqing Shen
Mathias Unberath
MedIm
285
1
0
18 Dec 2024
Emotional RAG: Enhancing Role-Playing Agents through Emotional Retrieval
Le Huang
Hengzhi Lan
Zijun Sun
Chuan Shi
Ting Bai
879
23
0
30 Oct 2024
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer
International Conference on Learning Representations (ICLR), 2024
Haotian Tang
Yecheng Wu
Shang Yang
Enze Xie
Junsong Chen
Junyu Chen
Zhuoyang Zhang
Han Cai
Yaojie Lu
Song Han
587
220
0
14 Oct 2024
MELODI: Exploring Memory Compression for Long Contexts
International Conference on Learning Representations (ICLR), 2024
Yinpeng Chen
DeLesley Hutchins
Aren Jansen
Andrey Zhmoginov
David Racz
Jesper Andersen
241
4
0
04 Oct 2024
Beyond Prompts: Dynamic Conversational Benchmarking of Large Language Models
Neural Information Processing Systems (NeurIPS), 2024
David Castillo-Bolado
Joseph Davidson
Finlay Gray
Marek Rosa
321
19
0
30 Sep 2024
PecSched: Preemptive and Efficient Cluster Scheduling for LLM Inference
Zeyu Zhang
Haiying Shen
VLM
424
1
0
23 Sep 2024
Towards LifeSpan Cognitive Systems
Yu Wang
Chi Han
Tongtong Wu
Xiaoxin He
Wangchunshu Zhou
...
Zexue He
Wei Wang
Gholamreza Haffari
Heng Ji
Julian McAuley
KELM, CLL
1.1K
11
0
20 Sep 2024
Schrodinger's Memory: Large Language Models
Wei Wang
Qing Li
398
4
0
16 Sep 2024
Introducing Gating and Context into Temporal Action Detection
Aglind Reka
Diana Laura Borza
Dominick Reilly
Michal Balazia
Francois Bremond
345
1
0
06 Sep 2024
QEDCartographer: Automating Formal Verification Using Reward-Free Reinforcement Learning
International Conference on Software Engineering (ICSE), 2024
Alex Sanchez-Stern
Abhishek Varghese
Zhanna Kaufman
Dylan Zhang
Talia Ringer
Yuriy Brun
716
4
0
17 Aug 2024
Towards flexible perception with visual memory
Robert Geirhos
P. Jaini
Austin Stone
Sourabh Medapati
Xi Yi
G. Toderici
Abhijit Ogale
Jonathon Shlens
604
7
0
15 Aug 2024
Human-inspired Episodic Memory for Infinite Context LLMs
Zafeirios Fountas
Martin A Benfeghoul
Adnan Oomerjee
Fenia Christopoulou
Gerasimos Lampouras
H. Ammar
Jun Wang
493
32
0
12 Jul 2024
$\text{Memory}^3$: Language Modeling with Explicit Memory
Hongkang Yang
Peng Liu
Wenjin Wang
Huayi Lai
Zhiyu Li
...
Yu Yu
Kai Chen
Feiyu Xiong
Linpeng Tang
Weinan E
298
38
0
01 Jul 2024
BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack
Neural Information Processing Systems (NeurIPS), 2024
Yuri Kuratov
Aydar Bulatov
Petr Anokhin
Ivan Rodkin
Dmitry Sorokin
Artyom Sorokin
Andrey Kravchenko
RALM, ALM, LRM, ReLM, ELM
331
188
0
14 Jun 2024
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
Liliang Ren
Yang Liu
Yadong Lu
Haoran Pan
Chen Liang
Weizhu Chen
Mamba
554
140
0
11 Jun 2024
Memorization in deep learning: A survey
Jiaheng Wei
Yanjun Zhang
Leo Yu Zhang
Ming Ding
Chao Chen
Kok-Leong Ong
Jun Zhang
Yang Xiang
387
25
0
06 Jun 2024
Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning
Alex Jinpeng Wang
Linjie Li
Yiqi Lin
Min Li
Lijuan Wang
Mike Zheng Shou
VLM
330
14
0
04 Jun 2024
Extended Mind Transformers
Phoebe Klett
Thomas Ahle
RALM
166
0
0
04 Jun 2024
Reverse Image Retrieval Cues Parametric Memory in Multimodal LLMs
Jialiang Xu
Michael Moor
J. Leskovec
260
8
0
29 May 2024
XL3M: A Training-free Framework for LLM Length Extension Based on Segment-wise Inference
Shengnan Wang
Youhui Bai
Lin Zhang
Pingyi Zhou
Shixiong Zhao
Gong Zhang
Sen Wang
Renhai Chen
Hua Xu
Hongwei Sun
405
7
0
28 May 2024
SelfCP: Compressing Over-Limit Prompt via the Frozen Large Language Model Itself
Jun Gao
Ziqiang Cao
Wenjie Li
340
11
0
27 May 2024
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention
William Brandon
Mayank Mishra
Aniruddha Nrusimha
Yikang Shen
Jonathan Ragan-Kelley
MQ
336
109
0
21 May 2024
MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory
Ali Modarressi
Abdullatif Köksal
Ayyoob Imani
Mohsen Fayyaz
Hinrich Schütze
KELM
718
30
0
17 Apr 2024
Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs
Woomin Song
Seunghyuk Oh
Sangwoo Mo
Jaehyung Kim
Sukmin Yun
Jung-Woo Ha
Jinwoo Shin
250
38
0
16 Apr 2024
kNN-CLIP: Retrieval Enables Training-Free Segmentation on Continually Expanding Large Vocabularies
Zhongrui Gui
Shuyang Sun
Runjia Li
Jianhao Yuan
Zhaochong An
Karsten Roth
Christian Schroeder de Witt
Juil Sock
VLM, CLL
365
23
0
15 Apr 2024
TransformerFAM: Feedback attention is working memory
Dongseong Hwang
Weiran Wang
Zhuoyuan Huo
K. Sim
P. M. Mengibar
493
17
0
14 Apr 2024
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
Tsendsuren Munkhdalai
Manaal Faruqui
Siddharth Gopal
LRM, LLMAG, CLL
369
188
0
10 Apr 2024
Streaming Dense Video Captioning
Xingyi Zhou
Anurag Arnab
Shyamal Buch
Shen Yan
Austin Myers
Xuehan Xiong
Arsha Nagrani
Cordelia Schmid
VLM
292
87
0
01 Apr 2024