ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2012.14913
  4. Cited By
Transformer Feed-Forward Layers Are Key-Value Memories

Transformer Feed-Forward Layers Are Key-Value Memories

29 December 2020
Mor Geva
R. Schuster
Jonathan Berant
Omer Levy
    KELM
ArXivPDFHTML

Papers citing "Transformer Feed-Forward Layers Are Key-Value Memories"

50 / 128 papers shown
Title
AlphaEdit: Null-Space Constrained Knowledge Editing for Language Models
AlphaEdit: Null-Space Constrained Knowledge Editing for Language Models
Junfeng Fang
Houcheng Jiang
Kun Wang
Yunshan Ma
Shi Jie
Xiangnan He
Tat-Seng Chua
Tat-seng Chua
KELM
35
33
0
03 Oct 2024
Knowledge Entropy Decay during Language Model Pretraining Hinders New Knowledge Acquisition
Knowledge Entropy Decay during Language Model Pretraining Hinders New Knowledge Acquisition
Jiyeon Kim
Hyunji Lee
Hyowon Cho
Joel Jang
Hyeonbin Hwang
Seungpil Won
Youbin Ahn
Dohaeng Lee
Minjoon Seo
KELM
87
3
0
02 Oct 2024
Interpreting Arithmetic Mechanism in Large Language Models through
  Comparative Neuron Analysis
Interpreting Arithmetic Mechanism in Large Language Models through Comparative Neuron Analysis
Zeping Yu
Sophia Ananiadou
LRM
MILM
27
6
0
21 Sep 2024
Extracting Paragraphs from LLM Token Activations
Extracting Paragraphs from LLM Token Activations
Nicholas Pochinkov
Angelo Benoit
Lovkush Agarwal
Zainab Ali Majid
Lucile Ter-Minassian
30
1
0
10 Sep 2024
From Yes-Men to Truth-Tellers: Addressing Sycophancy in Large Language Models with Pinpoint Tuning
From Yes-Men to Truth-Tellers: Addressing Sycophancy in Large Language Models with Pinpoint Tuning
Wei Chen
Zhen Huang
Liang Xie
Binbin Lin
Houqiang Li
...
Deng Cai
Yonggang Zhang
Wenxiao Wang
Xu Shen
Jieping Ye
49
6
0
03 Sep 2024
Knowledge in Superposition: Unveiling the Failures of Lifelong Knowledge Editing for Large Language Models
Knowledge in Superposition: Unveiling the Failures of Lifelong Knowledge Editing for Large Language Models
Chenhui Hu
Pengfei Cao
Yubo Chen
Kang Liu
Jun Zhao
KELM
59
2
0
14 Aug 2024
Bridging LLMs and KGs without Fine-Tuning: Intermediate Probing Meets Subgraph-Aware Entity Descriptions
Bridging LLMs and KGs without Fine-Tuning: Intermediate Probing Meets Subgraph-Aware Entity Descriptions
Bo Xue
Yi Xu
Yunchong Song
Yiming Pang
Yuyang Ren
Jiaxin Ding
Luoyi Fu
Xinbing Wang
OffRL
49
1
0
13 Aug 2024
Layerwise Recurrent Router for Mixture-of-Experts
Layerwise Recurrent Router for Mixture-of-Experts
Zihan Qiu
Zeyu Huang
Shuang Cheng
Yizhi Zhou
Zili Wang
Ivan Titov
Jie Fu
MoE
73
2
0
13 Aug 2024
To Forget or Not? Towards Practical Knowledge Unlearning for Large
  Language Models
To Forget or Not? Towards Practical Knowledge Unlearning for Large Language Models
Bozhong Tian
Xiaozhuan Liang
Siyuan Cheng
Qingbin Liu
Mengru Wang
Dianbo Sui
Xi Chen
Huajun Chen
Ningyu Zhang
MU
27
6
0
02 Jul 2024
A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models
A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models
Daking Rai
Yilun Zhou
Shi Feng
Abulhair Saparov
Ziyu Yao
79
19
0
02 Jul 2024
Memorizing Documents with Guidance in Large Language Models
Memorizing Documents with Guidance in Large Language Models
Bumjin Park
Jaesik Choi
KELM
RALM
36
1
0
23 Jun 2024
Beyond Individual Facts: Investigating Categorical Knowledge Locality of
  Taxonomy and Meronomy Concepts in GPT Models
Beyond Individual Facts: Investigating Categorical Knowledge Locality of Taxonomy and Meronomy Concepts in GPT Models
Christopher Burger
Yifan Hu
Thai Le
KELM
36
0
0
22 Jun 2024
How Do Large Language Models Acquire Factual Knowledge During
  Pretraining?
How Do Large Language Models Acquire Factual Knowledge During Pretraining?
Hoyeon Chang
Jinho Park
Seonghyeon Ye
Sohee Yang
Youngkyung Seo
Du-Seong Chang
Minjoon Seo
KELM
37
30
0
17 Jun 2024
MEMLA: Enhancing Multilingual Knowledge Editing with Neuron-Masked
  Low-Rank Adaptation
MEMLA: Enhancing Multilingual Knowledge Editing with Neuron-Masked Low-Rank Adaptation
Jiakuan Xie
Pengfei Cao
Yuheng Chen
Yubo Chen
Kang Liu
Jun Zhao
KELM
34
3
0
17 Jun 2024
In-Context Editing: Learning Knowledge from Self-Induced Distributions
In-Context Editing: Learning Knowledge from Self-Induced Distributions
Siyuan Qi
Bangcheng Yang
Kailin Jiang
Xiaobo Wang
Jiaqi Li
Yifan Zhong
Yaodong Yang
Zilong Zheng
KELM
101
8
0
17 Jun 2024
DIEKAE: Difference Injection for Efficient Knowledge Augmentation and
  Editing of Large Language Models
DIEKAE: Difference Injection for Efficient Knowledge Augmentation and Editing of Large Language Models
Alessio Galatolo
Meriem Beloucif
Katie Winkle
33
0
0
15 Jun 2024
REVS: Unlearning Sensitive Information in Language Models via Rank Editing in the Vocabulary Space
REVS: Unlearning Sensitive Information in Language Models via Rank Editing in the Vocabulary Space
Tomer Ashuach
Martin Tutek
Yonatan Belinkov
KELM
MU
63
4
0
13 Jun 2024
Interpreting the Second-Order Effects of Neurons in CLIP
Interpreting the Second-Order Effects of Neurons in CLIP
Yossi Gandelsman
Alexei A. Efros
Jacob Steinhardt
MILM
54
16
0
06 Jun 2024
Wavelet-Based Image Tokenizer for Vision Transformers
Wavelet-Based Image Tokenizer for Vision Transformers
Zhenhai Zhu
Radu Soricut
ViT
42
3
0
28 May 2024
Knowledge Circuits in Pretrained Transformers
Knowledge Circuits in Pretrained Transformers
Yunzhi Yao
Ningyu Zhang
Zekun Xi
Meng Wang
Ziwen Xu
Shumin Deng
Huajun Chen
KELM
64
20
0
28 May 2024
Perturbation-Restrained Sequential Model Editing
Perturbation-Restrained Sequential Model Editing
Junjie Ma
Hong Wang
Haoyang Xu
Zhen-Hua Ling
Jia-Chen Gu
KELM
59
8
0
27 May 2024
Large Scale Knowledge Washing
Large Scale Knowledge Washing
Yu-Xiang Wang
Ruihan Wu
Zexue He
X. Chen
Julian McAuley
MU
KELM
75
5
0
26 May 2024
Sparse Matrix in Large Language Model Fine-tuning
Sparse Matrix in Large Language Model Fine-tuning
Haoze He
Juncheng Billy Li
Xuan Jiang
Heather Miller
MoE
19
3
0
24 May 2024
Implicit In-context Learning
Implicit In-context Learning
Zhuowei Li
Zihao Xu
Ligong Han
Yunhe Gao
Song Wen
Di Liu
Hao Wang
Dimitris N. Metaxas
38
1
0
23 May 2024
Beyond Scaling Laws: Understanding Transformer Performance with
  Associative Memory
Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory
Xueyan Niu
Bo Bai
Lei Deng
Wei Han
31
6
0
14 May 2024
What does the Knowledge Neuron Thesis Have to do with Knowledge?
What does the Knowledge Neuron Thesis Have to do with Knowledge?
Jingcheng Niu
Andrew Liu
Zining Zhu
Gerald Penn
41
30
0
03 May 2024
EfficientASR: Speech Recognition Network Compression via Attention
  Redundancy and Chunk-Level FFN Optimization
EfficientASR: Speech Recognition Network Compression via Attention Redundancy and Chunk-Level FFN Optimization
Jianzong Wang
Ziqi Liang
Xulong Zhang
Ning Cheng
Jing Xiao
30
0
0
30 Apr 2024
CRE-LLM: A Domain-Specific Chinese Relation Extraction Framework with
  Fine-tuned Large Language Model
CRE-LLM: A Domain-Specific Chinese Relation Extraction Framework with Fine-tuned Large Language Model
Zhengpeng Shi
Haoran Luo
LRM
ALM
36
2
0
28 Apr 2024
Mixture of Low-rank Experts for Transferable AI-Generated Image
  Detection
Mixture of Low-rank Experts for Transferable AI-Generated Image Detection
Zihan Liu
Hanyi Wang
Yaoyu Kang
Shilin Wang
MoE
36
12
0
07 Apr 2024
Dissecting Query-Key Interaction in Vision Transformers
Dissecting Query-Key Interaction in Vision Transformers
Xu Pan
Aaron Philip
Ziqian Xie
Odelia Schwartz
32
1
0
04 Apr 2024
Personalized LLM Response Generation with Parameterized Memory Injection
Personalized LLM Response Generation with Parameterized Memory Injection
Kai Zhang
Lizhi Qing
Yangyang Kang
34
11
0
04 Apr 2024
Tracing the Roots of Facts in Multilingual Language Models: Independent,
  Shared, and Transferred Knowledge
Tracing the Roots of Facts in Multilingual Language Models: Independent, Shared, and Transferred Knowledge
Xin Zhao
Naoki Yoshinaga
Daisuke Oba
KELM
HILM
27
10
0
08 Mar 2024
Data-free Weight Compress and Denoise for Large Language Models
Data-free Weight Compress and Denoise for Large Language Models
Runyu Peng
Yunhua Zhou
Qipeng Guo
Yang Gao
Hang Yan
Xipeng Qiu
Dahua Lin
39
1
0
26 Feb 2024
Cracking Factual Knowledge: A Comprehensive Analysis of Degenerate
  Knowledge Neurons in Large Language Models
Cracking Factual Knowledge: A Comprehensive Analysis of Degenerate Knowledge Neurons in Large Language Models
Yuheng Chen
Pengfei Cao
Yubo Chen
Yining Wang
Shengping Liu
Kang Liu
Jun Zhao
KELM
35
1
0
21 Feb 2024
SWEA: Updating Factual Knowledge in Large Language Models via Subject Word Embedding Altering
SWEA: Updating Factual Knowledge in Large Language Models via Subject Word Embedding Altering
Xiaopeng Li
Shasha Li
Shezheng Song
Huijun Liu
Bing Ji
...
Jun Ma
Jie Yu
Xiaodong Liu
Jing Wang
Weimin Zhang
KELM
37
4
0
31 Jan 2024
Black-Box Access is Insufficient for Rigorous AI Audits
Black-Box Access is Insufficient for Rigorous AI Audits
Stephen Casper
Carson Ezell
Charlotte Siegmann
Noam Kolt
Taylor Lynn Curtis
...
Michael Gerovitch
David Bau
Max Tegmark
David M. Krueger
Dylan Hadfield-Menell
AAML
17
76
0
25 Jan 2024
The Truth is in There: Improving Reasoning in Language Models with
  Layer-Selective Rank Reduction
The Truth is in There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction
Pratyusha Sharma
Jordan T. Ash
Dipendra Kumar Misra
LRM
15
78
0
21 Dec 2023
MELO: Enhancing Model Editing with Neuron-Indexed Dynamic LoRA
MELO: Enhancing Model Editing with Neuron-Indexed Dynamic LoRA
Lang Yu
Qin Chen
Jie Zhou
Liang He
KELM
15
45
0
19 Dec 2023
A Glitch in the Matrix? Locating and Detecting Language Model Grounding with Fakepedia
A Glitch in the Matrix? Locating and Detecting Language Model Grounding with Fakepedia
Giovanni Monea
Maxime Peyrard
Martin Josifoski
Vishrav Chaudhary
Jason Eisner
Emre Kiciman
Hamid Palangi
Barun Patra
Robert West
KELM
51
12
0
04 Dec 2023
Hyperpolyglot LLMs: Cross-Lingual Interpretability in Token Embeddings
Hyperpolyglot LLMs: Cross-Lingual Interpretability in Token Embeddings
Andrea W Wen-Yi
David Mimno
25
14
0
29 Nov 2023
Assessing Knowledge Editing in Language Models via Relation Perspective
Assessing Knowledge Editing in Language Models via Relation Perspective
Yifan Wei
Xiaoyan Yu
Huanhuan Ma
Fangyu Lei
Yixuan Weng
Ran Song
Kang Liu
KELM
28
15
0
15 Nov 2023
Identifying Linear Relational Concepts in Large Language Models
Identifying Linear Relational Concepts in Large Language Models
David Chanin
Anthony Hunter
Oana-Maria Camburu
LLMSV
KELM
16
4
0
15 Nov 2023
Codebook Features: Sparse and Discrete Interpretability for Neural
  Networks
Codebook Features: Sparse and Discrete Interpretability for Neural Networks
Alex Tamkin
Mohammad Taufeeque
Noah D. Goodman
27
27
0
26 Oct 2023
Investigating semantic subspaces of Transformer sentence embeddings
  through linear structural probing
Investigating semantic subspaces of Transformer sentence embeddings through linear structural probing
Dmitry Nikolaev
Sebastian Padó
46
5
0
18 Oct 2023
ChatKBQA: A Generate-then-Retrieve Framework for Knowledge Base Question
  Answering with Fine-tuned Large Language Models
ChatKBQA: A Generate-then-Retrieve Framework for Knowledge Base Question Answering with Fine-tuned Large Language Models
Haoran Luo
E. Haihong
Zichen Tang
Shiyao Peng
Yikai Guo
...
Guanting Dong
Meina Song
Wei Lin
Yifan Zhu
Luu Anh Tuan
RALM
26
37
0
13 Oct 2023
Towards Best Practices of Activation Patching in Language Models:
  Metrics and Methods
Towards Best Practices of Activation Patching in Language Models: Metrics and Methods
Fred Zhang
Neel Nanda
LLMSV
28
97
0
27 Sep 2023
Knowledge Sanitization of Large Language Models
Knowledge Sanitization of Large Language Models
Yoichi Ishibashi
Hidetoshi Shimodaira
KELM
29
19
0
21 Sep 2023
PMET: Precise Model Editing in a Transformer
PMET: Precise Model Editing in a Transformer
Xiaopeng Li
Shasha Li
Shezheng Song
Jing Yang
Jun Ma
Jie Yu
KELM
26
115
0
17 Aug 2023
Evaluating the Ripple Effects of Knowledge Editing in Language Models
Evaluating the Ripple Effects of Knowledge Editing in Language Models
Roi Cohen
Eden Biran
Ori Yoran
Amir Globerson
Mor Geva
KELM
37
155
0
24 Jul 2023
SelfEvolve: A Code Evolution Framework via Large Language Models
SelfEvolve: A Code Evolution Framework via Large Language Models
Shuyang Jiang
Yuhao Wang
Yu Wang
16
32
0
05 Jun 2023
Previous
123
Next