Knowledge Neurons in Pretrained Transformers

Annual Meeting of the Association for Computational Linguistics (ACL), 2021
18 April 2021
Damai Dai
Li Dong
Y. Hao
Zhifang Sui
Baobao Chang
Furu Wei
KELM, MU

Papers citing "Knowledge Neurons in Pretrained Transformers"

50 / 410 papers shown
LoFiT: Localized Fine-tuning on LLM Representations
Fangcong Yin
Xi Ye
Greg Durrett
265
41
0
03 Jun 2024
From Feature Visualization to Visual Circuits: Effect of Adversarial Model Manipulation
Géraldin Nanfack
Michael Eickenberg
Eugene Belilovsky
FAtt, AAML, GNN
311
1
0
03 Jun 2024
Knowledge Graph Tuning: Real-time Large Language Model Personalization based on Human Feedback
Jingwei Sun
Zhixu Du
Yiran Chen
KELM
250
4
0
30 May 2024
MEMoE: Enhancing Model Editing with Mixture of Experts Adaptors
Renzhi Wang
Piji Li
KELM
284
7
0
29 May 2024
Knowledge Circuits in Pretrained Transformers
Yunzhi Yao
Ningyu Zhang
Zekun Xi
Meng Wang
Ziwen Xu
Shumin Deng
Huajun Chen
KELM
436
43
0
28 May 2024
Tokenization Matters! Degrading Large Language Models through Challenging Their Tokenization
Dixuan Wang
Yanda Li
Junyuan Jiang
Zepeng Ding
Ziqin Luo
Guochao Jiang
Jiaqing Liang
Deqing Yang
489
34
0
27 May 2024
Perturbation-Restrained Sequential Model Editing
Junjie Ma
Hong Wang
Haoyang Xu
Zhen-Hua Ling
Jia-Chen Gu
KELM
510
16
0
27 May 2024
Everything is Editable: Extend Knowledge Editing to Unstructured Data in Large Language Models
Jingcheng Deng
Zihao Wei
Liang Pang
Hanxing Ding
Huawei Shen
Xueqi Cheng
KELM
234
2
0
24 May 2024
Linearly Controlled Language Generation with Performative Guarantees
Emily Cheng
Marco Baroni
368
13
0
24 May 2024
Implicit In-context Learning
International Conference on Learning Representations (ICLR), 2024
Zhuowei Li
Zihao Xu
Ligong Han
Yunhe Gao
Song Wen
Di Liu
Hao Wang
Dimitris N. Metaxas
357
8
0
23 May 2024
Learnable Privacy Neurons Localization in Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Ruizhe Chen
Tianxiang Hu
Yang Feng
Zuo-Qiang Liu
220
28
0
16 May 2024
Spectral Editing of Activations for Large Language Model Alignment
Neural Information Processing Systems (NeurIPS), 2024
Yifu Qiu
Zheng Zhao
Yftah Ziser
Anna Korhonen
Edoardo Ponti
Shay B. Cohen
KELM, LLMSV
400
40
0
15 May 2024
Large Language Model Bias Mitigation from the Perspective of Knowledge Editing
Ruizhe Chen
Yichen Li
Zikai Xiao
Zuo-Qiang Liu
KELM
333
18
0
15 May 2024
Localizing Task Information for Improved Model Merging and Compression
International Conference on Machine Learning (ICML), 2024
Ke Wang
Nikolaos Dimitriadis
Guillermo Ortiz-Jimenez
François Fleuret
Pascal Frossard
MoMe
287
86
0
13 May 2024
Erasing Concepts from Text-to-Image Diffusion Models with Few-shot Unlearning
Masane Fuchi
Tomohiro Takagi
DiffM, VLM
263
25
0
12 May 2024
Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning
International Conference on Machine Learning (ICML), 2024
Shibo Jie
Yehui Tang
Ning Ding
Zhi-Hong Deng
Kai Han
Yunhe Wang
VLM
345
20
0
09 May 2024
Binary Hypothesis Testing for Softmax Models and Leverage Score Models
Yeqi Gao
Yuzhou Gu
Zhao Song
412
1
0
09 May 2024
Anchored Answers: Unravelling Positional Bias in GPT-2's Multiple-Choice Questions
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Ruizhe Li
Yanjun Gao
KELM
338
13
0
06 May 2024
What does the Knowledge Neuron Thesis Have to do with Knowledge?
International Conference on Learning Representations (ICLR), 2024
Jingcheng Niu
Andrew Liu
Zining Zhu
Gerald Penn
327
47
0
03 May 2024
A Human-Computer Collaborative Tool for Training a Single Large Language Model Agent into a Network through Few Examples
Lihang Pan
Yuxuan Li
Chun Yu
Yuanchun Shi
LLMAG
203
2
0
24 Apr 2024
From Matching to Generation: A Survey on Generative Information Retrieval
Xiaoxi Li
Jiajie Jin
Yujia Zhou
Yuyao Zhang
Peitian Zhang
Yutao Zhu
Zhicheng Dou
3DV
551
135
0
23 Apr 2024
Mechanistic Interpretability for AI Safety -- A Review
Leonard Bereska
E. Gavves
AI4CE
339
298
0
22 Apr 2024
Decomposing and Editing Predictions by Modeling Model Computation
Harshay Shah
Andrew Ilyas
Aleksander Madry
KELM
290
24
0
17 Apr 2024
MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory
Ali Modarressi
Abdullatif Köksal
Ayyoob Imani
Mohsen Fayyaz
Hinrich Schütze
KELM
614
25
0
17 Apr 2024
DESTEIN: Navigating Detoxification of Language Models via Universal Steering Pairs and Head-wise Activation Fusion
Yu Li
Zhihua Wei
Han Jiang
Chuanyang Gong
LLMSV
222
7
0
16 Apr 2024
Explainable Generative AI (GenXAI): A Survey, Conceptualization, and Research Agenda
Johannes Schneider
260
80
0
15 Apr 2024
Scalable Model Editing via Customized Expert Networks
Zihan Yao
Yu He
Tianyu Qi
Ming Li
KELM
232
6
0
03 Apr 2024
Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models
Yuxin Wen
Leo Marchyok
Sanghyun Hong
Jonas Geiping
Tom Goldstein
Nicholas Carlini
SILM, AAML
275
28
0
01 Apr 2024
The Unreasonable Ineffectiveness of the Deeper Layers
Andrey Gromov
Kushal Tirumala
Hassan Shapourian
Paolo Glorioso
Daniel A. Roberts
428
158
0
26 Mar 2024
Locating and Mitigating Gender Bias in Large Language Models
Yuchen Cai
Ding Cao
Rongxi Guo
Yaqin Wen
Guiquan Liu
Enhong Chen
176
10
0
21 Mar 2024
A Unified Framework for Model Editing
Akshat Gupta
Dev Sajnani
Gopala Anumanchipalli
KELM
311
52
0
21 Mar 2024
BadEdit: Backdooring large language models by model editing
Yanzhou Li
Tianlin Li
Kangjie Chen
Jian Zhang
Shangqing Liu
Wenhan Wang
Tianwei Zhang
Yang Liu
SyDa, AAML, KELM
230
98
0
20 Mar 2024
Larimar: Large Language Models with Episodic Memory Control
International Conference on Machine Learning (ICML), 2024
Payel Das
Subhajit Chaudhury
Elliot Nelson
Igor Melnyk
Sarath Swaminathan
...
Vijil Chenthamarakshan
Jiří Navrátil
Soham Dan
Pin-Yu Chen
CLL, KELM
377
32
0
18 Mar 2024
Towards a theory of model distillation
Enric Boix-Adserà
FedML, VLM
235
14
0
14 Mar 2024
VLKEB: A Large Vision-Language Model Knowledge Editing Benchmark
Neural Information Processing Systems (NeurIPS), 2024
Han Huang
Haitian Zhong
Tao Yu
Qiang Liu
Shu Wu
Liang Wang
Tien-Ping Tan
VLM, KELM
267
21
0
12 Mar 2024
In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation
Shiqi Chen
Miao Xiong
Junteng Liu
Zhengxuan Wu
Teng Xiao
Siyang Gao
Junxian He
HILM
433
39
0
03 Mar 2024
Information Flow Routes: Automatically Interpreting Language Models at Scale
Javier Ferrando
Elena Voita
377
71
0
27 Feb 2024
InstructEdit: Instruction-based Knowledge Editing for Large Language Models
Ningyu Zhang
Bo Tian
Siyuan Cheng
Xiaozhuan Liang
Yi Hu
Kouying Xue
Yanjie Gou
Xi Chen
Huajun Chen
KELM
205
10
0
25 Feb 2024
Interpreting Context Look-ups in Transformers: Investigating Attention-MLP Interactions
Philip Quirke
Shay B. Cohen
Fazl Barez
178
8
0
23 Feb 2024
In-Context Learning of a Linear Transformer Block: Benefits of the MLP Component and One-Step GD Initialization
Ruiqi Zhang
Jingfeng Wu
Peter L. Bartlett
307
29
0
22 Feb 2024
MoELoRA: Contrastive Learning Guided Mixture of Experts on Parameter-Efficient Fine-Tuning for Large Language Models
Tongxu Luo
Jiahe Lei
Fangyu Lei
Weihao Liu
Shizhu He
Jun Zhao
Kang Liu
MoE, ALM
186
42
0
20 Feb 2024
Navigating the Dual Facets: A Comprehensive Evaluation of Sequential Memory Editing in Large Language Models
Zihao Lin
Mohammad Beigi
Hongxuan Li
Jiuxiang Gu
Yuxiang Zhang
Qifan Wang
Wenpeng Yin
Lifu Huang
KELM
166
10
0
16 Feb 2024
Rethinking Machine Unlearning for Large Language Models
Sijia Liu
Yuanshun Yao
Jinghan Jia
Stephen Casper
Nathalie Baracaldo
...
Hang Li
Kush R. Varshney
Mohit Bansal
Sanmi Koyejo
Yang Liu
AILaw, MU
428
200
0
13 Feb 2024
Discriminative Adversarial Unlearning
Rohan Sharma
Shijie Zhou
Kaiyi Ji
Changyou Chen
MU
169
1
0
10 Feb 2024
Sparse-VQ Transformer: An FFN-Free Framework with Vector Quantization for Enhanced Time Series Forecasting
Yanjun Zhao
Tian Zhou
Chao Chen
Liang Sun
Yi Qian
Rong Jin
AI4TS
162
4
0
08 Feb 2024
AttnLRP: Attention-Aware Layer-Wise Relevance Propagation for Transformers
Reduan Achtibat
Sayed Mohammad Vakilzadeh Hatefi
Maximilian Dreyer
Aakriti Jain
Thomas Wiegand
Sebastian Lapuschkin
Wojciech Samek
342
80
0
08 Feb 2024
Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications
Boyi Wei
Kaixuan Huang
Yangsibo Huang
Tinghao Xie
Xiangyu Qi
Mengzhou Xia
Prateek Mittal
Mengdi Wang
Peter Henderson
AAML
312
174
0
07 Feb 2024
Exploring higher-order neural network node interactions with total correlation
Thomas Kerby
Teresa White
Kevin Moon
136
0
0
06 Feb 2024
Neighboring Perturbations of Knowledge Editing on Large Language Models
Jun-Yu Ma
Zhen-Hua Ling
Ningyu Zhang
Jia-Chen Gu
KELM
195
6
0
31 Jan 2024
Propagation and Pitfalls: Reasoning-based Assessment of Knowledge Editing through Counterfactual Tasks
Wenyue Hua
Jiang Guo
Mingwen Dong
He Zhu
Patrick Ng
Zhiguo Wang
KELM
318
24
0
31 Jan 2024