Locating and Editing Factual Associations in GPT

Neural Information Processing Systems (NeurIPS), 2022
10 February 2022
Kevin Meng
David Bau
A. Andonian
Yonatan Belinkov
    KELM

Papers citing "Locating and Editing Factual Associations in GPT"

50 / 1,361 papers shown
Optimal ablation for interpretability
Neural Information Processing Systems (NeurIPS), 2024
Maximilian Li
Lucas Janson
FAtt
343
11
0
16 Sep 2024
Causal Inference with Large Language Model: A Survey
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Jing Ma
CML LRM
584
22
0
15 Sep 2024
Prevailing Research Areas for Music AI in the Era of Foundation Models
Megan Wei
M. Modrzejewski
Aswin Sivaraman
Dorien Herremans
MedIm
428
3
0
14 Sep 2024
Synthetic continued pretraining
International Conference on Learning Representations (ICLR), 2024
Zitong Yang
Neil Band
Shuangping Li
Emmanuel Candès
Tatsunori Hashimoto
CLL SyDa
349
35
0
11 Sep 2024
Rule Extrapolation in Language Models: A Study of Compositional Generalization on OOD Prompts
Anna Mészáros
Szilvia Ujváry
Wieland Brendel
Patrik Reizinger
Ferenc Huszár
256
2
0
09 Sep 2024
OneEdit: A Neural-Symbolic Collaboratively Knowledge Editing System
Xin Xu
Zekun Xi
Yujie Luo
Peng Wang
Bozhong Tian
...
Lei Liang
Qing Cui
Xiaowei Zhu
Jun Zhou
Huajun Chen
KELM
199
8
0
09 Sep 2024
Representational Analysis of Binding in Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Qin Dai
Benjamin Heinzerling
Kentaro Inui
313
0
0
09 Sep 2024
Evaluating Open-Source Sparse Autoencoders on Disentangling Factual Knowledge in GPT-2 Small
Maheep Chaudhary
Atticus Geiger
248
28
0
05 Sep 2024
Attend First, Consolidate Later: On the Importance of Attention in Different LLM Layers
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackBoxNLP), 2024
Amit Ben Artzy
Roy Schwartz
164
24
0
05 Sep 2024
Interpreting and Improving Large Language Models in Arithmetic Calculation
International Conference on Machine Learning (ICML), 2024
Wei Zhang
Chaoqun Wan
Yonggang Zhang
Yiu-ming Cheung
Xinmei Tian
Xu Shen
Jieping Ye
LRM
323
36
0
03 Sep 2024
Does Knowledge Localization Hold True? Surprising Differences Between Entity and Relation Perspectives in Language Models
International Conference on Information and Knowledge Management (CIKM), 2024
Yifan Wei
Xiaoyan Yu
Yixuan Weng
Huanhuan Ma
Yuanzhe Zhang
Jun Zhao
Kang Liu
KELM
202
8
0
01 Sep 2024
Modularity in Transformers: Investigating Neuron Separability & Specialization
Nicholas Pochinkov
Thomas Jones
Mohammed Rashidur Rahman
175
0
0
30 Aug 2024
Novel-WD: Exploring acquisition of Novel World Knowledge in LLMs Using Prefix-Tuning
Maxime Méloux
Christophe Cerisara
KELM CLL
256
1
0
30 Aug 2024
How Reliable are Causal Probing Interventions?
Marc E. Canby
Adam Davies
Chirag Rastogi
Anjali Narayan-Chen
343
0
0
28 Aug 2024
Relation Also Knows: Rethinking the Recall and Editing of Factual Associations in Auto-Regressive Transformer Language Models
AAAI Conference on Artificial Intelligence (AAAI), 2024
Xiyu Liu
Zhengxiao Liu
Naibin Gu
Zheng Lin
Wanli Ma
Ji Xiang
Weiping Wang
KELM
424
3
0
27 Aug 2024
Can Transformers Do Enumerative Geometry?
International Conference on Learning Representations (ICLR), 2024
Baran Hashemi
Roderic G. Corominas
Alessandro Giacchetto
904
7
0
27 Aug 2024
BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks and Defenses on Large Language Models
Yige Li
Hanxun Huang
Yunhan Zhao
Jiabo He
Jun Sun
AAML SILM
341
19
0
23 Aug 2024
Multilevel Interpretability Of Artificial Neural Networks: Leveraging Framework And Methods From Neuroscience
Zhonghao He
Jascha Achterberg
Katie Collins
Kevin K. Nejad
Danyal Akarca
...
Chole Li
Kai J. Sandbrink
Stephen Casper
Anna Ivanova
Grace W. Lindsay
AI4CE
322
6
0
22 Aug 2024
Enhancing Multi-hop Reasoning through Knowledge Erasure in Large Language Model Editing
Mengqi Zhang
Bowen Fang
Qiang Liu
Sudipta Singha Roy
Shu Wu
Zhumin Chen
Liang Wang
KELM
164
8
0
22 Aug 2024
Defending against Jailbreak through Early Exit Generation of Large Language Models
Chongwen Zhao
Zhihao Dou
Kaizhu Huang
AAML
238
3
0
21 Aug 2024
Personality Alignment of Large Language Models
International Conference on Learning Representations (ICLR), 2024
Minjun Zhu
Linyi Yang
Yue Zhang
Yue Zhang
ALM
348
21
0
21 Aug 2024
Beneath the Surface of Consistency: Exploring Cross-lingual Knowledge Representation Sharing in LLMs
Maxim Ifergan
Leshem Choshen
Roee Aharoni
Idan Szpektor
Omri Abend
HILM
230
9
0
20 Aug 2024
MEGen: Generative Backdoor into Large Language Models via Model Editing
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Jiyang Qiu
Xinbei Ma
Zhuosheng Zhang
Hai Zhao
Yun Li
Qianren Wang
AAML SILM KELM
274
5
0
20 Aug 2024
KAN 2.0: Kolmogorov-Arnold Networks Meet Science
Ziming Liu
Pingchuan Ma
Yixuan Wang
Wojciech Matusik
Max Tegmark
344
159
0
19 Aug 2024
Attribution Analysis Meets Model Editing: Advancing Knowledge Correction in Vision Language Models with VisEdit
AAAI Conference on Artificial Intelligence (AAAI), 2024
Qizhou Chen
Taolin Zhang
Chengyu Wang
Xiaofeng He
Dakan Wang
Tingting Liu
KELM
692
5
0
19 Aug 2024
ELDER: Enhancing Lifelong Model Editing with Mixture-of-LoRA
AAAI Conference on Artificial Intelligence (AAAI), 2024
Jiaang Li
Quan Wang
Zhongnan Wang
Yongdong Zhang
Zhendong Mao
CLL KELM
208
0
0
19 Aug 2024
Activated Parameter Locating via Causal Intervention for Model Merging
Fanshuang Kong
Richong Zhang
Ziqiao Wang
MoMe
162
3
0
18 Aug 2024
Reasoning Circuits in Language Models: A Mechanistic Interpretation of Syllogistic Inference
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Geonhee Kim
Marco Valentino
André Freitas
LRM AI4CE
306
12
0
16 Aug 2024
Lower Layers Matter: Alleviating Hallucination via Multi-Layer Fusion Contrastive Decoding with Truthfulness Refocused
Dingwei Chen
Feiteng Fang
Shiwen Ni
Feng Liang
Xiping Hu
A. Argha
Hamid Alinejad-Rokny
Min Yang
Chengming Li
HILM
235
3
0
16 Aug 2024
Knowledge in Superposition: Unveiling the Failures of Lifelong Knowledge Editing for Large Language Models
AAAI Conference on Artificial Intelligence (AAAI), 2024
Chenhui Hu
Pengfei Cao
Yubo Chen
Kang Liu
Jun Zhao
KELM
377
6
0
14 Aug 2024
Generalisation First, Memorisation Second? Memorisation Localisation for Natural Language Classification Tasks
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Verna Dankers
Ivan Titov
261
9
0
09 Aug 2024
UNLEARN Efficient Removal of Knowledge in Large Language Models
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Tyler Lizzo
Larry Heck
KELM MoMe MU
263
7
0
08 Aug 2024
KnowPO: Knowledge-aware Preference Optimization for Controllable Knowledge Selection in Retrieval-Augmented Language Models
AAAI Conference on Artificial Intelligence (AAAI), 2024
Ruizhe Zhang
Yongxin Xu
Yuzhen Xiao
Runchuan Zhu
Xinke Jiang
Xu Chu
Junfeng Zhao
Yasha Wang
176
11
0
06 Aug 2024
Unveiling Factual Recall Behaviors of Large Language Models through Knowledge Neurons
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Yifei Wang
Yuheng Chen
Wanting Wen
Yu Sheng
Linjing Li
D. Zeng
KELM
326
15
0
06 Aug 2024
The Mechanics of Conceptual Interpretation in GPT Models: Interpretative Insights
Nura Aljaafari
Danilo S. Carvalho
André Freitas
KELM
136
3
0
05 Aug 2024
The Quest for the Right Mediator: Surveying Mechanistic Interpretability Through the Lens of Causal Mediation Analysis
Computational Linguistics (CL), 2024
Aaron Mueller
Jannik Brinkmann
Millicent Li
Samuel Marks
Koyena Pal
...
Arnab Sen Sharma
Jiuding Sun
Eric Todd
David Bau
Yonatan Belinkov
CML
497
2
0
02 Aug 2024
Revisiting Bi-Encoder Neural Search: An Encoding--Searching Separation Perspective
Danbinaerin Han
Akiko Aizawa
Sihun Lee
211
0
0
02 Aug 2024
Correcting Negative Bias in Large Language Models through Negative Attention Score Alignment
Sangwon Yu
Jongyoon Song
Bongkyu Hwang
Hoyoung Kang
Sooah Cho
Junhwa Choi
Seongho Joe
Taehee Lee
Youngjune Gwon
Sungroh Yoon
587
9
0
31 Jul 2024
Can LLMs be Fooled? Investigating Vulnerabilities in LLMs
Sara Abdali
Jia He
C. Barberan
Richard Anarfi
295
9
0
30 Jul 2024
Machine Unlearning in Generative AI: A Survey
Zheyuan Liu
Guangyao Dou
Zhaoxuan Tan
Yijun Tian
Meng Jiang
MU
327
45
0
30 Jul 2024
Detecting and Understanding Vulnerabilities in Language Models via Mechanistic Interpretability
International Joint Conference on Artificial Intelligence (IJCAI), 2024
Jorge García-Carrasco
A. Maté
Juan Trujillo
AAML
199
5
0
29 Jul 2024
Can Editing LLMs Inject Harm?
Canyu Chen
Baixiang Huang
Zekun Li
Zhaorun Chen
Shiyang Lai
...
Xifeng Yan
William Wang
Juil Sock
Dawn Song
Kai Shu
KELM
408
23
0
29 Jul 2024
On Behalf of the Stakeholders: Trends in NLP Model Interpretability in the Era of LLMs
Nitay Calderon
Roi Reichart
358
24
0
27 Jul 2024
Demystifying Verbatim Memorization in Large Language Models
Jing Huang
Diyi Yang
Christopher Potts
ELM PILM MU
306
45
0
25 Jul 2024
Model editing for distribution shifts in uranium oxide morphological analysis
Davis Brown
Cody Nizinski
Madelyn Shapiro
Corey Fallon
Tianzhixi Yin
Henry Kvinge
Jonathan Tu
216
0
0
22 Jul 2024
Knowledge Mechanisms in Large Language Models: A Survey and Perspective
Meng Wang
Yunzhi Yao
Ziwen Xu
Shuofei Qiao
Shumin Deng
...
Yong Jiang
Pengjun Xie
Fei Huang
Huajun Chen
Ningyu Zhang
332
60
0
22 Jul 2024
Intrinsic Self-correction for Enhanced Morality: An Analysis of Internal Mechanisms and the Superficial Hypothesis
Guang-Da Liu
Haitao Mao
Shucheng Zhou
K. Johnson
LRM
273
18
0
21 Jul 2024
Answer, Assemble, Ace: Understanding How LMs Answer Multiple Choice Questions
Sarah Wiegreffe
Oyvind Tafjord
Yonatan Belinkov
Hanna Hajishirzi
Ashish Sabharwal
218
3
0
21 Jul 2024
LeKUBE: A Legal Knowledge Update BEnchmark
Changyue Wang
Weihang Su
Yiran Hu
Jiaxin Mao
Yueyue Wu
Cheng Luo
Yiqun Liu
Min Zhang
Shaoping Ma
AILaw ELM
206
11
0
19 Jul 2024
Investigating the Indirect Object Identification circuit in Mamba
Danielle Ensign
Adrià Garriga-Alonso
Mamba
162
0
0
19 Jul 2024