Locating and Editing Factual Associations in GPT

Neural Information Processing Systems (NeurIPS), 2022
10 February 2022
Kevin Meng
David Bau
A. Andonian
Yonatan Belinkov
    KELM

Papers citing "Locating and Editing Factual Associations in GPT"

50 / 1,361 papers shown
Scaling Laws for Associative Memories
International Conference on Learning Representations (ICLR), 2023
Vivien A. Cabannes
Elvis Dohmatob
A. Bietti
352
25
0
04 Oct 2023
Can Language Models be Instructed to Protect Personal Information?
Yang Chen
Ethan Mendes
Sauvik Das
Wei Xu
Alan Ritter
PILM
198
47
0
03 Oct 2023
Language Models Represent Space and Time
International Conference on Learning Representations (ICLR), 2023
Wes Gurnee
Max Tegmark
533
233
0
03 Oct 2023
Editing Personality for Large Language Models
Natural Language Processing and Chinese Computing (NLPCC), 2023
Shengyu Mao
Xiaohan Wang
Meng Wang
Yong Jiang
Pengjun Xie
Yan Zhang
Ningyu Zhang
KELM
403
16
0
03 Oct 2023
Modularity in Deep Learning: A Survey
Haozhe Sun
Isabelle Guyon
MoMe
314
7
0
02 Oct 2023
Empowering Many, Biasing a Few: Generalist Credit Scoring through Large Language Models
Duanyu Feng
Yongfu Dai
Jimin Huang
Yifang Zhang
Qianqian Xie
Weiguang Han
Zhengyu Chen
Alejandro Lopez-Lira
Hao Wang
265
19
0
01 Oct 2023
From Language Modeling to Instruction Following: Understanding the Behavior Shift in LLMs after Instruction Tuning
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Xuansheng Wu
Wenlin Yao
Jianshu Chen
Xiaoman Pan
Xiaoyang Wang
Ninghao Liu
Dong Yu
LRM
275
49
0
30 Sep 2023
RelBERT: Embedding Relations with Language Models
Artificial Intelligence (AIJ), 2023
Asahi Ushio
Jose Camacho-Collados
Steven Schockaert
KELM
324
3
0
30 Sep 2023
Medical Foundation Models are Susceptible to Targeted Misinformation Attacks
T. Han
S. Nebelung
Firas Khader
Tian Wang
Gustav Mueller-Franzes
...
Jens Kleesiek
Christoph Haarburger
Keno K. Bressem
Jakob Nikolas Kather
Daniel Truhn
AAML
95
7
0
29 Sep 2023
KLoB: a Benchmark for Assessing Knowledge Locating Methods in Language Models
Yiming Ju
Zheng Zhang
KELM
168
9
0
28 Sep 2023
Towards Best Practices of Activation Patching in Language Models: Metrics and Methods
International Conference on Learning Representations (ICLR), 2023
Fred Zhang
Neel Nanda
LLMSV
531
175
0
27 Sep 2023
MKRAG: Medical Knowledge Retrieval Augmented Generation for Medical Question Answering
Takuya Higuchi
Shaochen Xu
Avamarie Brueggeman
Zheng Liu
Tianming Liu
Xiang Li
Ninghao Liu
RALM
232
34
0
27 Sep 2023
Targeted Image Data Augmentation Increases Basic Skills Captioning Robustness
IEEE Games Entertainment Media Conference (IEEE GEM), 2023
Valentin Barriere
Felipe del Rio
Andres Carvallo De Ferari
Carlos Aspillaga
Eugenio Herrera-Berg
Cristian Buc Calderon
DiffM
233
0
0
27 Sep 2023
Identifying and Mitigating Privacy Risks Stemming from Language Models: A Survey
Victoria Smith
Ali Shahin Shamsabadi
Carolyn Ashurst
Adrian Weller
PILM
503
41
0
27 Sep 2023
Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models
International Conference on Learning Representations (ICLR), 2023
Mert Yuksekgonul
Varun Chandrasekaran
Erik Jones
Suriya Gunasekar
Ranjita Naik
Hamid Palangi
Ece Kamar
Besmira Nushi
HILM
213
68
0
26 Sep 2023
How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions
Lorenzo Pacchiardi
A. J. Chan
Sören Mindermann
Ilan Moscovitz
Alexa Y. Pan
Y. Gal
Owain Evans
J. Brauner
LLMAG, HILM
255
79
0
26 Sep 2023
Large Language Model Alignment: A Survey
Shangda Wu
Renren Jin
Yufei Huang
Chuang Liu
Weilong Dong
Zishan Guo
Xinwei Wu
Yan Liu
Deyi Xiong
LM&MA
363
287
0
26 Sep 2023
Physics of Language Models: Part 3.1, Knowledge Storage and Extraction
International Conference on Machine Learning (ICML), 2023
Zeyuan Allen-Zhu
Yuanzhi Li
KELM
546
238
0
25 Sep 2023
HANS, are you clever? Clever Hans Effect Analysis of Neural Systems
Leonardo Ranaldi
Fabio Massimo Zanzotto
235
7
0
21 Sep 2023
The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"
International Conference on Learning Representations (ICLR), 2023
Lukas Berglund
Meg Tong
Max Kaufmann
Mikita Balesni
Asa Cooper Stickland
Tomasz Korbak
Owain Evans
LRM
512
399
0
21 Sep 2023
Knowledge Sanitization of Large Language Models
Yoichi Ishibashi
Hidetoshi Shimodaira
KELM
285
37
0
21 Sep 2023
Rigorously Assessing Natural Language Explanations of Neurons
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2023
Jing-ling Huang
Atticus Geiger
Karel D'Oosterlinck
Zhengxuan Wu
Christopher Potts
MILM
241
40
0
19 Sep 2023
Cross-Lingual Knowledge Editing in Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Jiaan Wang
Yunlong Liang
Zengkui Sun
Yu Cao
Jiarong Xu
Fandong Meng
KELM
226
17
0
16 Sep 2023
A Fast Optimization View: Reformulating Single Layer Attention in LLM Based on Tensor and SVM Trick, and Solving It in Matrix Multiplication Time
Yeqi Gao
Zhao Song
Weixin Wang
Junze Yin
298
29
0
14 Sep 2023
Sudden Drops in the Loss: Syntax Acquisition, Phase Transitions, and Simplicity Bias in MLMs
International Conference on Learning Representations (ICLR), 2023
Angelica Chen
Ravid Schwartz-Ziv
Dong Wang
Matthew L. Leavitt
Naomi Saphra
577
105
0
13 Sep 2023
Circuit Breaking: Removing Model Behaviors with Targeted Ablation
Maximilian Li
Xander Davies
Max Nadeau
KELM, MU
306
34
0
12 Sep 2023
Memory Injections: Correcting Multi-Hop Reasoning Failures during Inference in Transformer-Based Language Models
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2023
Mansi Sakarvadia
Aswathy Ajith
Arham Khan
Daniel Grzenda
Nathaniel Hudson
André Bauer
Kyle Chard
Ian Foster
KELM, LRM
236
23
0
11 Sep 2023
Neurons in Large Language Models: Dead, N-gram, Positional
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Elena Voita
Javier Ferrando
Christoforos Nalmpantis
MILM
400
73
0
09 Sep 2023
FIND: A Function Description Benchmark for Evaluating Interpretability Methods
Neural Information Processing Systems (NeurIPS), 2023
Sarah Schwettmann
Tamar Rott Shaham
Joanna Materzyńska
Neil Chowdhury
Shuang Li
Jacob Andreas
David Bau
Antonio Torralba
262
31
0
07 Sep 2023
DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models
International Conference on Learning Representations (ICLR), 2023
Yung-Sung Chuang
Yujia Xie
Hongyin Luo
Yoon Kim
James R. Glass
Pengcheng He
HILM
287
288
0
07 Sep 2023
Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models
Computational Linguistics (CL), 2023
Yue Zhang
Yafu Li
Leyang Cui
Deng Cai
Lemao Liu
...
Longyue Wang
Anh Tuan Luu
Freda Shi
Shuming Shi
LRM, RALM, HILM
733
828
0
03 Sep 2023
Explainability for Large Language Models: A Survey
ACM Transactions on Intelligent Systems and Technology (ACM TIST), 2023
Haiyan Zhao
Hanjie Chen
Fan Yang
Ninghao Liu
Huiqi Deng
Hengyi Cai
Shuaiqiang Wang
D. Yin
Jundong Li
LRM
500
710
0
02 Sep 2023
Emergent Linear Representations in World Models of Self-Supervised Sequence Models
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2023
Neel Nanda
Andrew Lee
Martin Wattenberg
FAtt, MILM
316
249
0
02 Sep 2023
Why do universal adversarial attacks work on large language models?: Geometry might be the answer
Varshini Subhash
Anna Bialas
Weiwei Pan
Finale Doshi-Velez
AAML
215
16
0
01 Sep 2023
Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP
Vedant Palit
Rohan Pandey
Aryaman Arora
Paul Pu Liang
275
46
0
27 Aug 2023
AI-Generated Content (AIGC) for Various Data Modalities: A Survey
ACM Computing Surveys (ACM Comput. Surv.), 2023
Lin Geng Foo
Hossein Rahmani
Jing Liu
770
49
0
27 Aug 2023
Unified Concept Editing in Diffusion Models
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Rohit Gandikota
Hadas Orgad
Yonatan Belinkov
Joanna Materzyńska
David Bau
DiffM
380
303
0
25 Aug 2023
Journey to the Center of the Knowledge Neurons: Discoveries of Language-Independent Knowledge Neurons and Degenerate Knowledge Neurons
AAAI Conference on Artificial Intelligence (AAAI), 2023
Yuheng Chen
Pengfei Cao
Yubo Chen
Kang Liu
Jun Zhao
KELM
334
59
0
25 Aug 2023
Overcoming Generic Knowledge Loss with Selective Parameter Update
Computer Vision and Pattern Recognition (CVPR), 2023
Wenxuan Zhang
Paul Janson
Rahaf Aljundi
Mohamed Elhoseiny
KELM, CLL
380
20
0
23 Aug 2023
Mode Combinability: Exploring Convex Combinations of Permutation Aligned Models
Neural Networks (Neural Netw.), 2023
Adrián Csiszárik
M. Kiss
Péter Kőrösi-Szabó
Márton Muntag
Gergely Papp
D. Varga
MoMe
183
1
0
22 Aug 2023
GradientCoin: A Peer-to-Peer Decentralized Large Language Models
Yeqi Gao
Zhao Song
Junze Yin
179
22
0
21 Aug 2023
DocTER: Evaluating Document-based Knowledge Editing
Information Processing & Management (IPM), 2023
Suhang Wu
Minlong Peng
Y. Lin
Wenbo Li
Mingming Sun
Jinsong Su
KELM
302
39
0
19 Aug 2023
Linearity of Relation Decoding in Transformer Language Models
International Conference on Learning Representations (ICLR), 2023
Evan Hernandez
Arnab Sen Sharma
Tal Haklay
Kevin Meng
Martin Wattenberg
Jacob Andreas
Yonatan Belinkov
David Bau
KELM
335
140
0
17 Aug 2023
PMET: Precise Model Editing in a Transformer
AAAI Conference on Artificial Intelligence (AAAI), 2023
Xiaopeng Li
Shasha Li
Shezheng Song
Jing Yang
Jun Ma
Jie Yu
KELM
537
183
0
17 Aug 2023
Separate the Wheat from the Chaff: Model Deficiency Unlearning via Parameter-Efficient Module Operation
AAAI Conference on Artificial Intelligence (AAAI), 2023
Xinshuo Hu
Dongfang Li
Baotian Hu
Zihao Zheng
Zhenyu Liu
Hao Fei
KELM, MU
211
39
0
16 Aug 2023
EasyEdit: An Easy-to-use Knowledge Editing Framework for Large Language Models
Peng Wang
Ningyu Zhang
Bo Tian
Zekun Xi
Yunzhi Yao
...
Shuyang Cheng
Kangwei Liu
Yuansheng Ni
Guozhou Zheng
Huajun Chen
KELM
258
82
0
14 Aug 2023
Explaining Relation Classification Models with Semantic Extents
Lars Klöser
André Büsgen
Philipp Kohl
Bodo Kraft
Albert Zündorf
123
1
0
04 Aug 2023
Multimodal Neurons in Pretrained Text-Only Transformers
Sarah Schwettmann
Neil Chowdhury
Samuel J. Klein
David Bau
Antonio Torralba
MILM
272
43
0
03 Aug 2023
Dual Governance: The intersection of centralized regulation and crowdsourced safety mechanisms for Generative AI
Avijit Ghosh
D. Lakshmi
88
7
0
02 Aug 2023
Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model
ACM Multimedia (ACM MM), 2023
Ka Leong Cheng
Wenpo Song
Zheng Ma
Wenhao Zhu
Zi-Yue Zhu
Jianbing Zhang
CLIP, VLM
178
18
0
02 Aug 2023