Locating and Editing Factual Associations in GPT

Neural Information Processing Systems (NeurIPS), 2022
10 February 2022
Kevin Meng
David Bau
A. Andonian
Yonatan Belinkov
    KELM

Papers citing "Locating and Editing Factual Associations in GPT"

50 / 1,361 papers shown
Has It All Been Solved? Open NLP Research Questions Not Solved by Large Language Models
International Conference on Language Resources and Evaluation (LREC), 2023
Oana Ignat
Zhijing Jin
Artem Abzaliev
Laura Biester
Santiago Castro
...
Verónica Pérez-Rosas
Siqi Shen
Zekun Wang
Winston Wu
Amélie Reymond
LRM
320
8
0
21 May 2023
Decouple knowledge from parameters for plug-and-play language modeling
Xin Cheng
Yankai Lin
Preslav Nakov
Dongyan Zhao
Rui Yan
KELM
227
2
0
19 May 2023
Interpretability at Scale: Identifying Causal Mechanisms in Alpaca
Neural Information Processing Systems (NeurIPS), 2023
Zhengxuan Wu
Atticus Geiger
Thomas Icard
Christopher Potts
Noah D. Goodman
MILM
434
108
0
15 May 2023
Semantic Composition in Visually Grounded Language Models
Rohan Pandey
CoGe
201
1
0
15 May 2023
FactKB: Generalizable Factuality Evaluation using Language Models Enhanced with Factual Knowledge
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Shangbin Feng
Vidhisha Balachandran
Yuyang Bai
Yulia Tsvetkov
KELM, HILM
308
62
0
14 May 2023
RECKONING: Reasoning through Dynamic Knowledge Encoding
Neural Information Processing Systems (NeurIPS), 2023
Zeming Chen
Gail Weiss
E. Mitchell
Asli Celikyilmaz
Antoine Bosselut
KELM, LRM
349
15
0
10 May 2023
Coherent Wave Dynamics and Language Generation of a Generative Pre-trained Transformer
Tao Hong
64
1
0
08 May 2023
Chain-of-Skills: A Configurable Model for Open-domain Question Answering
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Kaixin Ma
Hao Cheng
Yu Zhang
Xiaodong Liu
Eric Nyberg
Jianfeng Gao
LRM
189
20
0
04 May 2023
ReMask: A Robust Information-Masking Approach for Domain Counterfactual Generation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Pengfei Hong
Rishabh Bhardwaj
Navonil Majumdar
Somak Aditya
Soujanya Poria
AAML
151
1
0
04 May 2023
Can LMs Learn New Entities from Descriptions? Challenges in Propagating Injected Knowledge
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Yasumasa Onoe
Michael J.Q. Zhang
Shankar Padmanabhan
Greg Durrett
Eunsol Choi
KELM
421
84
0
02 May 2023
Key-Locked Rank One Editing for Text-to-Image Personalization
International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH), 2023
Yoad Tewel
Rinon Gal
Gal Chechik
Yuval Atzmon
DiffM
425
217
0
02 May 2023
Finding Neurons in a Haystack: Case Studies with Sparse Probing
Wes Gurnee
Neel Nanda
Matthew Pauly
Katherine Harvey
Dmitrii Troitskii
Dimitris Bertsimas
MILM
520
286
0
02 May 2023
How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained language model
Neural Information Processing Systems (NeurIPS), 2023
Michael Hanna
Ollie Liu
Alexandre Variengien
LRM
1.0K
179
0
30 Apr 2023
Towards Automated Circuit Discovery for Mechanistic Interpretability
Neural Information Processing Systems (NeurIPS), 2023
Arthur Conmy
Augustine N. Mavor-Parker
Aengus Lynch
Stefan Heimersheim
Adrià Garriga-Alonso
531
442
0
28 Apr 2023
Dissecting Recall of Factual Associations in Auto-Regressive Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Mor Geva
Jasmijn Bastings
Katja Filippova
Amir Globerson
KELM
724
418
0
28 Apr 2023
Label-Free Concept Bottleneck Models
International Conference on Learning Representations (ICLR), 2023
Tuomas P. Oikarinen
Subhro Das
Lam M. Nguyen
Tsui-Wei Weng
332
239
0
12 Apr 2023
Localizing Model Behavior with Path Patching
Nicholas W. Goldowsky-Dill
Chris MacLeod
L. Sato
Aryaman Arora
489
122
0
12 Apr 2023
Inspecting and Editing Knowledge Representations in Language Models
Evan Hernandez
Belinda Z. Li
Jacob Andreas
KELM
300
123
0
03 Apr 2023
Ablating Concepts in Text-to-Image Diffusion Models
IEEE International Conference on Computer Vision (ICCV), 2023
Nupur Kumari
Bin Zhang
Sheng-Yu Wang
Eli Shechtman
Richard Y. Zhang
Jun-Yan Zhu
VLM
478
282
0
23 Mar 2023
Language Model Behavior: A Comprehensive Survey
International Conference on Computational Logic (ICCL), 2023
Tyler A. Chang
Benjamin Bergen
VLM, LRM, LM&MA
372
139
0
20 Mar 2023
Context-faithful Prompting for Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Wenxuan Zhou
Sheng Zhang
Hoifung Poon
Muhao Chen
KELM
240
80
0
20 Mar 2023
Editing Implicit Assumptions in Text-to-Image Diffusion Models
IEEE International Conference on Computer Vision (ICCV), 2023
Hadas Orgad
Bahjat Kawar
Yonatan Belinkov
DiffM
363
115
0
14 Mar 2023
Erasing Concepts from Diffusion Models
IEEE International Conference on Computer Vision (ICCV), 2023
Rohit Gandikota
Joanna Materzyńska
Jaden Fiotto-Kaufman
David Bau
DiffM
494
431
0
13 Mar 2023
Making a Computational Attorney
SDM (SDM), 2023
Dell Zhang
Frank Schilder
Jack G. Conrad
Masoud Makrehchi
David von Rickenbach
Isabelle Moulinier
165
1
0
07 Mar 2023
Finding Alignments Between Interpretable Causal Variables and Distributed Neural Representations
CLEaR (CLEaR), 2023
Atticus Geiger
Zhengxuan Wu
Christopher Potts
Thomas Icard
Noah D. Goodman
CML
499
139
0
05 Mar 2023
Competence-Based Analysis of Language Models
Adam Davies
Jize Jiang
Chengxiang Zhai
ELM
357
7
0
01 Mar 2023
Edit at your own risk: evaluating the robustness of edited models to distribution shifts
Davis Brown
Charles Godfrey
Cody Nizinski
Jonathan Tu
Henry Kvinge
KELM
246
8
0
28 Feb 2023
Inseq: An Interpretability Toolkit for Sequence Generation Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Gabriele Sarti
Nils Feldhus
Ludwig Sickert
Oskar van der Wal
Malvina Nissim
Arianna Bisazza
317
90
0
27 Feb 2023
Analyzing And Editing Inner Mechanisms Of Backdoored Language Models
Conference on Fairness, Accountability and Transparency (FAccT), 2023
Max Lamparth
Anka Reuel
KELM
213
15
0
24 Feb 2023
Task-Specific Skill Localization in Fine-tuned Language Models
International Conference on Machine Learning (ICML), 2023
A. Panigrahi
Nikunj Saunshi
Haoyu Zhao
Sanjeev Arora
MoMe
322
89
0
13 Feb 2023
What Matters In The Structured Pruning of Generative Language Models?
Michael Santacroce
Zixin Wen
Yelong Shen
Yuan-Fang Li
178
36
0
07 Feb 2023
Effective Data Augmentation With Diffusion Models
International Conference on Learning Representations (ICLR), 2023
Brandon Trabucco
Kyle Doherty
Max Gurinas
Ruslan Salakhutdinov
DiffM, VLM
472
336
0
07 Feb 2023
Analyzing Feed-Forward Blocks in Transformers through the Lens of Attention Maps
International Conference on Learning Representations (ICLR), 2023
Goro Kobayashi
Tatsuki Kuribayashi
Sho Yokoi
Kentaro Inui
463
25
0
01 Feb 2023
Do Multi-Document Summarization Models Synthesize?
Transactions of the Association for Computational Linguistics (TACL), 2023
Jay DeYoung
Stephanie C. Martinez
Iain J. Marshall
Byron C. Wallace
287
14
0
31 Jan 2023
Truth Machines: Synthesizing Veracity in AI Language Models
AI & Society, 2023
Luke Munn
Liam Magee
Vanicka Arora
SyDa, HILM
96
50
0
28 Jan 2023
Tracr: Compiled Transformers as a Laboratory for Interpretability
Neural Information Processing Systems (NeurIPS), 2023
David Lindner
János Kramár
Sebastian Farquhar
Matthew Rahtz
Tom McGrath
Vladimir Mikulik
494
87
0
12 Jan 2023
Can Large Language Models Change User Preference Adversarially?
Varshini Subhash
AAML
180
9
0
05 Jan 2023
A Survey on Knowledge-Enhanced Pre-trained Language Models
Chaoqi Zhen
Yanlei Shang
Xiangyu Liu
Yifei Li
Yong Chen
Dell Zhang
VLM, KELM
238
3
0
27 Dec 2022
DialGuide: Aligning Dialogue Model Behavior with Developer Guidelines
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Prakhar Gupta
Yang Liu
Di Jin
Behnam Hedayatnia
Spandana Gella
Sijia Liu
P. Lange
Julia Hirschberg
Dilek Z. Hakkani-Tür
220
6
0
20 Dec 2022
DSI++: Updating Transformer Memory with New Documents
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Sanket Vaibhav Mehta
Jai Gupta
Yi Tay
Mostafa Dehghani
Vinh Q. Tran
J. Rao
Marc Najork
Emma Strubell
Donald Metzler
CLL
223
60
0
19 Dec 2022
Talking About Large Language Models
Communications of the ACM (CACM), 2022
Murray Shanahan
AI4CE
350
373
0
07 Dec 2022
Language Models as Agent Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Jacob Andreas
LLMAG
269
169
0
03 Dec 2022
Convexifying Transformers: Improving optimization and understanding of transformer networks
Tolga Ergen
Behnam Neyshabur
Harsh Mehta
MLT
226
15
0
20 Nov 2022
Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors
Neural Information Processing Systems (NeurIPS), 2022
Thomas Hartvigsen
S. Sankaranarayanan
Hamid Palangi
Yoon Kim
Marzyeh Ghassemi
KELM
639
234
0
20 Nov 2022
Diagnostics for Deep Neural Networks with Automated Copy/Paste Attacks
Stephen Casper
K. Hariharan
Dylan Hadfield-Menell
AAML
401
11
0
18 Nov 2022
DisentQA: Disentangling Parametric and Contextual Knowledge with Counterfactual Question Answering
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Ella Neeman
Roee Aharoni
Or Honovich
Leshem Choshen
Idan Szpektor
Omri Abend
KELM, CML
230
101
0
10 Nov 2022
On the Domain Adaptation and Generalization of Pretrained Language Models: A Survey
Xu Guo
Han Yu
LM&MA, VLM
307
34
0
06 Nov 2022
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small
International Conference on Learning Representations (ICLR), 2022
Kevin Wang
Alexandre Variengien
Arthur Conmy
Buck Shlegeris
Jacob Steinhardt
616
775
0
01 Nov 2022
Causal Analysis of Syntactic Agreement Neurons in Multilingual Language Models
Conference on Computational Natural Language Learning (CoNLL), 2022
Aaron Mueller
Yudi Xia
Tal Linzen
MILM
233
13
0
25 Oct 2022
A Causal Framework to Quantify the Robustness of Mathematical Reasoning with Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Alessandro Stolfo
Zhijing Jin
Kumar Shridhar
Bernhard Schölkopf
Mrinmaya Sachan
ELM, OOD, LRM
384
78
0
21 Oct 2022