Locating and Editing Factual Associations in GPT
Neural Information Processing Systems (NeurIPS), 2022
10 February 2022
Kevin Meng
David Bau
A. Andonian
Yonatan Belinkov
    KELM

Papers citing "Locating and Editing Factual Associations in GPT"

50 / 1,363 papers shown
Identifying Sub-networks in Neural Networks via Functionally Similar Representations
Tian Gao
Amit Dhurandhar
Karthikeyan N. Ramamurthy
Dennis L. Wei
385
1
0
21 Oct 2024
Catastrophic Failure of LLM Unlearning via Quantization
International Conference on Learning Representations (ICLR), 2024
Zhiwei Zhang
Fali Wang
Xiaomin Li
Zongyu Wu
Xianfeng Tang
Hui Liu
Qi He
Wenpeng Yin
Suhang Wang
MU
334
5
0
21 Oct 2024
Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Yu Zhao
Alessio Devoto
Giwon Hong
Xiaotang Du
Aryo Pradipta Gema
Hongru Wang
Xuanli He
Kam-Fai Wong
Pasquale Minervini
KELM LLMSV
320
48
0
21 Oct 2024
Towards Faithful Natural Language Explanations: A Study Using Activation Patching in Large Language Models
Wei Jie Yeo
Ranjan Satapathy
Erik Cambria
324
2
0
18 Oct 2024
Fact Recall, Heuristics or Pure Guesswork? Precise Interpretations of Language Models for Fact Completion
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Denitsa Saynova
Lovisa Hagström
Moa Johansson
Richard Johansson
Marco Kuhlmann
HILM
622
2
0
18 Oct 2024
Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs
Tianyu Guo
Druv Pai
Yu Bai
Jiantao Jiao
Michael I. Jordan
Song Mei
309
25
0
17 Oct 2024
Looking Inward: Language Models Can Learn About Themselves by Introspection
Felix J Binder
James Chua
Tomek Korbak
Henry Sleight
John Hughes
Robert Long
Ethan Perez
Miles Turpin
Owain Evans
KELM AIFin LRM
256
41
0
17 Oct 2024
Seeing Through VisualBERT: A Causal Adventure on Memetic Landscapes
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Dibyanayan Bandyopadhyay
Mohammed Hasanuzzaman
Asif Ekbal
AAML
266
5
0
17 Oct 2024
Breaking Chains: Unraveling the Links in Multi-Hop Knowledge Unlearning
Minseok Choi
C. Park
Dohyun Lee
Jaegul Choo
KELM MU
170
4
0
17 Oct 2024
On the Role of Attention Heads in Large Language Model Safety
International Conference on Learning Representations (ICLR), 2024
Zhenhong Zhou
Haiyang Yu
Xinghua Zhang
Rongwu Xu
Fei Huang
Kun Wang
Yang Liu
Cunchun Li
Yongbin Li
507
39
0
17 Oct 2024
The Geometry of Numerical Reasoning: Language Models Compare Numeric Properties in Linear Subspaces
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Ahmed Oumar El-Shangiti
Tatsuya Hiraoka
Hilal AlQuabeh
Benjamin Heinzerling
Kentaro Inui
433
4
0
17 Oct 2024
Bridging the Language Gaps in Large Language Models with Inference-Time Cross-Lingual Intervention
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Weixuan Wang
Minghao Wu
Barry Haddow
Alexandra Birch
LRM
237
15
0
16 Oct 2024
Neuron-based Personality Trait Induction in Large Language Models
Jia Deng
Tianyi Tang
Yanbin Yin
Wenhao Yang
Wayne Xin Zhao
Ji-Rong Wen
252
4
0
16 Oct 2024
SoK: Prompt Hacking of Large Language Models
BigData Congress [Services Society] (BSS), 2024
Baha Rababah
Shang Wu
Matthew Kwiatkowski
Carson Leung
Cuneyt Gurcan Akcora
AAML
172
6
0
16 Oct 2024
Deep Model Merging: The Sister of Neural Network Interpretability -- A Survey
A. Khan
Todd Nief
Nathaniel Hudson
Mansi Sakarvadia
Daniel Grzenda
Aswathy Ajith
Jordan Pettyjohn
Kyle Chard
Ian Foster
MoMe
205
1
0
16 Oct 2024
Cross-Modal Safety Mechanism Transfer in Large Vision-Language Models
International Conference on Learning Representations (ICLR), 2024
Shicheng Xu
Liang Pang
Yunchang Zhu
Huawei Shen
Xueqi Cheng
MLLM
303
14
0
16 Oct 2024
Interpreting token compositionality in LLMs: A robustness analysis
Nura Aljaafari
Danilo S. Carvalho
André Freitas
440
4
0
16 Oct 2024
Reconstruction of Differentially Private Text Sanitization via Large Language Models
Shuchao Pang
Zhigang Lu
Jian Shu
Peng Fu
Yongbin Zhou
Minhui Xue
AAML
431
6
0
16 Oct 2024
AERO: Entropy-Guided Framework for Private LLM Inference
N. Jha
Brandon Reagen
493
5
0
16 Oct 2024
The Persian Rug: solving toy models of superposition using large-scale symmetries
Aditya Cowsik
Kfir Dolev
Alex Infanger
217
0
0
15 Oct 2024
O-Edit: Orthogonal Subspace Editing for Language Model Sequential Editing
Yuchen Cai
Ding Cao
KELM
228
5
0
15 Oct 2024
A Theoretical Survey on Foundation Models
Shi Fu
Yuzhu Chen
Yingjie Wang
Dacheng Tao
304
0
0
15 Oct 2024
ReDeEP: Detecting Hallucination in Retrieval-Augmented Generation via Mechanistic Interpretability
International Conference on Learning Representations (ICLR), 2024
Zhongxiang Sun
Xiaoxue Zang
Kai Zheng
Yang Song
Jun Xu
Xiao Zhang
Weijie Yu
Yang Song
Han Li
313
63
0
15 Oct 2024
Advancing the Understanding of Fixed Point Iterations in Deep Neural Networks: A Detailed Analytical Study
Yekun Ke
Xiaoyu Li
Yingyu Liang
Zhenmei Shi
Zhao Song
240
5
0
15 Oct 2024
Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations
International Conference on Learning Representations (ICLR), 2024
Litu Rout
Yujia Chen
Nataniel Ruiz
Constantine Caramanis
Sanjay Shakkottai
Wen-Sheng Chu
DiffM
215
0
0
14 Oct 2024
Locking Down the Finetuned LLMs Safety
Minjun Zhu
Linyi Yang
Yifan Wei
Ningyu Zhang
Yue Zhang
279
21
0
14 Oct 2024
Parenting: Optimizing Knowledge Selection of Retrieval-Augmented Language Models with Parameter Decoupling and Tailored Tuning
Yongxin Xu
Ruizhe Zhang
Xinke Jiang
Yujie Feng
Yuzhen Xiao
Xinyu Ma
Runchuan Zhu
Xu Chu
Junfeng Zhao
Yasha Wang
KELM
276
11
0
14 Oct 2024
Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts
International Conference on Learning Representations (ICLR), 2024
Guorui Zheng
Xidong Wang
Juhao Liang
Nuo Chen
Yuping Zheng
Benyou Wang
MoE
315
11
0
14 Oct 2024
Safety-Aware Fine-Tuning of Large Language Models
Hyeong Kyu Choi
Xuefeng Du
Yixuan Li
278
34
0
13 Oct 2024
ChroKnowledge: Unveiling Chronological Knowledge of Language Models in Multiple Domains
International Conference on Learning Representations (ICLR), 2024
Yein Park
Chanwoong Yoon
Jungwoo Park
Donghyeon Lee
Minbyul Jeong
Jaewoo Kang
KELM
503
3
0
13 Oct 2024
Inference and Verbalization Functions During In-Context Learning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Junyi Tao
Xiaoyin Chen
Nelson F. Liu
LRM ReLM
304
1
0
12 Oct 2024
Keys to Robust Edits: from Theoretical Insights to Practical Advances
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Jianhao Yan
Futing Wang
Yun Luo
Yafu Li
Yue Zhang
KELM
290
1
0
12 Oct 2024
CollabEdit: Towards Non-destructive Collaborative Knowledge Editing
International Conference on Learning Representations (ICLR), 2024
Jiamu Zheng
Jinghuai Zhang
Xuhong Zhang
Xuhong Zhang
Jianwei Yin
Tao Lin
KELM
540
0
0
12 Oct 2024
Understanding the Interplay between Parametric and Contextual Knowledge for Large Language Models
Sitao Cheng
Liangming Pan
Xunjian Yin
Xinyi Wang
William Yang Wang
KELM
242
10
0
10 Oct 2024
Mitigating Gender Bias in Code Large Language Models via Model Editing
Zhan Qin
Haochuan Wang
Zecheng Wang
Deyuan Liu
Cunhang Fan
Zhao Lv
Zhiying Tu
Dianhui Chu
Dianbo Sui
KELM
200
3
0
10 Oct 2024
Unlearning-based Neural Interpretations
International Conference on Learning Representations (ICLR), 2024
Ching Lam Choi
Alexandre Duplessis
Serge Belongie
FAtt
599
0
0
10 Oct 2024
The Geometry of Concepts: Sparse Autoencoder Feature Structure
Yuxiao Li
Eric J. Michaud
David D. Baek
Joshua Engels
Xiaoqing Sun
Max Tegmark
337
35
0
10 Oct 2024
Uncovering Overfitting in Large Language Model Editing
International Conference on Learning Representations (ICLR), 2024
Mengqi Zhang
Xiaotian Ye
Qiang Liu
Sudipta Singha Roy
Shu Wu
Zhumin Chen
KELM
299
26
0
10 Oct 2024
Mitigating the Language Mismatch and Repetition Issues in LLM-based Machine Translation via Model Editing
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Weichuan Wang
Zhaoyi Li
Defu Lian
Chen Ma
Linqi Song
Ying Wei
225
16
0
09 Oct 2024
Towards Universality: Studying Mechanistic Similarity Across Language Model Architectures
International Conference on Learning Representations (ICLR), 2024
Junxuan Wang
Xuyang Ge
Wentao Shu
Qiong Tang
Yunhua Zhou
Zhengfu He
Xipeng Qiu
252
17
0
09 Oct 2024
Dissecting Fine-Tuning Unlearning in Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Yihuai Hong
Yuelin Zou
Lijie Hu
Huiping Zhuang
Di Wang
Shauli Ravfogel
AAML MU
241
11
0
09 Oct 2024
On the Similarity of Circuits across Languages: a Case Study on the Subject-verb Agreement Task
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Javier Ferrando
Marta R. Costa-jussá
167
17
0
09 Oct 2024
Towards Interpreting Visual Information Processing in Vision-Language Models
International Conference on Learning Representations (ICLR), 2024
Philip Quirke
Luke Ong
Juil Sock
Mor Geva
David M. Krueger
Fazl Barez
545
49
0
09 Oct 2024
Jet Expansions of Residual Computation
Yihong Chen
Xiangxiang Xu
Yao Lu
Pontus Stenetorp
Luca Franceschi
194
4
0
08 Oct 2024
Probing Language Models on Their Knowledge Source
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackBoxNLP), 2024
Zineddine Tighidet
Andrea Mogini
Jiali Mei
Benjamin Piwowarski
Patrick Gallinari
KELM
210
9
0
08 Oct 2024
From Tokens to Words: On the Inner Lexicon of LLMs
International Conference on Learning Representations (ICLR), 2024
Guy Kaplan
Matanel Oren
Yuval Reif
Roy Schwartz
460
30
0
08 Oct 2024
Locate-then-edit for Multi-hop Factual Recall under Knowledge Editing
Zhuoran Zhang
Yongqian Li
Zijian Kan
Keyuan Cheng
Lijie Hu
Di Wang
KELM
432
26
0
08 Oct 2024
Attribute Controlled Fine-tuning for Large Language Models: A Case Study on Detoxification
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Tao Meng
Ninareh Mehrabi
Palash Goyal
Anil Ramakrishna
Aram Galstyan
Richard Zemel
Kai-Wei Chang
Rahul Gupta
Charith Peris
136
5
0
07 Oct 2024
Deciphering the Interplay of Parametric and Non-parametric Memory in Retrieval-augmented Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
M. Farahani
Richard Johansson
RALM
223
6
0
07 Oct 2024
Mechanistic?
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackBoxNLP), 2024
Naomi Saphra
Sarah Wiegreffe
AI4CE
263
35
0
07 Oct 2024