Locating and Editing Factual Associations in GPT


10 February 2022
Kevin Meng
David Bau
A. Andonian
Yonatan Belinkov
    KELM

Papers citing "Locating and Editing Factual Associations in GPT"

50 / 924 papers shown
Rethinking Circuit Completeness in Language Models: AND, OR, and ADDER Gates
Hang Chen
Jiaying Zhu
Xinyu Yang
Wenya Wang
LRM
7
0
0
15 May 2025
Llama See, Llama Do: A Mechanistic Perspective on Contextual Entrainment and Distraction in LLMs
Jingcheng Niu
Xingdi Yuan
Tong Wang
Hamidreza Saghir
Amir H. Abdi
22
0
0
14 May 2025
Lost in Transmission: When and Why LLMs Fail to Reason Globally
Tobias Schnabel
Kiran Tomlinson
Adith Swaminathan
Jennifer Neville
LRM
25
0
0
13 May 2025
Are We Paying Attention to Her? Investigating Gender Disambiguation and Attention in Machine Translation
Chiara Manna
Afra Alishahi
Frédéric Blain
Eva Vanmassenhove
22
0
0
13 May 2025
DeltaEdit: Enhancing Sequential Editing in Large Language Models by Controlling Superimposed Noise
Ding Cao
Yuchen Cai
Rongxi Guo
X. He
Guiquan Liu
KELM
42
0
0
12 May 2025
Short-circuiting Shortcuts: Mechanistic Investigation of Shortcuts in Text Classification
Leon Eshuijs
Shihan Wang
Antske Fokkens
26
0
0
09 May 2025
Understanding In-context Learning of Addition via Activation Subspaces
Xinyan Hu
Kayo Yin
Michael I. Jordan
Jacob Steinhardt
Lijie Chen
51
0
0
08 May 2025
Scalable Multi-Stage Influence Function for Large Language Models via Eigenvalue-Corrected Kronecker-Factored Parameterization
Yuntai Bao
Xuhong Zhang
Tianyu Du
Xinkui Zhao
Jiang Zong
Hao Peng
Jianwei Yin
TDI
45
0
0
08 May 2025
OBLIVIATE: Robust and Practical Machine Unlearning for Large Language Models
Xiaoyu Xu
Minxin Du
Qingqing Ye
Haibo Hu
MU
52
0
0
07 May 2025
Red Teaming the Mind of the Machine: A Systematic Evaluation of Prompt Injection and Jailbreak Vulnerabilities in LLMs
Chetan Pathade
AAML
SILM
54
0
0
07 May 2025
Causal Intervention Framework for Variational Auto Encoder Mechanistic Interpretability
Dip Roy
CML
17
0
0
06 May 2025
Interpreting Multilingual and Document-Length Sensitive Relevance Computations in Neural Retrieval Models through Axiomatic Causal Interventions
Oliver Savolainen
Dur e Najaf Amjad
Roxana Petcu
AAML
23
0
0
04 May 2025
Unlearning Sensitive Information in Multimodal LLMs: Benchmark and Attack-Defense Evaluation
Vaidehi Patil
Yi-Lin Sung
Peter Hase
Jie Peng
Tianlong Chen
Mohit Bansal
AAML
MU
81
3
0
01 May 2025
Rethinking Memory in AI: Taxonomy, Operations, Topics, and Future Directions
Yiming Du
Wenyu Huang
Danna Zheng
Zhaowei Wang
Sébastien Montella
Mirella Lapata
Kam-Fai Wong
Jeff Z. Pan
KELM
MU
78
2
0
01 May 2025
Memorization and Knowledge Injection in Gated LLMs
Xu Pan
Ely Hahami
Zechen Zhang
H. Sompolinsky
KELM
CLL
RALM
104
0
0
30 Apr 2025
Towards Understanding the Nature of Attention with Low-Rank Sparse Decomposition
Zhengfu He
J. Wang
Rui Lin
Xuyang Ge
Wentao Shu
Qiong Tang
J. Zhang
Xipeng Qiu
70
0
0
29 Apr 2025
SetKE: Knowledge Editing for Knowledge Elements Overlap
Yifan Wei
Xiaoyan Yu
Ran Song
Hao Peng
Angsheng Li
KELM
53
0
0
29 Apr 2025
Exploring How LLMs Capture and Represent Domain-Specific Knowledge
Mirian Hipolito Garcia
Camille Couturier
Daniel Madrigal Diaz
Ankur Mallick
Anastasios Kyrillidis
Robert Sim
Victor Rühle
Saravan Rajmohan
30
0
0
23 Apr 2025
Exploiting Contextual Knowledge in LLMs through V-usable Information based Layer Enhancement
Xiaowei Yuan
Zhao Yang
Ziyang Huang
Y. Wang
Siqi Fan
Yiming Ju
Jun Zhao
Kang-Jun Liu
27
0
0
22 Apr 2025
Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism
Aviv Bick
Eric P. Xing
Albert Gu
RALM
86
0
0
22 Apr 2025
Bigram Subnetworks: Mapping to Next Tokens in Transformer Language Models
Tyler A. Chang
Benjamin Bergen
48
0
0
21 Apr 2025
Functional Abstraction of Knowledge Recall in Large Language Models
Zijian Wang
Chang Xu
KELM
32
0
0
20 Apr 2025
Scaling Instruction-Tuned LLMs to Million-Token Contexts via Hierarchical Synthetic Data Generation
Linda He
Jue Wang
Maurice Weber
Shang Zhu
Ben Athiwaratkun
Ce Zhang
SyDa
LRM
42
0
0
17 Apr 2025
GRAIL: Gradient-Based Adaptive Unlearning for Privacy and Copyright in LLMs
Kun-Woo Kim
Ji-Hoon Park
Ju-Min Han
Seong-Whan Lee
MU
PILM
62
0
0
17 Apr 2025
SHA256 at SemEval-2025 Task 4: Selective Amnesia -- Constrained Unlearning for Large Language Models via Knowledge Isolation
Saransh Agrawal
Kuan-Hao Huang
MU
KELM
54
0
0
17 Apr 2025
MIB: A Mechanistic Interpretability Benchmark
Aaron Mueller
Atticus Geiger
Sarah Wiegreffe
Dana Arad
Iván Arcuschin
...
Alessandro Stolfo
Martin Tutek
Amir Zur
David Bau
Yonatan Belinkov
41
1
0
17 Apr 2025
Localized Cultural Knowledge is Conserved and Controllable in Large Language Models
V. Veselovsky
Berke Argin
Benedikt Stroebl
Chris Wendler
Robert West
James Evans
Thomas L. Griffiths
Arvind Narayanan
53
0
0
14 Apr 2025
Weight-of-Thought Reasoning: Exploring Neural Network Weights for Enhanced LLM Reasoning
Saif Punjwani
Larry Heck
LRM
27
0
0
14 Apr 2025
Towards Quantifying Commonsense Reasoning with Mechanistic Insights
Abhinav Joshi
A. Ahmad
Divyaksh Shukla
Ashutosh Modi
ReLM
LRM
34
0
0
14 Apr 2025
Can We Edit LLMs for Long-Tail Biomedical Knowledge?
Xinhao Yi
Jake Lever
Kevin Bryson
Zaiqiao Meng
KELM
22
0
0
14 Apr 2025
Task-Circuit Quantization: Leveraging Knowledge Localization and Interpretability for Compression
Hanqi Xiao
Yi-Lin Sung
Elias Stengel-Eskin
Mohit Bansal
MQ
33
0
0
10 Apr 2025
How do Large Language Models Understand Relevance? A Mechanistic Interpretability Perspective
Qi Liu
Jiaxin Mao
Ji-Rong Wen
LRM
29
0
0
10 Apr 2025
Model Utility Law: Evaluating LLMs beyond Performance through Mechanism Interpretable Metric
Yixin Cao
Jiahao Ying
Y. Wang
Xipeng Qiu
Xuanjing Huang
Yugang Jiang
ELM
30
2
0
10 Apr 2025
On the Effectiveness and Generalization of Race Representations for Debiasing High-Stakes Decisions
Dang Nguyen
Chenhao Tan
32
0
0
07 Apr 2025
Steering off Course: Reliability Challenges in Steering Language Models
Patrick Queiroz Da Silva
Hari Sethuraman
Dheeraj Rajagopal
Hannaneh Hajishirzi
Sachin Kumar
LLMSV
29
1
0
06 Apr 2025
Lost in Multilinguality: Dissecting Cross-lingual Factual Inconsistency in Transformer Language Models
Mingyang Wang
Heike Adel
Lukas Lange
Yihong Liu
Ercong Nie
Jannik Strötgen
Hinrich Schütze
HILM
56
0
0
05 Apr 2025
STEP: Staged Parameter-Efficient Pre-training for Large Language Models
Kazuki Yano
Takumi Ito
Jun Suzuki
LRM
47
1
0
05 Apr 2025
Noiser: Bounded Input Perturbations for Attributing Large Language Models
Mohammad Reza Ghasemi Madani
Aryo Pradipta Gema
Gabriele Sarti
Yu Zhao
Pasquale Minervini
Andrea Passerini
AAML
30
0
0
03 Apr 2025
Page Classification for Print Imaging Pipeline
Shaoyuan Xu
Cheng Lu
Mark Shaw
Peter Bauer
J. Allebach
VLM
38
0
0
03 Apr 2025
How Post-Training Reshapes LLMs: A Mechanistic View on Knowledge, Truthfulness, Refusal, and Confidence
Hongzhe Du
Weikai Li
Min Cai
Karim Saraipour
Zimin Zhang
Himabindu Lakkaraju
Yizhou Sun
Shichang Zhang
KELM
51
0
0
03 Apr 2025
InfiniteICL: Breaking the Limit of Context Window Size via Long Short-term Memory Transformation
Bowen Cao
Deng Cai
W. Lam
CLL
46
0
0
02 Apr 2025
Is the Reversal Curse a Binding Problem? Uncovering Limitations of Transformers from a Basic Generalization Failure
Boshi Wang
Huan Sun
34
2
0
02 Apr 2025
Forward Learning with Differential Privacy
Mingqian Feng
Zeliang Zhang
Jinyang Jiang
Yijie Peng
Chenliang Xu
39
0
0
01 Apr 2025
Towards Understanding How Knowledge Evolves in Large Vision-Language Models
Sudong Wang
Y. Zhang
Yao Zhu
Jianing Li
Zizhe Wang
Y. Liu
Xiangyang Ji
105
0
0
31 Mar 2025
Contextualize-then-Aggregate: Circuits for In-Context Learning in Gemma-2 2B
Aleksandra Bakalova
Yana Veitsman
Xinting Huang
Michael Hahn
31
0
0
31 Mar 2025
Leaking LoRa: An Evaluation of Password Leaks and Knowledge Storage in Large Language Models
Ryan Marinelli
Magnus Eckhoff
PILM
52
0
0
29 Mar 2025
Effective Skill Unlearning through Intervention and Abstention
Yongce Li
Chung-En Sun
Tsui-Wei Weng
MU
125
0
0
27 Mar 2025
How do language models learn facts? Dynamics, curricula and hallucinations
Nicolas Zucchet
J. Bornschein
Stephanie C. Y. Chan
Andrew Kyle Lampinen
Razvan Pascanu
Soham De
KELM
HILM
LRM
77
2
1
27 Mar 2025
ADS-Edit: A Multimodal Knowledge Editing Dataset for Autonomous Driving Systems
Chenxi Wang
Jizhan Fang
Xiang Chen
Bozhong Tian
Ziwen Xu
H. Chen
N. Zhang
KELM
92
0
0
26 Mar 2025
Interpretable Generative Models through Post-hoc Concept Bottlenecks
Akshay Kulkarni
Ge Yan
Chung-En Sun
Tuomas P. Oikarinen
Tsui-Wei Weng
39
0
0
25 Mar 2025