ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2202.05262
  4. Cited By
Locating and Editing Factual Associations in GPT

Locating and Editing Factual Associations in GPT

10 February 2022
Kevin Meng
David Bau
A. Andonian
Yonatan Belinkov
    KELM
ArXivPDFHTML

Papers citing "Locating and Editing Factual Associations in GPT"

50 / 924 papers shown
Title
On the Robustness of Transformers against Context Hijacking for Linear Classification
On the Robustness of Transformers against Context Hijacking for Linear Classification
Tianle Li
Chenyang Zhang
Xingwu Chen
Yuan Cao
Difan Zou
67
0
0
24 Feb 2025
PIP-KAG: Mitigating Knowledge Conflicts in Knowledge-Augmented Generation via Parametric Pruning
PIP-KAG: Mitigating Knowledge Conflicts in Knowledge-Augmented Generation via Parametric Pruning
Pengcheng Huang
Zhenghao Liu
Yukun Yan
Xiaoyuan Yi
Hao Chen
Zhiyuan Liu
Maosong Sun
Tong Xiao
Ge Yu
Chenyan Xiong
96
1
0
24 Feb 2025
Do Multilingual LLMs Think In English?
Do Multilingual LLMs Think In English?
Lisa Schut
Y. Gal
Sebastian Farquhar
42
3
0
24 Feb 2025
Investigating the Adaptive Robustness with Knowledge Conflicts in LLM-based Multi-Agent Systems
Investigating the Adaptive Robustness with Knowledge Conflicts in LLM-based Multi-Agent Systems
Tianjie Ju
B. Wang
Hao Fei
M. Lee
W. Hsu
...
Qianren Wang
Pengzhou Cheng
Zongru Wu
Zhuosheng Zhang
Gongshen Liu
AAML
36
0
0
24 Feb 2025
Memory Helps, but Confabulation Misleads: Understanding Streaming Events in Videos with MLLMs
Memory Helps, but Confabulation Misleads: Understanding Streaming Events in Videos with MLLMs
Gengyuan Zhang
Mingcong Ding
Tong Liu
Yao Zhang
Volker Tresp
74
1
0
24 Feb 2025
CODESYNC: Synchronizing Large Language Models with Dynamic Code Evolution at Scale
CODESYNC: Synchronizing Large Language Models with Dynamic Code Evolution at Scale
Chenlong Wang
Zhaoyang Chu
Zhengxiang Cheng
Xuyi Yang
Kaiyue Qiu
Yao Wan
Zhou Zhao
Xuanhua Shi
D. Z. Chen
ALM
SyDa
41
0
0
23 Feb 2025
From Text to Space: Mapping Abstract Spatial Models in LLMs during a Grid-World Navigation Task
From Text to Space: Mapping Abstract Spatial Models in LLMs during a Grid-World Navigation Task
Nicolas Martorell
LLMAG
52
1
0
23 Feb 2025
Interrogating LLM design under a fair learning doctrine
Interrogating LLM design under a fair learning doctrine
Johnny Tian-Zheng Wei
Maggie Wang
Ameya Godbole
Jonathan H. Choi
Robin Jia
27
0
0
22 Feb 2025
Activation Steering in Neural Theorem Provers
Activation Steering in Neural Theorem Provers
Shashank Kirtania
LLMSV
146
0
0
21 Feb 2025
Revealing and Mitigating Over-Attention in Knowledge Editing
Revealing and Mitigating Over-Attention in Knowledge Editing
Pinzheng Wang
Zecheng Tang
Keyan Zhou
J. Li
Qiaoming Zhu
M. Zhang
KELM
115
2
0
21 Feb 2025
A Close Look at Decomposition-based XAI-Methods for Transformer Language Models
A Close Look at Decomposition-based XAI-Methods for Transformer Language Models
L. Arras
Bruno Puri
Patrick Kahardipraja
Sebastian Lapuschkin
Wojciech Samek
41
0
0
21 Feb 2025
BMIKE-53: Investigating Cross-Lingual Knowledge Editing with In-Context Learning
BMIKE-53: Investigating Cross-Lingual Knowledge Editing with In-Context Learning
Ercong Nie
Bo Shao
Zifeng Ding
Mingyang Wang
Helmut Schmid
Hinrich Schütze
KELM
129
7
0
21 Feb 2025
Soft Token Attacks Cannot Reliably Audit Unlearning in Large Language Models
Soft Token Attacks Cannot Reliably Audit Unlearning in Large Language Models
Haokun Chen
Sebastian Szyller
Weilin Xu
N. Himayat
MU
AAML
41
0
0
20 Feb 2025
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers
Anton Razzhigaev
Matvey Mikhalchuk
Temurbek Rahmatullaev
Elizaveta Goncharova
Polina Druzhinina
Ivan V. Oseledets
Andrey Kuznetsov
59
2
0
20 Feb 2025
UPCORE: Utility-Preserving Coreset Selection for Balanced Unlearning
UPCORE: Utility-Preserving Coreset Selection for Balanced Unlearning
Vaidehi Patil
Elias Stengel-Eskin
Mohit Bansal
MU
CLL
73
2
0
20 Feb 2025
CoME: An Unlearning-based Approach to Conflict-free Model Editing
CoME: An Unlearning-based Approach to Conflict-free Model Editing
Dahyun Jung
Jaehyung Seo
Jaewook Lee
Chanjun Park
Heuiseok Lim
MU
KELM
47
0
0
20 Feb 2025
MLaKE: Multilingual Knowledge Editing Benchmark for Large Language Models
MLaKE: Multilingual Knowledge Editing Benchmark for Large Language Models
Zihao Wei
Jingcheng Deng
Liang Pang
Hanxing Ding
Huawei Shen
Xueqi Cheng
KELM
81
4
0
20 Feb 2025
Mechanistic Understanding of Language Models in Syntactic Code Completion
Mechanistic Understanding of Language Models in Syntactic Code Completion
Samuel Miller
Daking Rai
Ziyu Yao
LRM
44
0
0
20 Feb 2025
Elucidating Mechanisms of Demographic Bias in LLMs for Healthcare
Elucidating Mechanisms of Demographic Bias in LLMs for Healthcare
Hiba Ahsan
Arnab Sen Sharma
Silvio Amir
David Bau
Byron C. Wallace
83
0
0
20 Feb 2025
The Knowledge Microscope: Features as Better Analytical Lenses than Neurons
The Knowledge Microscope: Features as Better Analytical Lenses than Neurons
Yuheng Chen
Pengfei Cao
Kang Liu
Jun Zhao
43
0
0
18 Feb 2025
Precise Parameter Localization for Textual Generation in Diffusion Models
Precise Parameter Localization for Textual Generation in Diffusion Models
Łukasz Staniszewski
Bartosz Cywiñski
Franziska Boenisch
Kamil Deja
Adam Dziedzic
DiffM
146
0
0
17 Feb 2025
Building Bridges, Not Walls -- Advancing Interpretability by Unifying Feature, Data, and Model Component Attribution
Building Bridges, Not Walls -- Advancing Interpretability by Unifying Feature, Data, and Model Component Attribution
Shichang Zhang
Tessa Han
Usha Bhalla
Hima Lakkaraju
FAtt
147
0
0
17 Feb 2025
Hyperspherical Energy Transformer with Recurrent Depth
Yunzhe Hu
Difan Zou
Dong Xu
41
0
0
17 Feb 2025
Exploring Translation Mechanism of Large Language Models
Exploring Translation Mechanism of Large Language Models
Hongbin Zhang
Kehai Chen
Xuefeng Bai
Xiucheng Li
Yang Xiang
Min Zhang
59
1
0
17 Feb 2025
Mechanistic Unveiling of Transformer Circuits: Self-Influence as a Key to Model Reasoning
Mechanistic Unveiling of Transformer Circuits: Self-Influence as a Key to Model Reasoning
L. Zhang
Lijie Hu
Di Wang
LRM
90
0
0
17 Feb 2025
DELMAN: Dynamic Defense Against Large Language Model Jailbreaking with Model Editing
DELMAN: Dynamic Defense Against Large Language Model Jailbreaking with Model Editing
Yi Wang
Fenghua Weng
S. Yang
Zhan Qin
Minlie Huang
Wenjie Wang
KELM
AAML
48
0
0
17 Feb 2025
Sparse Autoencoder Features for Classifications and Transferability
Sparse Autoencoder Features for Classifications and Transferability
Jack Gallifant
Shan Chen
Kuleen Sasse
Hugo J. W. L. Aerts
Thomas Hartvigsen
Danielle S. Bitterman
41
3
0
17 Feb 2025
Towards Understanding Fine-Tuning Mechanisms of LLMs via Circuit Analysis
Towards Understanding Fine-Tuning Mechanisms of LLMs via Circuit Analysis
X. Wang
Yan Hu
Wenyu Du
Reynold Cheng
Benyou Wang
Difan Zou
54
0
0
17 Feb 2025
Smoothing Out Hallucinations: Mitigating LLM Hallucination with Smoothed Knowledge Distillation
Smoothing Out Hallucinations: Mitigating LLM Hallucination with Smoothed Knowledge Distillation
Hieu Nguyen
Zihao He
Shoumik Atul Gandre
Ujjwal Pasupulety
Sharanya Kumari Shivakumar
Kristina Lerman
HILM
54
1
0
16 Feb 2025
How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training
How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training
Yixin Ou
Yunzhi Yao
N. Zhang
Hui Jin
Jiacheng Sun
Shumin Deng
Z. Li
H. Chen
KELM
CLL
49
0
0
16 Feb 2025
Unlocking the Power of Function Vectors for Characterizing and Mitigating Catastrophic Forgetting in Continual Instruction Tuning
Unlocking the Power of Function Vectors for Characterizing and Mitigating Catastrophic Forgetting in Continual Instruction Tuning
Gangwei Jiang
Caigao Jiang
Zhaoyi Li
Siqiao Xue
Jun-ping Zhou
Linqi Song
Defu Lian
Yin Wei
CLL
MU
56
0
0
16 Feb 2025
Soteria: Language-Specific Functional Parameter Steering for Multilingual Safety Alignment
Soteria: Language-Specific Functional Parameter Steering for Multilingual Safety Alignment
Somnath Banerjee
Sayan Layek
Pratyush Chatterjee
Animesh Mukherjee
Rima Hazra
LLMSV
71
0
0
16 Feb 2025
Superpose Singular Features for Model Merging
Superpose Singular Features for Model Merging
Haiquan Qiu
You Wu
Quanming Yao
MoMe
43
0
0
15 Feb 2025
MEMIT-Merge: Addressing MEMIT's Key-Value Conflicts in Same-Subject Batch Editing for LLMs
MEMIT-Merge: Addressing MEMIT's Key-Value Conflicts in Same-Subject Batch Editing for LLMs
Zilu Dong
Xiangqing Shen
Rui Xia
KELM
73
1
0
11 Feb 2025
LUNAR: LLM Unlearning via Neural Activation Redirection
LUNAR: LLM Unlearning via Neural Activation Redirection
William F. Shen
Xinchi Qiu
Meghdad Kurmanji
Alex Iacob
Lorenzo Sani
Yihong Chen
Nicola Cancedda
Nicholas D. Lane
MU
51
1
0
11 Feb 2025
Sparse Autoencoders for Scientifically Rigorous Interpretation of Vision Models
Sparse Autoencoders for Scientifically Rigorous Interpretation of Vision Models
Samuel Stevens
Wei-Lun Chao
T. Berger-Wolf
Yu-Chuan Su
VLM
72
2
0
10 Feb 2025
Reinforced Lifelong Editing for Language Models
Reinforced Lifelong Editing for Language Models
Zherui Li
Houcheng Jiang
Hao Chen
Baolong Bi
Z. Zhou
Fei Sun
Junfeng Fang
X. Wang
KELM
51
5
0
09 Feb 2025
AnyEdit: Edit Any Knowledge Encoded in Language Models
AnyEdit: Edit Any Knowledge Encoded in Language Models
Houcheng Jiang
Junfeng Fang
Ningyu Zhang
Guojun Ma
Mingyang Wan
X. Wang
Xiangnan He
Tat-Seng Chua
KELM
49
8
0
08 Feb 2025
Mechanistic Interpretability of Emotion Inference in Large Language Models
Mechanistic Interpretability of Emotion Inference in Large Language Models
Ala Nekouvaght Tak
Amin Banayeeanzade
Anahita Bolourani
Mina Kian
Robin Jia
Jonathan Gratch
49
0
0
08 Feb 2025
Learning Task Representations from In-Context Learning
Learning Task Representations from In-Context Learning
Baturay Saglam
Zhuoran Yang
Dionysis Kalogerias
Amin Karbasi
55
0
0
08 Feb 2025
IAO Prompting: Making Knowledge Flow Explicit in LLMs through Structured Reasoning Templates
IAO Prompting: Making Knowledge Flow Explicit in LLMs through Structured Reasoning Templates
Aissatou Diallo
Antonis Bikakis
Luke Dickens
Anthony Hunter
Rob Miller
LRM
34
0
0
05 Feb 2025
What is a Number, That a Large Language Model May Know It?
What is a Number, That a Large Language Model May Know It?
Raja Marjieh
Veniamin Veselovsky
Thomas L. Griffiths
Ilia Sucholutsky
149
2
0
03 Feb 2025
Lifelong Sequential Knowledge Editing without Model Degradation
Lifelong Sequential Knowledge Editing without Model Degradation
Akshat Gupta
Phudish Prateepamornkul
Maochuan Lu
Ahmed Alaa
Thomas Hartvigsen
Gopala Anumanchipalli
KELM
70
1
0
03 Feb 2025
On The Truthfulness of 'Surprisingly Likely' Responses of Large Language Models
On The Truthfulness of 'Surprisingly Likely' Responses of Large Language Models
Naman Goel
HILM
57
0
0
28 Jan 2025
An Attempt to Unraveling Token Prediction Refinement and Identifying Essential Layers of Large Language Models
Jaturong Kongmanee
34
1
0
28 Jan 2025
Risk-Aware Distributional Intervention Policies for Language Models
Bao Nguyen
Binh Nguyen
Duy Nguyen
V. Nguyen
30
1
0
28 Jan 2025
Evolutionary Optimization of Model Merging Recipes
Evolutionary Optimization of Model Merging Recipes
Takuya Akiba
Makoto Shing
Yujin Tang
Qi Sun
David Ha
MoMe
105
99
0
28 Jan 2025
Understanding and Mitigating Gender Bias in LLMs via Interpretable Neuron Editing
Understanding and Mitigating Gender Bias in LLMs via Interpretable Neuron Editing
Zeping Yu
Sophia Ananiadou
KELM
43
1
0
24 Jan 2025
LLMs as Repositories of Factual Knowledge: Limitations and Solutions
Seyed Mahed Mousavi
Simone Alghisi
Giuseppe Riccardi
KELM
47
0
0
22 Jan 2025
Episodic Memories Generation and Evaluation Benchmark for Large Language Models
Episodic Memories Generation and Evaluation Benchmark for Large Language Models
Alexis Huet
Zied Ben-Houidi
Dario Rossi
LLMAG
54
0
0
21 Jan 2025
Previous
123456...171819
Next