Locating and Editing Factual Associations in GPT

10 February 2022
Kevin Meng
David Bau
A. Andonian
Yonatan Belinkov
    KELM
arXiv: 2202.05262

Papers citing "Locating and Editing Factual Associations in GPT"

50 / 924 papers shown
Title
SPES: Spectrogram Perturbation for Explainable Speech-to-Text Generation
Dennis Fucci
Marco Gaido
Beatrice Savoldi
Matteo Negri
Mauro Cettolo
L. Bentivogli
54
1
0
03 Nov 2024
Decoding Dark Matter: Specialized Sparse Autoencoders for Interpreting Rare Concepts in Foundation Models
Aashiq Muhamed
Mona Diab
Virginia Smith
38
2
0
01 Nov 2024
RESTOR: Knowledge Recovery through Machine Unlearning
Keivan Rezaei
Khyathi Raghavi Chandu
S. Feizi
Yejin Choi
Faeze Brahman
Abhilasha Ravichander
KELM
CLL
MU
58
0
0
31 Oct 2024
Commonsense Knowledge Editing Based on Free-Text in LLMs
Xiusheng Huang
Yequan Wang
Jun Zhao
Kang-Jun Liu
KELM
28
6
0
31 Oct 2024
Reasons and Solutions for the Decline in Model Performance after Editing
Xiusheng Huang
Jiaxiang Liu
Yequan Wang
Kang-Jun Liu
KELM
47
4
0
31 Oct 2024
Attention Speaks Volumes: Localizing and Mitigating Bias in Language Models
Rishabh Adiga
Besmira Nushi
Varun Chandrasekaran
49
0
0
29 Oct 2024
Learning and Unlearning of Fabricated Knowledge in Language Models
Chen Sun
Nolan Miller
A. Zhmoginov
Max Vladymyrov
Mark Sandler
KELM
MU
30
1
0
29 Oct 2024
Survey of User Interface Design and Interaction Techniques in Generative AI Applications
Reuben Luera
Ryan Rossi
Alexa F. Siu
Franck Dernoncourt
Tong Yu
...
Hanieh Salehy
Jian Zhao
Samyadeep Basu
Puneet Mathur
Nedim Lipka
AI4TS
60
1
0
28 Oct 2024
Causal Interventions on Causal Paths: Mapping GPT-2's Reasoning From Syntax to Semantics
Isabelle G. Lee
Joshua Lum
Ziyi Liu
Dani Yogatama
LRM
19
0
0
28 Oct 2024
NewTerm: Benchmarking Real-Time New Terms for Large Language Models with Annual Updates
Hexuan Deng
Wenxiang Jiao
Xuebo Liu
Min Zhang
Zhaopeng Tu
36
3
0
28 Oct 2024
Applying sparse autoencoders to unlearn knowledge in language models
Eoin Farrell
Yeu-Tong Lau
Arthur Conmy
MU
35
14
0
25 Oct 2024
DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucinations
Aryo Pradipta Gema
Chen Jin
Ahmed Abdulaal
Tom Diethe
Philip Teare
Beatrice Alex
Pasquale Minervini
Amrutha Saseendran
26
5
0
24 Oct 2024
Delving into the Reversal Curse: How Far Can Large Language Models Generalize?
Zhengkai Lin
Z. Fu
Kai Liu
Liang Xie
Binbin Lin
Wenxiao Wang
D. Cai
Yue Wu
Jieping Ye
LRM
25
3
0
24 Oct 2024
On Explaining with Attention Matrices
Omar Naim
Nicholas Asher
29
1
0
24 Oct 2024
Mixture of Parrots: Experts improve memorization more than reasoning
Samy Jelassi
Clara Mohri
David Brandfonbrener
Alex Gu
Nikhil Vyas
Nikhil Anand
David Alvarez-Melis
Yuanzhi Li
Sham Kakade
Eran Malach
MoE
30
4
0
24 Oct 2024
Backdoor in Seconds: Unlocking Vulnerabilities in Large Pre-trained Models via Model Editing
Dongliang Guo
Mengxuan Hu
Zihan Guan
Junfeng Guo
Thomas Hartvigsen
Sheng R. Li
AAML
26
0
0
23 Oct 2024
WAGLE: Strategic Weight Attribution for Effective and Modular Unlearning in Large Language Models
Jinghan Jia
Jiancheng Liu
Yihua Zhang
Parikshit Ram
Nathalie Baracaldo
Sijia Liu
MU
35
2
0
23 Oct 2024
DEAN: Deactivating the Coupled Neurons to Mitigate Fairness-Privacy Conflicts in Large Language Models
Chen Qian
Dongrui Liu
Jie Zhang
Yong Liu
Jing Shao
37
1
0
22 Oct 2024
LLMScan: Causal Scan for LLM Misbehavior Detection
Mengdi Zhang
Kai Kiat Goh
Peixin Zhang
Jun Sun
23
0
0
22 Oct 2024
A Psycholinguistic Evaluation of Language Models' Sensitivity to Argument Roles
Eun-Kyoung Rosa Lee
Sathvik Nair
Naomi Feldman
60
4
0
21 Oct 2024
Identifying Sub-networks in Neural Networks via Functionally Similar Representations
Tian Gao
Amit Dhurandhar
K. Ramamurthy
Dennis L. Wei
43
0
0
21 Oct 2024
Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
Yu Zhao
Alessio Devoto
Giwon Hong
Xiaotang Du
Aryo Pradipta Gema
Hongru Wang
Xuanli He
Kam-Fai Wong
Pasquale Minervini
KELM
LLMSV
34
16
0
21 Oct 2024
Catastrophic Failure of LLM Unlearning via Quantization
Zhiwei Zhang
Fali Wang
Xiaomin Li
Zongyu Wu
Xianfeng Tang
Hui Liu
Qi He
Wenpeng Yin
Suhang Wang
MU
31
5
0
21 Oct 2024
Towards Faithful Natural Language Explanations: A Study Using Activation Patching in Large Language Models
Wei Jie Yeo
Ranjan Satapathy
Erik Cambria
25
0
0
18 Oct 2024
Fact Recall, Heuristics or Pure Guesswork? Precise Interpretations of Language Models for Fact Completion
Denitsa Saynova
Lovisa Hagström
Moa Johansson
Richard Johansson
Marco Kuhlmann
HILM
39
0
0
18 Oct 2024
Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs
Tianyu Guo
Druv Pai
Yu Bai
Jiantao Jiao
Michael I. Jordan
Song Mei
29
9
0
17 Oct 2024
Looking Inward: Language Models Can Learn About Themselves by Introspection
Felix J Binder
James Chua
Tomek Korbak
Henry Sleight
John Hughes
Robert Long
Ethan Perez
Miles Turpin
Owain Evans
KELM
AIFin
LRM
35
12
0
17 Oct 2024
Seeing Through VisualBERT: A Causal Adventure on Memetic Landscapes
Dibyanayan Bandyopadhyay
Mohammed Hasanuzzaman
Asif Ekbal
AAML
29
0
0
17 Oct 2024
Breaking Chains: Unraveling the Links in Multi-Hop Knowledge Unlearning
Minseok Choi
C. Park
Dohyun Lee
Jaegul Choo
KELM
MU
29
1
0
17 Oct 2024
On the Role of Attention Heads in Large Language Model Safety
Z. Zhou
Haiyang Yu
Xinghua Zhang
Rongwu Xu
Fei Huang
Kun Wang
Yang Liu
Junfeng Fang
Yongbin Li
57
5
0
17 Oct 2024
The Geometry of Numerical Reasoning: Language Models Compare Numeric Properties in Linear Subspaces
Ahmed Oumar El-Shangiti
Tatsuya Hiraoka
Hilal AlQuabeh
Benjamin Heinzerling
Kentaro Inui
42
1
0
17 Oct 2024
AERO: Softmax-Only LLMs for Efficient Private Inference
N. Jha
Brandon Reagen
27
1
0
16 Oct 2024
Bridging the Language Gaps in Large Language Models with Inference-Time Cross-Lingual Intervention
Weixuan Wang
Minghao Wu
Barry Haddow
Alexandra Birch
LRM
24
2
0
16 Oct 2024
Neuron-based Personality Trait Induction in Large Language Models
Jia Deng
Tianyi Tang
Yanbin Yin
Wenhao Yang
Wayne Xin Zhao
Ji-Rong Wen
33
1
0
16 Oct 2024
SoK: Prompt Hacking of Large Language Models
Baha Rababah
Shang Wu
Matthew Kwiatkowski
Carson Leung
Cuneyt Gurcan Akcora
AAML
38
2
0
16 Oct 2024
Cross-Modal Safety Mechanism Transfer in Large Vision-Language Models
Shicheng Xu
Liang Pang
Yunchang Zhu
Huawei Shen
Xueqi Cheng
MLLM
36
1
0
16 Oct 2024
Reconstruction of Differentially Private Text Sanitization via Large Language Models
Shuchao Pang
Zhigang Lu
H. Wang
Peng Fu
Yongbin Zhou
Minhui Xue
AAML
53
4
0
16 Oct 2024
The Persian Rug: solving toy models of superposition using large-scale symmetries
Aditya Cowsik
Kfir Dolev
Alex Infanger
19
0
0
15 Oct 2024
O-Edit: Orthogonal Subspace Editing for Language Model Sequential Editing
Yuchen Cai
Ding Cao
KELM
21
2
0
15 Oct 2024
A Theoretical Survey on Foundation Models
Shi Fu
Yuzhu Chen
Yingjie Wang
Dacheng Tao
23
0
0
15 Oct 2024
ReDeEP: Detecting Hallucination in Retrieval-Augmented Generation via Mechanistic Interpretability
ZhongXiang Sun
Xiaoxue Zang
Kai Zheng
Yang Song
Jun Xu
Xiao Zhang
Weijie Yu
Yang Song
Han Li
55
7
0
15 Oct 2024
Advancing the Understanding of Fixed Point Iterations in Deep Neural Networks: A Detailed Analytical Study
Yekun Ke
Xiaoyu Li
Yingyu Liang
Zhenmei Shi
Zhao-quan Song
61
3
0
15 Oct 2024
Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations
Litu Rout
Yujia Chen
Nataniel Ruiz
C. Caramanis
Sanjay Shakkottai
Wen-Sheng Chu
DiffM
59
0
0
14 Oct 2024
Parenting: Optimizing Knowledge Selection of Retrieval-Augmented Language Models with Parameter Decoupling and Tailored Tuning
Yongxin Xu
Ruizhe Zhang
Xinke Jiang
Yujie Feng
Yuzhen Xiao
Xinyu Ma
Runchuan Zhu
Xu Chu
Junfeng Zhao
Yasha Wang
KELM
22
4
0
14 Oct 2024
Locking Down the Finetuned LLMs Safety
Minjun Zhu
Linyi Yang
Yifan Wei
Ningyu Zhang
Yue Zhang
34
8
0
14 Oct 2024
Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts
Guorui Zheng
Xidong Wang
Juhao Liang
Nuo Chen
Yuping Zheng
Benyou Wang
MoE
30
5
0
14 Oct 2024
Safety-Aware Fine-Tuning of Large Language Models
Hyeong Kyu Choi
Xuefeng Du
Yixuan Li
37
11
0
13 Oct 2024
ChroKnowledge: Unveiling Chronological Knowledge of Language Models in Multiple Domains
Yein Park
Chanwoong Yoon
Jungwoo Park
Donghyeon Lee
Minbyul Jeong
Jaewoo Kang
KELM
56
1
0
13 Oct 2024
Inference and Verbalization Functions During In-Context Learning
Junyi Tao
Xiaoyin Chen
Nelson F. Liu
ReLM
LRM
26
0
0
12 Oct 2024
Keys to Robust Edits: from Theoretical Insights to Practical Advances
Jianhao Yan
Futing Wang
Yun Luo
Yafu Li
Yue Zhang
KELM
26
0
0
12 Oct 2024