Locating and Editing Factual Associations in GPT

Neural Information Processing Systems (NeurIPS), 2022
10 February 2022
Kevin Meng
David Bau
A. Andonian
Yonatan Belinkov
    KELM

Papers citing "Locating and Editing Factual Associations in GPT"

50 / 1,361 papers shown
Evidence of Learned Look-Ahead in a Chess-Playing Neural Network
Erik Jenner
Shreyas Kapur
Vasil Georgiev
Cameron Allen
Scott Emmons
Stuart J. Russell
344
20
0
02 Jun 2024
DAFNet: Dynamic Auxiliary Fusion for Sequential Model Editing in Large Language Models
Taolin Zhang
Qizhou Chen
Dongyang Li
Chengyu Wang
Xiaofeng He
Longtao Huang
Hui Xue
Junyuan Huang
CLL KELM
247
8
0
31 May 2024
Mind the Inconspicuous: Revealing the Hidden Weakness in Aligned LLMs' Refusal Boundaries
Jiahao Yu
Haozheng Luo
Jerry Yao-Chieh Hu
Wenbo Guo
Han Liu
Xinyu Xing
329
21
0
31 May 2024
Contextual Counting: A Mechanistic Study of Transformers on a Quantitative Task
Siavash Golkar
Alberto Bietti
Mariel Pettee
Michael Eickenberg
M. Cranmer
...
Ruben Ohana
Liam Parker
Bruno Régaldo-Saint Blancard
Kyunghyun Cho
Shirley Ho
186
5
0
30 May 2024
TAIA: Large Language Models are Out-of-Distribution Data Learners
Shuyang Jiang
Yusheng Liao
Ya Zhang
Yu Wang
Yanfeng Wang
229
7
0
30 May 2024
Knowledge Graph Tuning: Real-time Large Language Model Personalization based on Human Feedback
Jingwei Sun
Zhixu Du
Yiran Chen
KELM
256
4
0
30 May 2024
MEMoE: Enhancing Model Editing with Mixture of Experts Adaptors
Renzhi Wang
Piji Li
KELM
285
7
0
29 May 2024
Evaluating the External and Parametric Knowledge Fusion of Large Language Models
Hao Zhang
Yuyang Zhang
Xiaoguang Li
Wenxuan Shi
Haonan Xu
...
Yasheng Wang
Lifeng Shang
Qun Liu
Yong Liu
Ruiming Tang
KELM
246
7
0
29 May 2024
Semantic are Beacons: A Semantic Perspective for Unveiling Parameter-Efficient Fine-Tuning in Knowledge Learning
Renzhi Wang
Piji Li
184
5
0
28 May 2024
Knowledge Circuits in Pretrained Transformers
Yunzhi Yao
Ningyu Zhang
Zekun Xi
Meng Wang
Ziwen Xu
Shumin Deng
Huajun Chen
KELM
438
43
0
28 May 2024
Improved Generation of Adversarial Examples Against Safety-aligned LLMs
Qizhang Li
Yiwen Guo
Wangmeng Zuo
Hao Chen
AAML SILM
243
12
0
28 May 2024
InversionView: A General-Purpose Method for Reading Information from Neural Activations
Xinting Huang
Madhur Panwar
Navin Goyal
Michael Hahn
359
9
0
27 May 2024
Balancing User Preferences by Social Networks: A Condition-Guided Social Recommendation Model for Mitigating Popularity Bias
Xingbo He
Wenqi Fan
Ruobing Wang
Yili Wang
Ying Wang
Shirui Pan
Xin Wang
CML
237
7
0
27 May 2024
Cross-Modal Safety Alignment: Is textual unlearning all you need?
Trishna Chakraborty
Erfan Shayegani
Zikui Cai
Nael B. Abu-Ghazaleh
M. Salman Asif
Yue Dong
Amit K. Roy-Chowdhury
Chengyu Song
252
23
0
27 May 2024
Perturbation-Restrained Sequential Model Editing
Junjie Ma
Hong Wang
Haoyang Xu
Zhen-Hua Ling
Jia-Chen Gu
KELM
520
16
0
27 May 2024
Adaptive Activation Steering: A Tuning-Free LLM Truthfulness Improvement Method for Diverse Hallucinations Categories
Tianlong Wang
Xianfeng Jiao
Yifan He
Zhongzhi Chen
Yinghao Zhu
Xu Chu
Junyi Gao
Yasha Wang
Liantao Ma
LLMSV
444
51
0
26 May 2024
Large Scale Knowledge Washing
Yu Wang
Ruihan Wu
Zexue He
Xinyu Chen
Julian McAuley
MU KELM
426
13
0
26 May 2024
Leveraging Logical Rules in Knowledge Editing: A Cherry on the Top
Keyuan Cheng
Muhammad Asif Ali
Shu Yang
Gang Lin
Yuxuan Zhai
Haoyang Fei
Ke Xu
Lu Yu
Lijie Hu
Haiyan Zhao
KELM
326
11
0
24 May 2024
Everything is Editable: Extend Knowledge Editing to Unstructured Data in Large Language Models
Jingcheng Deng
Zihao Wei
Liang Pang
Hanxing Ding
Huawei Shen
Xueqi Cheng
KELM
238
2
0
24 May 2024
Sparse Matrix in Large Language Model Fine-tuning
Haoze He
Juncheng Billy Li
Xuan Jiang
Heather Miller
MoE
313
8
0
24 May 2024
Emergence of a High-Dimensional Abstraction Phase in Language Transformers
Emily Cheng
Diego Doimo
Corentin Kervadec
Iuri Macocco
Jade Yu
Alessandro Laio
Marco Baroni
704
32
0
24 May 2024
Linearly Controlled Language Generation with Performative Guarantees
Emily Cheng
Marco Baroni
378
13
0
24 May 2024
Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization
Boshi Wang
Xiang Yue
Yu-Chuan Su
Huan Sun
LRM
382
75
0
23 May 2024
HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models
Neural Information Processing Systems (NeurIPS), 2024
Bernal Jiménez Gutiérrez
Yiheng Shu
Yu Gu
Michihiro Yasunaga
Yu-Chuan Su
RALM CLL
370
116
0
23 May 2024
WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models
Neural Information Processing Systems (NeurIPS), 2024
Peng Wang
Zexi Li
Ningyu Zhang
Ziwen Xu
Yunzhi Yao
Yong Jiang
Pengjun Xie
Fei Huang
Huajun Chen
KELM CLL
312
61
0
23 May 2024
Automatically Identifying Local and Global Circuits with Linear Computation Graphs
Xuyang Ge
Fukang Zhu
Wentao Shu
Junxuan Wang
Zhengfu He
Xipeng Qiu
256
18
0
22 May 2024
Model Editing as a Robust and Denoised variant of DPO: A Case Study on Toxicity
Rheeya Uppaal
Apratim Dey
Yiting He
Yiqiao Zhong
Junjie Hu
592
7
0
22 May 2024
Decoding by Contrasting Knowledge: Enhancing LLMs' Confidence on Edited Facts
Baolong Bi
Shenghua Liu
Lingrui Mei
Yiwei Wang
Pengliang Ji
Xueqi Cheng
KELM
289
43
0
19 May 2024
BadActs: A Universal Backdoor Defense in the Activation Space
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Biao Yi
Sishuo Chen
Yiming Li
Tong Li
Baolei Zhang
Zheli Liu
AAML
181
20
0
18 May 2024
Learnable Privacy Neurons Localization in Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Ruizhe Chen
Tianxiang Hu
Yang Feng
Zuo-Qiang Liu
220
28
0
16 May 2024
Large Language Model Bias Mitigation from the Perspective of Knowledge Editing
Ruizhe Chen
Yichen Li
Zikai Xiao
Zuo-Qiang Liu
KELM
334
18
0
15 May 2024
Elements of World Knowledge (EWoK): A Cognition-Inspired Framework for Evaluating Basic World Knowledge in Language Models
Transactions of the Association for Computational Linguistics (TACL), 2024
Anna A. Ivanova
Aalok Sathe
Benjamin Lipkin
Unnathi Kumar
S. Radkani
...
Leshem Choshen
Roger Levy
Evelina Fedorenko
Josh Tenenbaum
Jacob Andreas
311
56
0
15 May 2024
Towards Principled Evaluations of Sparse Autoencoders for Interpretability and Control
Aleksandar Makelov
Georg Lange
Neel Nanda
359
61
0
14 May 2024
Can Language Models Explain Their Own Classification Behavior?
Dane Sherburn
Bilal Chughtai
Owain Evans
214
2
0
13 May 2024
Erasing Concepts from Text-to-Image Diffusion Models with Few-shot Unlearning
Masane Fuchi
Tomohiro Takagi
DiffM VLM
266
25
0
12 May 2024
Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Zorik Gekhman
G. Yona
Roee Aharoni
Matan Eyal
Amir Feder
Roi Reichart
Jonathan Herzig
415
228
0
09 May 2024
Learned feature representations are biased by complexity, learning order, position, and more
Andrew Kyle Lampinen
Stephanie C. Y. Chan
Katherine Hermann
AI4CE FaML SSL OOD
275
20
0
09 May 2024
Binary Hypothesis Testing for Softmax Models and Leverage Score Models
Yeqi Gao
Yuzhou Gu
Zhao Song
413
1
0
09 May 2024
A Causal Explainable Guardrails for Large Language Models
Zhixuan Chu
Yan Wang
Longfei Li
Peng Kuang
Zhan Qin
Kui Ren
LLMSV
192
13
0
07 May 2024
How does GPT-2 Predict Acronyms? Extracting and Understanding a Circuit via Mechanistic Interpretability
Jorge García-Carrasco
Alejandro Maté
Juan Trujillo
205
12
0
07 May 2024
FlashBack: Efficient Retrieval-Augmented Language Modeling for Long Context Inference
Runheng Liu
Xingchen Xiao
Heyan Huang
Zewen Chi
Zhijing Wu
RALM KELM
353
1
0
07 May 2024
A Philosophical Introduction to Language Models - Part II: The Way Forward
Raphael Milliere
Cameron Buckner
LRM
282
24
0
06 May 2024
To Each (Textual Sequence) Its Own: Improving Memorized-Data Unlearning in Large Language Models
International Conference on Machine Learning (ICML), 2024
George-Octavian Barbulescu
Peter Triantafillou
MU
359
33
0
06 May 2024
Compressing Long Context for Enhancing RAG with AMR-based Concept Distillation
Kaize Shi
Xueyao Sun
Qing Li
Guandong Xu
301
22
0
06 May 2024
Anchored Answers: Unravelling Positional Bias in GPT-2's Multiple-Choice Questions
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Ruizhe Li
Yanjun Gao
KELM
341
13
0
06 May 2024
Lifelong Knowledge Editing for LLMs with Retrieval-Augmented Continuous Prompt Learning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Qizhou Chen
Taolin Zhang
Xiaofeng He
Dongyang Li
Chengyu Wang
Longtao Huang
Hui Xue
CLL KELM
384
33
0
06 May 2024
What does the Knowledge Neuron Thesis Have to do with Knowledge?
International Conference on Learning Representations (ICLR), 2024
Jingcheng Niu
Andrew Liu
Zining Zhu
Gerald Penn
337
47
0
03 May 2024
Is Bigger Edit Batch Size Always Better? -- An Empirical Study on Model Editing with Llama-3
Junsang Yoon
Akshat Gupta
Gopala Anumanchipalli
141
9
0
01 May 2024
KAN: Kolmogorov-Arnold Networks
Ziming Liu
Yixuan Wang
Sachin Vaidya
Fabian Ruehle
James Halverson
Marin Soljacic
Thomas Y. Hou
Max Tegmark
986
1,261
0
30 Apr 2024
Revealing the Parametric Knowledge of Language Models: A Unified Framework for Attribution Methods
Haeun Yu
Pepa Atanasova
Isabelle Augenstein
KELM
219
10
0
29 Apr 2024