Locating and Editing Factual Associations in GPT

Neural Information Processing Systems (NeurIPS), 2022

10 February 2022

Papers citing "Locating and Editing Factual Associations in GPT"

50 / 1,361 papers shown

REMA: A Unified Reasoning Manifold Framework for Interpreting Large Language Model

111

26 Sep 2025

Fine-tuning Done Right in Model Editing

183

26 Sep 2025

Bilinear relational structure fixes reversal curse and enables consistent model editing

377

26 Sep 2025

MindCraft: How Concept Trees Take Shape In Deep Models

108

26 Sep 2025

Towards Transparent AI: A Survey on Explainable Language Models

Avash Palikhe

Sribala Vidyadhari Chinta

178

25 Sep 2025

Towards Atoms of Large Language Models

122

25 Sep 2025

Painless Activation Steering: An Automated, Lightweight Approach for Post-Training Large Language Models

Sasha Cui

Zhongren Chen

LLMSV

238

25 Sep 2025

CLUE: Conflict-guided Localization for LLM Unlearning Framework

143

25 Sep 2025

Detoxifying Large Language Models via Autoregressive Reward Guided Representation Editing

153

24 Sep 2025

Personality Vector: Modulating Personality of Large Language Models by Model Merging

121

24 Sep 2025

bi-GRPO: Bidirectional Optimization for Jailbreak Backdoor Injection on LLMs

150

24 Sep 2025

Latent Activation Editing: Inference-Time Refinement of Learned Policies for Safer Multirobot Navigation

182

24 Sep 2025

CR-Net: Scaling Parameter-Efficient Training with Cross-Layer Low-Rank Structure

160

23 Sep 2025

Data Efficient Adaptation in Large Language Models via Continuous Low-Rank Fine-Tuning

162

23 Sep 2025

Cyclic Ablation: Testing Concept Localization against Functional Regeneration in AI

Eduard Kapelko

23 Sep 2025

When Long Helps Short: How Context Length in Supervised Fine-tuning Affects Behavior of Large Language Models

249

23 Sep 2025

Consistency-Aware Parameter-Preserving Knowledge Editing Framework for Multi-Hop Question Answering

185

23 Sep 2025

Memory in Large Language Models: Mechanisms, Evaluation and Evolution

217

23 Sep 2025

How Persuasive is Your Context?

Tu Nguyen

Kevin Du

Alexander Miserlis Hoyle

Ryan Cotterell

113

22 Sep 2025

Diagnosing Model Editing via Knowledge Spectrum

117

22 Sep 2025

Achilles' Heel of Mamba: Essential difficulties of the Mamba architecture demonstrated by synthetic data

172

22 Sep 2025

DISCO: Disentangled Communication Steering for Large Language Models

182

20 Sep 2025

ConceptViz: A Visual Analytics Approach for Exploring Concepts in Large Language Models

156

20 Sep 2025

Sparse-Autoencoder-Guided Internal Representation Unlearning for Large Language Models

118

19 Sep 2025

Toward Efficient Influence Function: Dropout as a Compression Tool

Yuchen Zhang

Mohammad Mohammadi Amiri

TDI

243

19 Sep 2025

Concept Unlearning in Large Language Models via Self-Constructed Knowledge Triplets

19 Sep 2025

Reveal and Release: Iterative LLM Unlearning with Self-generated Data

166

18 Sep 2025

Digging Into the Internal: Causality-Based Analysis of LLM Function Calling

18 Sep 2025

V-SEAM: Visual Semantic Editing and Attention Modulating for Causal Interpretability of Vision-Language Models

Qidong Wang

Junjie Hu

Ming Jiang

104

18 Sep 2025

Real, Fake, or Manipulated? Detecting Machine-Influenced Text

240

18 Sep 2025

Sparse Neurons Carry Strong Signals of Question Ambiguity in LLMs

112

17 Sep 2025

Do Natural Language Descriptions of Model Activations Convey Privileged Information?

Millicent Li

Alberto Mario Ceballos Arroyo

Giordano Rogers

Naomi Saphra

Byron C. Wallace

181

16 Sep 2025

Collapse of Irrelevant Representations (CIR) Ensures Robust and Non-Disruptive LLM Unlearning

Filip Sondej

Yushi Yang

381

15 Sep 2025

Quantifying Compositionality of Classic and State-of-the-Art Embeddings

Janet B. Pierrehumbert

Martha Lewis

CoGe

169

14 Sep 2025

Pathological Truth Bias in Vision-Language Models

Yash Thube

14 Sep 2025

Context Copying Modulation: The Role of Entropy Neurons in Managing Parametric and Contextual Knowledge Conflicts

229

12 Sep 2025

All for One: LLMs Solve Mental Math at the Last Token With Information Transferred From Other Tokens

130

11 Sep 2025

SEDM: Scalable Self-Evolving Distributed Memory for Agents

183

11 Sep 2025

Do All Autoregressive Transformers Remember Facts the Same Way? A Cross-Architecture Analysis of Recall Mechanisms

150

10 Sep 2025

Avoiding Knowledge Edit Skipping in Multi-hop Question Answering with Guided Decomposition

113

09 Sep 2025

Statistical Methods in Generative AI

Edgar Dobriban

289

08 Sep 2025

Towards Meta-Cognitive Knowledge Editing for Multimodal LLMs

158

06 Sep 2025

Memorization $\neq$ Understanding: Do Large Language Models Have the Ability of Scenario Cognition?

137

05 Sep 2025

Manipulating Transformer-Based Models: Controllability, Steerability, and Robust Interventions

Faruk Alpay

Taylan Alpay

LM&Ro

04 Sep 2025

Context Engineering for Trustworthiness: Rescorla Wagner Steering Under Mixed and Inappropriate Contexts

162

02 Sep 2025

Unlearning That Lasts: Utility-Preserving, Robust, and Almost Irreversible Forgetting in LLMs

192

02 Sep 2025

Vis-CoT: A Human-in-the-Loop Framework for Interactive Visualization and Intervention in LLM Chain-of-Thought Reasoning

113

01 Sep 2025

Robust Knowledge Editing via Explicit Reasoning Chains for Distractor-Resilient Multi-Hop QA

116

01 Sep 2025

Reasoning Vectors: Transferring Chain-of-Thought Capabilities via Task Arithmetic

184

01 Sep 2025

Causal Consistency Regularization: Training Verifiably Sensitive Reasoning in Large Language Models

158

01 Sep 2025