Locating and Editing Factual Associations in GPT

Neural Information Processing Systems (NeurIPS), 2022

10 February 2022

Papers citing "Locating and Editing Factual Associations in GPT"

50 / 1,361 papers shown

Healing Powers of BERT: How Task-Specific Fine-Tuning Recovers Corrupted Language Models

Shijie Han

Zhenyu Zhang

Andrei Arsene Simion

171

20 Jun 2024

Locating and Extracting Relational Concepts in Large Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

211

19 Jun 2024

From RAGs to rich parameters: Probing how language models utilize external knowledge over parametric information for factual queries

Hitesh Wadhwa

Rahul Seetharaman

Somyaa Aggarwal

Reshmi Ghosh

Samyadeep Basu

Soundararajan Srinivasan

157

18 Jun 2024

Hopping Too Late: Exploring the Limitations of Large Language Models on Multi-Hop Queries

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

256

18 Jun 2024

Estimating Knowledge in Large Language Models Without Generating a Single Token

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Daniela Gottesman

Mor Geva

269

18 Jun 2024

From Insights to Actions: The Impact of Interpretability and Analysis Research on NLP

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

245

18 Jun 2024

Adaptive Token Biaser: Knowledge Editing via Biasing Key Entities

Shenghua Liu

Lingrui Mei

190

18 Jun 2024

Retrieval Meets Reasoning: Dynamic In-Context Editing for Long-Text Understanding

Bo Bai

Wei Han

246

18 Jun 2024

An Investigation of Neuron Activation as a Unified Lens to Explain Chain-of-Thought Eliciting Arithmetic Reasoning of LLMs

Daking Rai

Ziyu Yao

LRM

284

18 Jun 2024

SafeInfer: Context Adaptive Decoding Time Safety Alignment for Large Language Models

211

18 Jun 2024

A Hopfieldian View-based Interpretation for Chain-of-Thought Reasoning

Shu Yang

Di Wang

317

18 Jun 2024

Opt-Out: Investigating Entity-Level Unlearning for Large Language Models via Optimal Transport

264

18 Jun 2024

InternalInspector I^2: Robust Confidence Estimation in LLMs through Internal States

Ming Jin

Lifu Huang

253

17 Jun 2024

Soft Prompting for Unlearning in Large Language Models

276

17 Jun 2024

Language Modeling with Editable External Knowledge

Jacob Andreas

264

17 Jun 2024

Safety Arithmetic: A Framework for Test-time Safety Alignment of Language Models by Steering Parameters and Activations

267

17 Jun 2024

MEMLA: Enhancing Multilingual Knowledge Editing with Neuron-Masked Low-Rank Adaptation

Kang Liu

Jun Zhao

KELM

256

17 Jun 2024

CrAM: Credibility-Aware Attention Modification in LLMs for Combating Misinformation in RAG

Wenjie Wang

246

17 Jun 2024

A Complete Survey on LLM-based AI Chatbots

Sumit Kumar Dam

Choong Seon Hong

Yu Qiao

Chaoning Zhang

288

131

17 Jun 2024

Self-training Large Language Models through Knowledge Detection

Wei Jie Yeo

Teddy Ferdinan

Przemyslaw Kazienko

Frank Xing

Erik Cambria

241

17 Jun 2024

The Fall of ROME: Understanding the Collapse of LLMs in Model Editing

Fei Sun

133

17 Jun 2024

SUGARCREPE++ Dataset: Vision-Language Model Sensitivity to Semantic and Lexical Alterations

Sri Harsha Dumpala

Aman Jaiswal

Chandramouli Shama Sastry

407

17 Jun 2024

Intrinsic Test of Unlearning Using Parametric Knowledge Traces

350

17 Jun 2024

In-Context Editing: Learning Knowledge from Self-Induced Distributions

575

17 Jun 2024

Breaking Boundaries: Investigating the Effects of Model Editing on Cross-linguistic Performance

575

17 Jun 2024

RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models

Kang Liu

Jun Zhao

KELM MU

339

16 Jun 2024

Teaching Large Language Models to Express Knowledge Boundary from Their Own Signals

Xintao Wang

Yanghua Xiao

Bing Han

Wei Wang

212

16 Jun 2024

RoseLoRA: Row and Column-wise Sparse Low-rank Adaptation of Pre-trained Language Model for Knowledge Editing and Fine-tuning

Ruirui Li

Jing Gao

282

16 Jun 2024

DIEKAE: Difference Injection for Efficient Knowledge Augmentation and Editing of Large Language Models

Alessio Galatolo

Meriem Beloucif

Katie Winkle

163

15 Jun 2024

Knowledge Editing in Language Models via Adapted Direct Preference Optimization

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Lior Wolf

188

14 Jun 2024

REVS: Unlearning Sensitive Information in Language Models via Rank Editing in the Vocabulary Space

666

13 Jun 2024

Research Trends for the Interplay between Large Language Models and Knowledge Graphs

Nandana Mihindukulasooriya

409

12 Jun 2024

Next-Generation Database Interfaces: A Survey of LLM-based Text-to-SQL

Feiran Huang

Xiao Huang

853

144

12 Jun 2024

Towards Lifelong Learning of Large Language Models: A Survey

Qianli Ma

286

10 Jun 2024

The Curse of Popularity: Popular Entities have Catastrophic Side Effects when Deleting Knowledge from Language Models

Keisuke Sakaguchi

181

10 Jun 2024

MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

Zhumin Chen

190

07 Jun 2024

Time Sensitive Knowledge Editing through Efficient Finetuning

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

Kun Qian

Yunyao Li

346

06 Jun 2024

Improving Alignment and Robustness with Circuit Breakers

Neural Information Processing Systems (NeurIPS), 2024

Maksym Andriushchenko

624

210

06 Jun 2024

Understanding Information Storage and Transfer in Multi-modal Large Language Models

Neural Information Processing Systems (NeurIPS), 2024

Samyadeep Basu

299

06 Jun 2024

Memorization in deep learning: A survey

Jiaheng Wei

Yanjun Zhang

Leo Yu Zhang

Yang Xiang

303

06 Jun 2024

Interpreting the Second-Order Effects of Neurons in CLIP

450

06 Jun 2024

Outdated Issue Aware Decoding for Reasoning Questions on Edited Knowledge

Zengkui Sun

Yijin Liu

Jiaan Wang

Fandong Meng

Jinan Xu

Jie Zhou

KELM

209

05 Jun 2024

Exact Conversion of In-Context Learning to Model Weights in Linearized-Attention Transformers

229

05 Jun 2024

Self-Control of LLM Behaviors by Compressing Suffix Gradient into Prefix Controller

Ziniu Hu

319

04 Jun 2024

LoFiT: Localized Fine-tuning on LLM Representations

Fangcong Yin

Xi Ye

Greg Durrett

268

03 Jun 2024

Decoupled Alignment for Robust Plug-and-Play Adaptation

Jerry Yao-Chieh Hu

390

03 Jun 2024

Understanding Token Probability Encoding in Output Embeddings

296

03 Jun 2024

Editing the Mind of Giants: An In-Depth Exploration of Pitfalls of Knowledge Editing in Large Language Models

230

03 Jun 2024

From Feature Visualization to Visual Circuits: Effect of Adversarial Model Manipulation

311

03 Jun 2024

Position: An Inner Interpretability Framework for AI Inspired by Lessons from Cognitive Neuroscience

Martina G. Vilas

Federico Adolfi

David Poeppel

Gemma Roig

312

03 Jun 2024