Knowledge Neurons in Pretrained Transformers

Annual Meeting of the Association for Computational Linguistics (ACL), 2021

18 April 2021

Damai Dai

Li Dong

Y. Hao

Zhifang Sui

Baobao Chang

Furu Wei

KELM

Papers citing "Knowledge Neurons in Pretrained Transformers"

Revealing and Mitigating Over-Attention in Knowledge Editing

International Conference on Learning Representations (ICLR), 2025

576

21 Feb 2025

MLaKE: Multilingual Knowledge Editing Benchmark for Large Language Models

International Conference on Computational Linguistics (COLING), 2024

268

20 Feb 2025

UPCORE: Utility-Preserving Coreset Selection for Balanced Unlearning

393

20 Feb 2025

PASER: Post-Training Data Selection for Efficient Pruned Large Language Model Recovery

412

18 Feb 2025

Brain-Inspired Exploration of Functional Networks and Key Neurons in Large Language Models

...

Ning Qiang

Bao Ge

Tianming Liu

Junwei Han

Xintao Hu

164

13 Feb 2025

Reinforced Lifelong Editing for Language Models

613

09 Feb 2025

The Hidden Life of Tokens: Reducing Hallucination of Large Vision-Language Models via Visual Information Steering

259

05 Feb 2025

Discovering Chunks in Neural Embeddings for Interpretability

291

03 Feb 2025

Understanding and Mitigating Gender Bias in LLMs via Interpretable Neuron Editing

Zeping Yu

Sophia Ananiadou

KELM

290

24 Jan 2025

Episodic Memories Generation and Evaluation Benchmark for Large Language Models

International Conference on Learning Representations (ICLR), 2025

221

21 Jan 2025

Transforming Vision Transformer: Towards Efficient Multi-Task Asynchronous Learning

309

12 Jan 2025

Multi-Task Model Merging via Adaptive Weight Disentanglement

582

10 Jan 2025

Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic

335

08 Jan 2025

266

06 Jan 2025

How Do Artificial Intelligences Think? The Three Mathematico-Cognitive Factors of Categorical Segmentation Operated by Synthetic Neurons

Michael Veillet-Guillem

276

26 Dec 2024

Joint Knowledge Editing for Information Enrichment and Probability Promotion

AAAI Conference on Artificial Intelligence (AAAI), 2024

210

22 Dec 2024

Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy and Research

A. Feder Cooper

Christopher A. Choquette-Choo

...

352

09 Dec 2024

Explainable and Interpretable Multimodal Large Language Models: A Comprehensive Survey

...

425

03 Dec 2024

Continuous Concepts Removal in Text-to-image Diffusion Models

534

30 Nov 2024

One Mind, Many Tongues: A Deep Dive into Language-Agnostic Knowledge Neurons in Large Language Models

310

26 Nov 2024

Lifelong Knowledge Editing for Vision Language Models with Low-Rank Mixture-of-Experts

Computer Vision and Pattern Recognition (CVPR), 2024

378

23 Nov 2024

Visual-Oriented Fine-Grained Knowledge Editing for MultiModal Large Language Models

250

19 Nov 2024

Understanding Multimodal LLMs: the Mechanistic Interpretability of Llava in Visual Question Answering

Zeping Yu

Sophia Ananiadou

1.1K

17 Nov 2024

Information Anxiety in Large Language Models

Prasoon Bajpai

Sarah Masud

Tanmoy Chakraborty

154

16 Nov 2024

AmoebaLLM: Constructing Any-Shape Large Language Models for Efficient and Instant Deployment

Neural Information Processing Systems (NeurIPS), 2024

285

15 Nov 2024

Controllable Context Sensitivity and the Knob Behind It

International Conference on Learning Representations (ICLR), 2024

625

11 Nov 2024

Learning Where to Edit Vision Transformers

Neural Information Processing Systems (NeurIPS), 2024

231

04 Nov 2024

Sparsing Law: Towards Large Language Models with Greater Activation Sparsity

585

04 Nov 2024

Commonsense Knowledge Editing Based on Free-Text in LLMs

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

182

31 Oct 2024

Reasons and Solutions for the Decline in Model Performance after Editing

Neural Information Processing Systems (NeurIPS), 2024

263

31 Oct 2024

Llama Scope: Extracting Millions of Features from Llama-3.1-8B with Sparse Autoencoders

Junxuan Wang

...

Qipeng Guo

Xuanjing Huang

Zuxuan Wu

Yu-Gang Jiang

Xipeng Qiu

328

27 Oct 2024

DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucinations

284

24 Oct 2024

The Tug of War Within: Mitigating the Fairness-Privacy Conflicts in Large Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

344

22 Oct 2024

Catastrophic Failure of LLM Unlearning via Quantization

International Conference on Learning Representations (ICLR), 2024

Zhiwei Zhang

Fali Wang

Xiaomin Li

Zongyu Wu

Xianfeng Tang

Hui Liu

Qi He

Wenpeng Yin

Suhang Wang

330

21 Oct 2024

Neuron-based Personality Trait Induction in Large Language Models

240

16 Oct 2024

Cross-Modal Safety Mechanism Transfer in Large Vision-Language Models

International Conference on Learning Representations (ICLR), 2024

297

16 Oct 2024

ReDeEP: Detecting Hallucination in Retrieval-Augmented Generation via Mechanistic Interpretability

International Conference on Learning Representations (ICLR), 2024

Yang Song

310

15 Oct 2024

LargePiG: Your Large Language Model is Secretly a Pointer Generator

227

15 Oct 2024

MoIN: Mixture of Introvert Experts to Upcycle an LLM

337

13 Oct 2024

ChroKnowledge: Unveiling Chronological Knowledge of Language Models in Multiple Domains

International Conference on Learning Representations (ICLR), 2024

478

13 Oct 2024

Keys to Robust Edits: from Theoretical Insights to Practical Advances

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

273

12 Oct 2024

Understanding the Interplay between Parametric and Contextual Knowledge for Large Language Models

Sitao Cheng

Liangming Pan

Xunjian Yin

Xinyi Wang

William Yang Wang

KELM

237

10 Oct 2024

Uncovering Overfitting in Large Language Model Editing

International Conference on Learning Representations (ICLR), 2024

285

10 Oct 2024

From Tokens to Words: On the Inner Lexicon of LLMs

International Conference on Learning Representations (ICLR), 2024

Guy Kaplan

Matanel Oren

Yuval Reif

Roy Schwartz

441

08 Oct 2024

Locate-then-edit for Multi-hop Factual Recall under Knowledge Editing

Lijie Hu

Di Wang

KELM

408

08 Oct 2024

MINER: Mining the Underlying Pattern of Modality-Specific Neurons in Multimodal Large Language Models

Kun Wang

Xuming Hu

244

07 Oct 2024

Neuron-Level Sequential Editing for Large Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

Houcheng Jiang

Xiang Wang

233

05 Oct 2024

Mitigating Memorization In Language Models

Arham Khan

Kyle Chard

Ian Foster

Michael W. Mahoney

KELM MU

393

03 Oct 2024

Mitigating Copy Bias in In-Context Learning through Neuron Pruning

Ameen Ali

Lior Wolf

Ivan Titov

199

02 Oct 2024

Knowledge Entropy Decay during Language Model Pretraining Hinders New Knowledge Acquisition

International Conference on Learning Representations (ICLR), 2024

Minjoon Seo

1.0K

02 Oct 2024