How Language Models Prioritize Contextual Grammatical Cues?

BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackBoxNLP), 2024

Hamidreza Amirzadeh

Afra Alishahi

Hosein Mohebbi

181

04 Oct 2024

RIPPLECOT: Amplifying Ripple Effect of Knowledge Editing in Language Models via Chain-of-Thought In-Context Learning

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Yuchen Yang

173

04 Oct 2024

SwiftKV: Fast Prefill-Optimized Inference with Knowledge-Preserving Model Transformation

343

04 Oct 2024

Fine-Tuning Language Models with Differential Privacy through Adaptive Noise Allocation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Zhiqiang Ma

197

03 Oct 2024

HiddenGuard: Fine-Grained Safe Generation with Specialized Representation Router

Lingrui Mei

Shenghua Liu

Yiwei Wang

Baolong Bi

Ruibin Yuan

Xueqi Cheng

254

03 Oct 2024

Defining Knowledge: Bridging Epistemology and Large Language Models

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

335

03 Oct 2024

Meta-Models: An Architecture for Decoding LLM Behaviors Through Interpreted Embeddings and Natural Language

Anthony Costarelli

Mat Allen

Severin Field

274

03 Oct 2024

Better Call SAUL: Fluent and Consistent Language Model Editing with Generation Regularization

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Heike Adel

Hinrich Schütze

203

03 Oct 2024

Mitigating Memorization In Language Models

Arham Khan

Kyle Chard

Ian Foster

Michael W. Mahoney

KELM MU

331

03 Oct 2024

FactCheckmate: Preemptively Detecting and Mitigating Hallucinations in LMs

370

03 Oct 2024

Erasing Conceptual Knowledge from Language Models

443

03 Oct 2024

Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations

International Conference on Learning Representations (ICLR), 2024

412

03 Oct 2024

LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations

International Conference on Learning Representations (ICLR), 2024

697

114

03 Oct 2024

Theoretical Insights into Fine-Tuning Attention Mechanism: Generalization and Optimization

International Joint Conference on Artificial Intelligence (IJCAI), 2024

434

03 Oct 2024

AlphaEdit: Null-Space Constrained Knowledge Editing for Language Models

International Conference on Learning Representations (ICLR), 2024

Cunchun Li

Houcheng Jiang

Kun Wang

Yunshan Ma

Shi Jie

Xiangnan He

Tat-Seng Chua

KELM

524

135

03 Oct 2024

Jailbreak Antidote: Runtime Safety-Utility Balance via Sparse Representation Adjustment in Large Language Models

International Conference on Learning Representations (ICLR), 2024

Yi Zeng

336

03 Oct 2024

Question-guided Knowledge Graph Re-scoring and Injection for Knowledge Graph Question Answering

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Yu Zhang

Kehai Chen

Xuefeng Bai

Zhao Kang

Quanjiang Guo

Min Zhang

301

02 Oct 2024

Mitigating Copy Bias in In-Context Learning through Neuron Pruning

Ameen Ali

Lior Wolf

Ivan Titov

195

02 Oct 2024

Circuit Compositions: Exploring Modular Structures in Transformer-Based Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

Philipp Mondorf

Sondre Wold

Yun Xue

492

02 Oct 2024

Skill Path: Unveiling Language Skills from Circuit Graphs

160

02 Oct 2024

Knowledge Entropy Decay during Language Model Pretraining Hinders New Knowledge Acquisition

International Conference on Learning Representations (ICLR), 2024

Minjoon Seo

1.0K

02 Oct 2024

Do Music Generation Models Encode Music Theory?

International Society for Music Information Retrieval Conference (ISMIR), 2024

193

01 Oct 2024

Quantifying reliance on external information over parametric knowledge during Retrieval Augmented Generation (RAG) using mechanistic analysis

Reshmi Ghosh

Rahul Seetharaman

Hitesh Wadhwa

Somyaa Aggarwal

Samyadeep Basu

Soundararajan Srinivasan

Wenlong Zhao

Shreyas Chaudhari

Ehsan Aghazadeh

103

01 Oct 2024

UniAdapt: A Universal Adapter for Knowledge Calibration

Tai D. Nguyen

Long H. Pham

Jun Sun

KELM

172

01 Oct 2024

Answer When Needed, Forget When Not: Language Models Pretend to Forget via In-Context Knowledge Unlearning

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

119

01 Oct 2024

Towards Unified Multimodal Editing with Enhanced Knowledge Collaboration

Neural Information Processing Systems (NeurIPS), 2024

Juncheng Li

Hao Fei

Hanwang Zhang

344

30 Sep 2024

Beyond Single Concept Vector: Modeling Concept Subspace in LLMs with Gaussian Distribution

International Conference on Learning Representations (ICLR), 2024

425

30 Sep 2024

Transforming Hidden States into Binary Semantic Features

Tomáš Musil

David Mareček

OffRL

134

29 Sep 2024

Unified Gradient-Based Machine Unlearning with Remain Geometry Enhancement

Neural Information Processing Systems (NeurIPS), 2024

247

29 Sep 2024

Identifying Knowledge Editing Types in Large Language Models

335

29 Sep 2024

Crafting Personalized Agents through Retrieval-Augmented Generation on Editable Memory Graphs

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Zheng Wang

Zhongyang Li

Zeren Jiang

Dandan Tu

Wei Shi

218

28 Sep 2024

Localizing Memorization in SSL Vision Encoders

Neural Information Processing Systems (NeurIPS), 2024

259

27 Sep 2024

"Why" Has the Least Side Effect on Model Editing

Tsung-Hsuan Pan

Chung-Chi Chen

Hen-Hsen Huang

Hsin-Hsi Chen

KELM

128

27 Sep 2024

SDBA: A Stealthy and Long-Lasting Durable Backdoor Attack in Federated Learning

IEEE Transactions on Dependable and Secure Computing (IEEE TDSC), 2024

275

23 Sep 2024

Investigating Layer Importance in Large Language Models

BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackBoxNLP), 2024

244

22 Sep 2024

A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders

541

22 Sep 2024

Interpreting Arithmetic Mechanism in Large Language Models through Comparative Neuron Analysis

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Zeping Yu

Sophia Ananiadou

LRM MILM

283

21 Sep 2024

Uncovering Latent Chain of Thought Vectors in Language Models

Jason Zhang

Scott Viteri

LLMSV LRM

458

21 Sep 2024

Co-occurrence is not Factual Association in Language Models

Neural Information Processing Systems (NeurIPS), 2024

413

21 Sep 2024

Towards LifeSpan Cognitive Systems

Yu Wang

...

Wei Wang

Heng Ji

Julian McAuley

KELM CLL

995

20 Sep 2024

LLM Surgery: Efficient Knowledge Unlearning and Editing in Large Language Models

Akshaj Kumar Veldanda

260

19 Sep 2024

Pay Attention to What Matters

154

19 Sep 2024

MQA-KEAL: Multi-hop Question Answering under Knowledge Editing for Arabic Language

International Conference on Computational Linguistics (COLING), 2024

Di Wang

273

18 Sep 2024

StruEdit: Structured Outputs Enable the Fast and Accurate Knowledge Editing for Large Language Models

Shenghua Liu

Lingrui Mei

215

16 Sep 2024

Householder Pseudo-Rotation: A Novel Approach to Activation Editing in LLMs with Direction-Magnitude Perspective

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Van-Cuong Pham

Thien Huu Nguyen

LLMSV

222

16 Sep 2024