ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2401.01814
  4. Cited By
Large Language Models Relearn Removed Concepts

Large Language Models Relearn Removed Concepts

3 January 2024
Michelle Lo
Shay B. Cohen
Fazl Barez
    KELM
ArXivPDFHTML

Papers citing "Large Language Models Relearn Removed Concepts"

11 / 11 papers shown
Title
Representation Bending for Large Language Model Safety
Representation Bending for Large Language Model Safety
Ashkan Yousefpour
Taeheon Kim
Ryan S. Kwon
Seungbeen Lee
Wonje Jeung
Seungju Han
Alvin Wan
Harrison Ngan
Youngjae Yu
Jonghyun Choi
AAML
ALM
KELM
52
0
0
02 Apr 2025
Neuroplasticity and Corruption in Model Mechanisms: A Case Study Of Indirect Object Identification
Vishnu Kabir Chhabra
Ding Zhu
Mohammad Mahdi Khalili
37
2
0
27 Feb 2025
Model Tampering Attacks Enable More Rigorous Evaluations of LLM Capabilities
Model Tampering Attacks Enable More Rigorous Evaluations of LLM Capabilities
Zora Che
Stephen Casper
Robert Kirk
Anirudh Satheesh
Stewart Slocum
...
Zikui Cai
Bilal Chughtai
Y. Gal
Furong Huang
Dylan Hadfield-Menell
MU
AAML
ELM
83
3
0
03 Feb 2025
Mechanistic Unlearning: Robust Knowledge Unlearning and Editing via
  Mechanistic Localization
Mechanistic Unlearning: Robust Knowledge Unlearning and Editing via Mechanistic Localization
Phillip Guo
Aaquib Syed
Abhay Sheshadri
Aidan Ewart
Gintare Karolina Dziugaite
KELM
MU
31
5
0
16 Oct 2024
OD-Stega: LLM-Based Near-Imperceptible Steganography via Optimized
  Distributions
OD-Stega: LLM-Based Near-Imperceptible Steganography via Optimized Distributions
Yu-Shin Huang
Peter Just
Krishna Narayanan
Chao Tian
32
1
0
06 Oct 2024
Detoxifying Large Language Models via Knowledge Editing
Detoxifying Large Language Models via Knowledge Editing
Meng Wang
Ningyu Zhang
Ziwen Xu
Zekun Xi
Shumin Deng
Yunzhi Yao
Qishen Zhang
Linyi Yang
Jindong Wang
Huajun Chen
KELM
38
54
0
21 Mar 2024
Editing Conceptual Knowledge for Large Language Models
Editing Conceptual Knowledge for Large Language Models
Xiaohan Wang
Shengyu Mao
Ningyu Zhang
Shumin Deng
Yunzhi Yao
Yue Shen
Lei Liang
Jinjie Gu
Huajun Chen
KELM
27
13
0
10 Mar 2024
Eight Methods to Evaluate Robust Unlearning in LLMs
Eight Methods to Evaluate Robust Unlearning in LLMs
Aengus Lynch
Phillip Guo
Aidan Ewart
Stephen Casper
Dylan Hadfield-Menell
ELM
MU
35
56
0
26 Feb 2024
InstructEdit: Instruction-based Knowledge Editing for Large Language
  Models
InstructEdit: Instruction-based Knowledge Editing for Large Language Models
Ningyu Zhang
Bo Tian
Siyuan Cheng
Xiaozhuan Liang
Yi Hu
Kouying Xue
Yanjie Gou
Xi Chen
Huajun Chen
KELM
40
4
0
25 Feb 2024
Discovering Knowledge-Critical Subnetworks in Pretrained Language Models
Discovering Knowledge-Critical Subnetworks in Pretrained Language Models
Deniz Bayazit
Negar Foroutan
Zeming Chen
Gail Weiss
Antoine Bosselut
KELM
16
13
0
04 Oct 2023
Similarity Analysis of Contextual Word Representation Models
Similarity Analysis of Contextual Word Representation Models
John M. Wu
Yonatan Belinkov
Hassan Sajjad
Nadir Durrani
Fahim Dalvi
James R. Glass
46
73
0
03 May 2020
1