arXiv:2412.12497
NLSR: Neuron-Level Safety Realignment of Large Language Models Against Harmful Fine-Tuning
17 December 2024
Xin Yi
Shunfan Zheng
Linlin Wang
Gerard de Melo
Xiaoling Wang
Liang He
Papers citing
"NLSR: Neuron-Level Safety Realignment of Large Language Models Against Harmful Fine-Tuning"
2 / 2 papers shown
Title
Benign Samples Matter! Fine-tuning On Outlier Benign Samples Severely Breaks Safety
Zihan Guan
Mengxuan Hu
Ronghang Zhu
Sheng Li
Anil Vullikanti
AAML
31
0
0
11 May 2025
Panacea: Mitigating Harmful Fine-tuning for Large Language Models via Post-fine-tuning Perturbation
Yishuo Wang
Tiansheng Huang
Li Shen
H. Yao
Haotian Luo
Rui Liu
Naiqiang Tan
Jiaxing Huang
Dacheng Tao
AAML
MoMe
CLL
111
2
0
30 Jan 2025