arXiv: 2507.18631
Layer-Aware Representation Filtering: Purifying Finetuning Data to Preserve LLM Safety Alignment
24 July 2025
Hao Li, Lijun Li, Zhenghao Lu, Xianyi Wei, Rui Li, Jing Shao, Lei Sha
Links: arXiv (abs) · PDF · HTML · GitHub (8★)
Papers citing "Layer-Aware Representation Filtering: Purifying Finetuning Data to Preserve LLM Safety Alignment" (5 papers)
Contextual Image Attack: How Visual Context Exposes Multimodal Safety Vulnerabilities
Yuan Xiong, Ziqi Miao, Lijun Li, Chen Qian, Jie Li, Jing Shao
AAML · 02 Dec 2025
HarmRLVR: Weaponizing Verifiable Rewards for Harmful LLM Alignment
Y. Liu, Lijun Li, X. Wang, Jing Shao
LLMSV · 17 Oct 2025
LLMs Deceive Unintentionally: Emergent Misalignment in Dishonesty from Misaligned Samples to Biased Human-AI Interactions
Xuhao Hu, Peng Wang, Xiaoya Lu, Dongrui Liu, Xuanjing Huang, Jing Shao
09 Oct 2025
Weak Form Learning for Mean-Field Partial Differential Equations: an Application to Insect Movement
Seth Minor, Bret D. Elderd, Benjamin Van Allen, David M. Bortz, Vanja M. Dukic
09 Oct 2025
Detecting and Filtering Unsafe Training Data via Data Attribution with Denoised Representation
Yijun Pan, Taiwei Shi, Jieyu Zhao, Jiaqi W. Ma
TDI · 17 Feb 2025