2509.15202
Beyond Surface Alignment: Rebuilding LLMs Safety Mechanism via Probabilistically Ablating Refusal Direction
18 September 2025
Yuanbo Xie
Yingjie Zhang
Tianyun Liu
Duohe Ma
Tingwen Liu
AAML
Github (3★)
Papers citing "Beyond Surface Alignment: Rebuilding LLMs Safety Mechanism via Probabilistically Ablating Refusal Direction"
1 / 1 papers shown
Read the Scene, Not the Script: Outcome-Aware Safety for LLMs
Rui Wu
Yihao Quan
Zeru Shi
Zhenting Wang
Yanshu Li
Ruixiang Tang
05 Oct 2025