arXiv: 2509.15202

Beyond Surface Alignment: Rebuilding LLMs Safety Mechanism via Probabilistically Ablating Refusal Direction

18 September 2025

Papers citing "Beyond Surface Alignment: Rebuilding LLMs Safety Mechanism via Probabilistically Ablating Refusal Direction"

1 / 1 papers shown

Read the Scene, Not the Script: Outcome-Aware Safety for LLMs

05 Oct 2025