ResearchTrend.AI


Beyond Surface Alignment: Rebuilding LLMs Safety Mechanism via Probabilistically Ablating Refusal Direction

18 September 2025
Yuanbo Xie
Yingjie Zhang
Tianyun Liu
Duohe Ma
Tingwen Liu
    AAML
arXiv: 2509.15202 (abs, PDF, HTML) · GitHub (3★)

Papers citing "Beyond Surface Alignment: Rebuilding LLMs Safety Mechanism via Probabilistically Ablating Refusal Direction"

1 / 1 papers shown
Read the Scene, Not the Script: Outcome-Aware Safety for LLMs
Rui Wu
Yihao Quan
Zeru Shi
Zhenting Wang
Yanshu Li
Ruixiang Tang
05 Oct 2025