arXiv: 2501.08145

Refusal Behavior in Large Language Models: A Nonlinear Perspective

14 January 2025

Fabian Hildebrandt

Andreas K. Maier

Papers citing "Refusal Behavior in Large Language Models: A Nonlinear Perspective"

6 / 6 papers shown

AlignTree: Efficient Defense Against LLM Jailbreak Attacks

15 Nov 2025

Poison Once, Refuse Forever: Weaponizing Alignment for Injecting Bias in LLMs

Md Abdullah Al Mamun

Nael B. Abu-Ghazaleh

28 Aug 2025

The Geometry of Harmfulness in LLMs through Subconcept Probing

Saleena Angeline

Adhitya Rajendra Kumar

23 Jul 2025

Interpretation Meets Safety: A Survey on Interpretation Methods and Tools for Improving LLM Safety

Duen Horng Chau

05 Jun 2025

From Rogue to Safe AI: The Role of Explicit Refusals in Aligning LLMs with International Humanitarian Law

Diana Teodora Găitan

Sergio Coronado

05 Jun 2025

From Directions to Cones: Exploring Multidimensional Representations of Propositional Facts in LLMs

27 May 2025
