2508.14904
Cited By
v1
v2 (latest)
Efficient Switchable Safety Control in LLMs via Magic-Token-Guided Co-Training
12 August 2025
Jianfeng Si
Lin Sun
Zhewen Tan
Xiangzheng Zhang
MU
Papers citing "Efficient Switchable Safety Control in LLMs via Magic-Token-Guided Co-Training"
1 / 1 papers shown
Inoculation Prompting: Instructing LLMs to misbehave at train-time improves test-time alignment
Nevan Wichers
Aram Ebtekar
Ariana Azarbal
Victor Gillioz
Christine Ye
...
Neil Rathi
Henry Sleight
Alex Mallen
Fabien Roger
Samuel Marks
06 Oct 2025