ResearchTrend.AI


arXiv:2508.14904
Efficient Switchable Safety Control in LLMs via Magic-Token-Guided Co-Training

12 August 2025
Jianfeng Si
Lin Sun
Zhewen Tan
Xiangzheng Zhang

Papers citing "Efficient Switchable Safety Control in LLMs via Magic-Token-Guided Co-Training"

1 / 1 papers shown
Inoculation Prompting: Instructing LLMs to misbehave at train-time improves test-time alignment
Nevan Wichers
Aram Ebtekar
Ariana Azarbal
Victor Gillioz
Christine Ye
...
Neil Rathi
Henry Sleight
Alex Mallen
Fabien Roger
Samuel Marks
06 Oct 2025