2508.17158
Cited By
Towards Safeguarding LLM Fine-tuning APIs against Cipher Attacks
23 August 2025
Jack Youstra
Mohammed Mahfoud
Yang Yan
Henry Sleight
Ethan Perez
Mrinank Sharma
AAML
Papers citing "Towards Safeguarding LLM Fine-tuning APIs against Cipher Attacks"
Detecting Adversarial Fine-tuning with Auditing Agents
Sarah Egler
John Schulman
Nicholas Carlini
AAML
MLAU
145
0
0
17 Oct 2025
All Code, No Thought: Current Language Models Struggle to Reason in Ciphered Language
Shiyuan Guo
Henry Sleight
Fabien Roger
ELM
LRM
133
0
0
10 Oct 2025