2501.18841
Cited By
Trading Inference-Time Compute for Adversarial Robustness
31 January 2025
Wojciech Zaremba
Evgenia Nitishinskaya
Boaz Barak
Stephanie Lin
Sam Toyer
Yaodong Yu
Rachel Dias
Eric Wallace
Kai Y. Xiao
Johannes Heidecke
Amelia Glaese
LRM
AAML
Papers citing "Trading Inference-Time Compute for Adversarial Robustness"
10 / 10 papers shown
Title
A Cryptographic Perspective on Mitigation vs. Detection in Machine Learning
Greg Gluch
Shafi Goldwasser
AAML
37
0
0
28 Apr 2025
Safety in Large Reasoning Models: A Survey
Cheng Wang
Yue Liu
B. Li
Duzhen Zhang
Z. Li
Junfeng Fang
Bryan Hooi
LRM
151
1
0
24 Apr 2025
Heimdall: test-time scaling on the generative verification
Wenlei Shi
Xing Jin
LRM
24
0
0
14 Apr 2025
AdaSteer: Your Aligned LLM is Inherently an Adaptive Jailbreak Defender
Weixiang Zhao
Jiahe Guo
Yulin Hu
Yang Deng
An Zhang
...
Xinyang Han
Yanyan Zhao
Bing Qin
Tat-Seng Chua
Ting Liu
AAML
LLMSV
43
0
0
13 Apr 2025
Steering Dialogue Dynamics for Robustness against Multi-turn Jailbreaking Attacks
Hanjiang Hu
Alexander Robey
Changliu Liu
AAML
LLMSV
47
1
0
28 Feb 2025
Forecasting Rare Language Model Behaviors
Erik Jones
Meg Tong
Jesse Mu
Mohammed Mahfoud
Jan Leike
Roger C. Grosse
Jared Kaplan
William Fithian
Ethan Perez
Mrinank Sharma
47
2
0
24 Feb 2025
Jailbreaking to Jailbreak
Jeremy Kritz
Vaughn Robinson
Robert Vacareanu
Bijan Varjavand
Michael Choi
Bobby Gogov
Scale Red Team
Summer Yue
Willow Primack
Zifan Wang
207
1
0
09 Feb 2025
Adversarial ML Problems Are Getting Harder to Solve and to Evaluate
Javier Rando
Jie Zhang
Nicholas Carlini
F. Tramèr
AAML
ELM
61
3
0
04 Feb 2025
OverThink: Slowdown Attacks on Reasoning LLMs
A. Kumar
Jaechul Roh
A. Naseh
Marzena Karpinska
Mohit Iyyer
Amir Houmansadr
Eugene Bagdasarian
LRM
62
14
0
04 Feb 2025
Auction-Based Regulation for Artificial Intelligence
Marco Bornstein
Zora Che
Suhas Julapalli
Abdirisak Mohamed
Amrit Singh Bedi
Furong Huang
30
0
0
02 Oct 2024