1904.03750
JumpReLU: A Retrofit Defense Strategy for Adversarial Attacks
7 April 2019
N. Benjamin Erichson
Z. Yao
Michael W. Mahoney
AAML
Papers citing
"JumpReLU: A Retrofit Defense Strategy for Adversarial Attacks"
Rethinking Evaluation of Sparse Autoencoders through the Representation of Polysemous Words
Gouki Minegishi
Hiroki Furuta
Yusuke Iwasawa
Y. Matsuo
49
1
0
09 Jan 2025
Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models
Javier Ferrando
Oscar Obeso
Senthooran Rajamanoharan
Neel Nanda
77
10
0
21 Nov 2024
A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models
Daking Rai
Yilun Zhou
Shi Feng
Abulhair Saparov
Ziyu Yao
75
19
0
02 Jul 2024
Kryptonite: An Adversarial Attack Using Regional Focus
Yogesh Kulkarni
Krisha Bhambani
AAML
19
3
0
23 Aug 2021
Adversarial examples in the physical world
Alexey Kurakin
Ian Goodfellow
Samy Bengio
SILM
AAML
257
5,833
0
08 Jul 2016