
Adversarial Examples in Random Neural Networks with General Activations

Abstract

A substantial body of empirical work documents the lack of robustness in deep learning models to adversarial examples. Recent theoretical work proved that adversarial examples are ubiquitous in two-layer networks with sub-exponential width and ReLU or smooth activations, and in multi-layer ReLU networks with sub-exponential width. We present a result of the same type, with no restriction on width and for general locally Lipschitz continuous activations. More precisely, given a neural network $f(\,\cdot\,;{\boldsymbol \theta})$ with random weights ${\boldsymbol \theta}$ and a feature vector ${\boldsymbol x}$, we show that an adversarial example ${\boldsymbol x}'$ can be found with high probability along the direction of the gradient $\nabla_{{\boldsymbol x}}f({\boldsymbol x};{\boldsymbol \theta})$. Our proof is based on a Gaussian conditioning technique. Instead of proving that $f$ is approximately linear in a neighborhood of ${\boldsymbol x}$, we characterize the joint distribution of $f({\boldsymbol x};{\boldsymbol \theta})$ and $f({\boldsymbol x}';{\boldsymbol \theta})$ for ${\boldsymbol x}' = {\boldsymbol x}-s({\boldsymbol x})\nabla_{{\boldsymbol x}}f({\boldsymbol x};{\boldsymbol \theta})$.
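The construction above perturbs ${\boldsymbol x}$ along the gradient of the network output. Below is a minimal sketch of this single-step construction on a random two-layer network; the width, tanh activation, weight scaling, and the specific choice of step size $s({\boldsymbol x})$ are illustrative assumptions of the sketch, not the paper's exact setting.

```python
import jax
import jax.numpy as jnp

d, m = 256, 1024                       # input dimension and hidden width (assumed)
k1, k2, k3 = jax.random.split(jax.random.PRNGKey(0), 3)

# Random weights theta = (W, a), scaled so that f(x; theta) = O(1) when |x| ~ sqrt(d)
W = jax.random.normal(k1, (m, d)) / jnp.sqrt(d)
a = jax.random.normal(k2, (m,)) / jnp.sqrt(m)

def f(x):
    """Two-layer network with a locally Lipschitz activation (tanh, for illustration)."""
    return jnp.dot(a, jnp.tanh(W @ x))

x = jax.random.normal(k3, (d,))        # feature vector with norm of order sqrt(d)
g = jax.grad(f)(x)

# Step along the gradient direction: x' = x - s(x) * grad_x f(x; theta).
# Here s(x) is chosen from a first-order expansion so that f(x') is pushed
# toward -f(x); this heuristic step size is an assumption of the sketch.
s = 2.0 * f(x) / (jnp.linalg.norm(g) ** 2 + 1e-12)
x_adv = x - s * g

print(f"f(x)  = {float(f(x)):+.4f}")
print(f"f(x') = {float(f(x_adv)):+.4f}")   # typically close to -f(x)
print(f"relative perturbation: {float(jnp.linalg.norm(x_adv - x) / jnp.linalg.norm(x)):.4f}")
```

On typical draws the output changes sign while the perturbation is small relative to $\|{\boldsymbol x}\|$, which is the qualitative behavior described in the abstract.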
