
Fast Differentiable Clipping-Aware Normalization and Rescaling

Abstract

Rescaling a vector $\vec{\delta} \in \mathbb{R}^n$ to a desired length is a common operation in many areas such as data science and machine learning. When the rescaled perturbation $\eta \vec{\delta}$ is added to a starting point $\vec{x} \in D$ (where $D$ is the data domain, e.g. $D = [0, 1]^n$), the resulting vector $\vec{v} = \vec{x} + \eta \vec{\delta}$ will in general not be in $D$. To enforce that the perturbed vector $\vec{v}$ is in $D$, the values of $\vec{v}$ can be clipped to $D$. This subsequent element-wise clipping to the data domain does however reduce the effective perturbation size and thus interferes with the rescaling of $\vec{\delta}$. The optimal rescaling $\eta$ to obtain a perturbation with the desired norm after the clipping can be iteratively approximated using a binary search. However, such an iterative approach is slow and non-differentiable. Here we show that the optimal rescaling can be found analytically using a fast and differentiable algorithm. Our algorithm works for any p-norm and can be used to train neural networks on inputs with normalized perturbations. We provide native implementations for PyTorch, TensorFlow, JAX, and NumPy based on EagerPy.
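To make the problem concrete, the following is a minimal NumPy sketch of the iterative binary-search baseline that the abstract contrasts against. It is not the analytic algorithm proposed in the paper. The function names `clipped_norm` and `rescale_binary_search`, the default domain $[0, 1]^n$, and the number of bisection steps are illustrative assumptions. The sketch relies on the fact that, for $\vec{x} \in D$, the norm of the clipped perturbation is non-decreasing in $\eta$.

```python
import numpy as np


def clipped_norm(x, delta, eta, lo=0.0, hi=1.0, p=2):
    """p-norm of the effective perturbation after clipping x + eta * delta to [lo, hi]."""
    v = np.clip(x + eta * delta, lo, hi)
    return np.linalg.norm(v - x, ord=p)


def rescale_binary_search(x, delta, target, lo=0.0, hi=1.0, p=2, steps=60):
    """Approximate eta >= 0 such that the clipped perturbation has norm `target`.

    Hypothetical helper for illustration: this is the slow, non-differentiable
    baseline, not the paper's analytic method. If `target` exceeds the largest
    norm reachable inside the domain, the returned eta is simply very large.
    """
    eta_lo, eta_hi = 0.0, 1.0
    # Grow the upper bracket until the clipped norm reaches the target
    # (or give up after a fixed number of doublings if it saturates).
    for _ in range(60):
        if clipped_norm(x, delta, eta_hi, lo, hi, p) >= target:
            break
        eta_hi *= 2.0
    # Bisect: the clipped norm is monotonically non-decreasing in eta.
    for _ in range(steps):
        mid = 0.5 * (eta_lo + eta_hi)
        if clipped_norm(x, delta, mid, lo, hi, p) < target:
            eta_lo = mid
        else:
            eta_hi = mid
    return eta_hi


x = np.array([0.5, 0.9, 0.1])
delta = np.array([1.0, 1.0, -1.0])
target = 0.3

# Naive rescaling ignores the clipping, so the effective norm falls short.
eta_naive = target / np.linalg.norm(delta)
print(clipped_norm(x, delta, eta_naive))

# Clipping-aware rescaling via binary search recovers the desired norm.
eta = rescale_binary_search(x, delta, target)
print(clipped_norm(x, delta, eta))
```

In this example two of the three components hit the domain boundary, so the naively rescaled perturbation ends up shorter than requested, while the binary search compensates with a larger $\eta$. Each bisection step needs a full clip-and-norm evaluation, and the result is not differentiable with respect to $\vec{x}$ or $\vec{\delta}$, which is the limitation the analytic approach is meant to remove.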
