Are Perceptually-Aligned Gradients a General Property of Robust Classifiers?

10/18/2019
by Simran Kaur, et al.

For a standard convolutional neural network, optimizing over the input pixels to maximize the score of some target class will generally produce a grainy-looking version of the original image. However, researchers have demonstrated that for adversarially-trained neural networks, this optimization produces images that uncannily resemble the target class. In this paper, we show that these "perceptually-aligned gradients" also occur under randomized smoothing, an alternative means of constructing adversarially-robust classifiers. Our finding suggests that perceptually-aligned gradients may be a general property of robust classifiers, rather than a specific property of adversarially-trained neural networks. We hope that our results will inspire research aimed at explaining this link between perceptually-aligned gradients and adversarial robustness.
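To make the optimization described above concrete, here is a minimal sketch, not the authors' code, of gradient ascent on the input pixels to maximize a target class score. It assumes a PyTorch classifier `model` mapping images in [0, 1] to class logits; the function name `maximize_class_score` and the parameters `noise_sd` and `n_samples` are illustrative. Setting `noise_sd > 0` averages the score over Gaussian-perturbed copies of the input, a rough Monte Carlo approximation of the randomized-smoothing objective mentioned in the abstract.

```python
import torch

def maximize_class_score(model, x, target_class, steps=100,
                         step_size=0.01, noise_sd=0.0, n_samples=8):
    """Gradient ascent in pixel space on the score of `target_class`.

    x: starting image tensor of shape (1, C, H, W) with values in [0, 1].
    noise_sd: if > 0, approximate the smoothed classifier by averaging the
        target-class score over Gaussian-perturbed copies of the input.
    """
    model.eval()
    x = x.clone().detach().requires_grad_(True)
    for _ in range(steps):
        if noise_sd > 0:
            # Monte Carlo estimate of the smoothed score: perturb the input
            # with Gaussian noise and average the target-class logit.
            noise = noise_sd * torch.randn(n_samples, *x.shape[1:], device=x.device)
            score = model(x + noise)[:, target_class].mean()
        else:
            score = model(x)[0, target_class]
        grad = torch.autograd.grad(score, x)[0]
        with torch.no_grad():
            # Normalized ascent step, then clamp back to a valid image range.
            x += step_size * grad / (grad.norm() + 1e-12)
            x.clamp_(0.0, 1.0)
    return x.detach()
```

Under this kind of procedure, a standard network typically yields a grainy perturbation of the starting image, whereas a robust model (adversarially trained or smoothed) tends to produce an image with recognizable features of the target class.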
