On approximating ∇ f with neural networks

10/28/2019
by Saeed Saremi, et al.

Consider a feedforward neural network ψ: R^d → R^d such that ψ ≈ ∇f, where f: R^d → R is a smooth function; ψ must therefore satisfy ∂_j ψ_i = ∂_i ψ_j pointwise. We prove a theorem stating that for any such network ψ, and for any depth L > 2, all of its input weights must be parallel to each other. In other words, ψ can represent only one feature in its first hidden layer. The proof of the theorem is straightforward: two backward paths (from i to j and from j to i) and a weight-tying matrix (connecting the last and first hidden layers) play the key roles. We thus make a strong theoretical case in favor of the implicit parametrization, where the neural network is ϕ: R^d → R and ∇ϕ ≈ ∇f. Throughout, we revisit two recent unnormalized probabilistic models that are formulated as ψ ≈ ∇f, and we conclude with a discussion of denoising autoencoders.
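The contrast between the two parametrizations can be illustrated with a minimal sketch (not from the paper; the layer sizes, names, and random weights below are illustrative assumptions). Defining ψ as the automatic-differentiation gradient of a scalar network ϕ makes the Jacobian of ψ equal to the Hessian of ϕ, so the symmetry ∂_j ψ_i = ∂_i ψ_j holds by construction, whereas a generic vector-valued network carries no such constraint.

```python
# Minimal sketch in JAX: implicit parametrization psi = grad(phi) vs. an
# explicit vector-valued map. Sizes and weights below are arbitrary examples.
import jax
import jax.numpy as jnp

d, h = 3, 16
k1, k2, k3 = jax.random.split(jax.random.PRNGKey(0), 3)
W1 = jax.random.normal(k1, (h, d))
b1 = jnp.zeros(h)
w2 = jax.random.normal(k2, (h,))

def phi(x):
    """Scalar-valued network phi: R^d -> R (implicit parametrization)."""
    return jnp.dot(w2, jnp.tanh(W1 @ x + b1))

# psi = grad(phi): its Jacobian is the Hessian of phi, hence symmetric.
psi = jax.grad(phi)

x = jax.random.normal(k3, (d,))
J = jax.jacobian(psi)(x)                   # Hessian of phi at x
print(jnp.allclose(J, J.T, atol=1e-5))     # True: symmetry holds automatically

def psi_explicit(x):
    """A generic map R^d -> R^d with no gradient-field structure."""
    return jnp.tanh(W1 @ x + b1)[:d] * w2[:d]

J2 = jax.jacobian(psi_explicit)(x)
print(jnp.allclose(J2, J2.T, atol=1e-5))   # generally False
```

In this sketch the symmetry check on psi succeeds for any choice of weights, while the explicit map fails it except by coincidence, which is the practical motivation for parametrizing ∇f through a scalar ϕ rather than through ψ directly.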
