Minimax Lower Bounds for Ridge Combinations Including Neural Nets

02/09/2017
by   Jason M. Klusowski, et al.

Estimation of functions of d variables is considered using ridge combinations of the form ∑_{k=1}^m c_{1,k} ϕ(∑_{j=1}^d c_{0,j,k} x_j − b_k), where the activation function ϕ has bounded value and derivative. These include single-hidden-layer neural networks, polynomials, and sinusoidal models. From a sample of size n of possibly noisy values at random sites X ∈ B = [−1,1]^d, the minimax mean square error is examined for functions in the closure of the ℓ_1 hull of ridge functions with activation ϕ. It is shown to be of order d/n to a fractional power (when d is of smaller order than n), and of order (log d)/n to a fractional power (when d is of larger order than n). Dependence on the constraints v_0 and v_1 on the ℓ_1 norms of the inner parameters c_0 and outer parameters c_1, respectively, is also examined, and lower and upper bounds on the fractional power are given. The heart of the analysis is the development of information-theoretic packing numbers for these classes of functions.
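To make the function class concrete, the following is a minimal sketch of evaluating a ridge combination ∑_k c_{1,k} ϕ(∑_j c_{0,j,k} x_j − b_k) with NumPy. The choice of tanh as the activation ϕ, and all parameter values, are illustrative assumptions (tanh has bounded value and derivative, as the abstract requires); this is not code from the paper.

```python
import numpy as np

def ridge_combination(x, c0, c1, b, phi=np.tanh):
    """Evaluate f(x) = sum_k c1[k] * phi(c0[:, k] @ x - b[k]).

    x   : (d,)   input vector in B = [-1, 1]^d
    c0  : (d, m) inner parameters, one column per ridge unit
    c1  : (m,)   outer parameters
    b   : (m,)   offsets
    phi : activation with bounded value and derivative (tanh here)
    """
    # x @ c0 computes all m inner products at once; phi is applied
    # elementwise, then the outer weights c1 combine the units.
    return float(c1 @ phi(x @ c0 - b))

# Illustrative single-hidden-layer network: d = 2 inputs, m = 3 units.
rng = np.random.default_rng(0)
d, m = 2, 3
c0 = rng.normal(size=(d, m))
c1 = rng.normal(size=m)
b = rng.normal(size=m)
x = np.array([0.5, -0.25])
print(ridge_combination(x, c0, c1, b))
```

With ϕ = tanh this is exactly a single-hidden-layer neural network; choosing ϕ(z) = z^p or ϕ(z) = sin(z) instead recovers the polynomial and sinusoidal cases mentioned above. The ℓ_1 constraints v_0 and v_1 would bound np.abs(c0[:, k]).sum() for each unit and np.abs(c1).sum(), respectively.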
