Analysis of the rate of convergence of an over-parametrized deep neural network estimate learned by gradient descent

10/04/2022
by Michael Kohler, et al.

Estimation of a regression function from independent and identically distributed random variables is considered. The L_2 error with integration with respect to the design measure is used as the error criterion. Over-parametrized deep neural network estimates are defined, where all the weights are learned by gradient descent. It is shown that the expected L_2 error of these estimates converges to zero at a rate close to n^{-1/(1+d)} in the case that the regression function is Hölder smooth with Hölder exponent p ∈ [1/2,1]. In the case of an interaction model, where the regression function is assumed to be a sum of Hölder smooth functions each of which depends on only d^* of the d components of the design variable, it is shown that these estimates achieve the corresponding d^*-dimensional rate of convergence.
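The setting described in the abstract can be illustrated with a minimal sketch (not the authors' construction): a wide, fully connected network is fitted to i.i.d. regression data by plain gradient descent on the squared loss, and the L_2 error with respect to the design measure is approximated by Monte Carlo on fresh design samples. The regression function m, the input dimension, the network widths, the step size, and the number of gradient steps below are all illustrative assumptions.

```python
# Minimal sketch of over-parametrized regression by gradient descent
# (illustrative only; not the estimate analyzed in the paper).
import torch

torch.manual_seed(0)

d, n, width = 5, 200, 2000                              # width >> n: over-parametrized
m = lambda x: torch.sin(x.sum(dim=1, keepdim=True))     # hypothetical smooth regression function

X = torch.rand(n, d)                                    # i.i.d. design variables (assumed uniform on [0,1]^d)
Y = m(X) + 0.1 * torch.randn(n, 1)                      # noisy responses

net = torch.nn.Sequential(                              # deep fully connected network, all weights trainable
    torch.nn.Linear(d, width), torch.nn.ReLU(),
    torch.nn.Linear(width, width), torch.nn.ReLU(),
    torch.nn.Linear(width, 1),
)

opt = torch.optim.SGD(net.parameters(), lr=1e-2)        # plain full-batch gradient descent
loss_fn = torch.nn.MSELoss()

for step in range(5000):
    opt.zero_grad()
    loss_fn(net(X), Y).backward()
    opt.step()

# Monte Carlo estimate of the L_2 error E[(net(X) - m(X))^2]
# with integration with respect to the design measure (fresh design draws).
X_test = torch.rand(100_000, d)
with torch.no_grad():
    l2_error = ((net(X_test) - m(X_test)) ** 2).mean()
print(f"estimated L_2 error: {l2_error.item():.4f}")
```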
