Concentration Based Inference in High Dimensional Generalized Regression Models (I: Statistical Guarantees)

08/17/2018
by Ying Zhu, et al.

We develop simple and non-asymptotically justified methods for hypothesis testing about the coefficients (θ^* ∈ R^p) in high dimensional generalized regression models where p can exceed the sample size n. Given a function h: R^p → R^m, we consider H_0: h(θ^*) = 0_m against H_1: h(θ^*) ≠ 0_m, where m can be any integer in [1, p] and h can be nonlinear in θ^*. Our test statistic is based on the sample "quasi score" vector evaluated at an estimate θ̂_α that satisfies h(θ̂_α) = 0_m, where α is the prespecified Type I error. By exploiting the concentration phenomenon in Lipschitz functions, the key component reflecting the dimension complexity in our non-asymptotic thresholds uses a Monte-Carlo approximation to mimic the expectation around which the statistic concentrates; this approximation automatically captures the dependence between the coordinates. We provide probabilistic guarantees on the Type I and Type II errors of the quasi score test. Confidence regions are also constructed for the population quasi-score vector evaluated at θ^*. The first set of our results is specific to standard Gaussian linear regression models; the second set allows for reasonably flexible forms of non-Gaussian responses, heteroscedastic noise, and nonlinearity in the regression coefficients, while requiring only the correct specification of the conditional means E(Y_i | X_i). The novelty of our methods is that their validity does not rely on good behavior of ‖θ̂_α − θ^*‖_2 (or even n^{-1/2}‖X(θ̂_α − θ^*)‖_2 in the linear regression case), either non-asymptotically or asymptotically.
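To make the construction concrete, here is a minimal sketch of a quasi-score test in the Gaussian linear regression case, not the paper's exact procedure: it assumes the simple null h(θ) = θ − θ_0 (so the constrained estimate θ̂_α is just θ_0), a known noise level σ, and sup-norm aggregation of the score vector, and it replaces the paper's non-asymptotic concentration threshold with a plain Monte-Carlo quantile of the null statistic. The function quasi_score_test and all parameter names are hypothetical.

```python
import numpy as np

def quasi_score_test(X, y, theta0, sigma, alpha=0.05, n_mc=10_000, seed=0):
    """Sketch of a quasi-score test for H0: theta* = theta0 in the
    Gaussian linear model y = X theta* + eps, eps ~ N(0, sigma^2 I_n).

    The sample quasi-score n^{-1} X^T (y - X theta0) is aggregated with
    the sup-norm; the critical value is the Monte-Carlo (1 - alpha)
    quantile of the same statistic under the null, which captures the
    dependence between the p coordinates (an assumption-laden stand-in
    for the paper's non-asymptotic thresholds).
    """
    n, p = X.shape
    score = X.T @ (y - X @ theta0) / n            # sample quasi-score at theta0
    stat = np.abs(score).max()                    # sup-norm aggregation

    # Under H0, y - X theta0 equals the noise, so draw eps ~ N(0, sigma^2 I_n)
    # and recompute the statistic to approximate its null distribution.
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma, size=(n, n_mc))
    null_stats = np.abs(X.T @ eps / n).max(axis=0)
    threshold = np.quantile(null_stats, 1 - alpha)
    return stat, threshold, stat > threshold

# Usage: a p > n example with the null true (theta* = theta0 = 0).
rng = np.random.default_rng(1)
X = rng.standard_normal((100, 500))
y = rng.standard_normal(100)                      # y = X @ 0 + eps
stat, thr, reject = quasi_score_test(X, y, np.zeros(500), sigma=1.0)
print(f"statistic = {stat:.4f}, threshold = {thr:.4f}, reject H0: {reject}")
```

Because the Monte-Carlo draws reuse the observed design X, the simulated quantile reflects the correlation structure among the p score coordinates, unlike a Bonferroni-style union bound; this is one way to read the abstract's claim that the approximation "automatically captures the dependence between the coordinates."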
