The Broad Optimality of Profile Maximum Likelihood

06/10/2019
by   Yi Hao, et al.

We study three fundamental statistical-learning problems: distribution estimation, property estimation, and property testing. We establish the profile maximum likelihood (PML) estimator as the first unified sample-optimal approach to a wide range of learning tasks. In particular, for every alphabet size k and desired accuracy ε:

Distribution estimation: under ℓ_1 distance, PML yields the optimal Θ(k/(ε² log k)) sample complexity for sorted-distribution estimation, and a PML-based estimator empirically outperforms the Good-Turing estimator on the actual distribution.

Additive property estimation: for a broad class of additive properties, the PML plug-in estimator uses just four times the sample size required by the best estimator to achieve roughly twice its error, with exponentially higher confidence.

α-Rényi entropy estimation: for integer α > 1, the PML plug-in estimator has the optimal k^{1-1/α} sample complexity; for non-integer α > 3/4, its sample complexity is lower than the state of the art.

Identity testing: in testing whether an unknown distribution is equal to, or at least ε-far from, a given distribution in ℓ_1 distance, a PML-based tester achieves the optimal sample complexity up to logarithmic factors of k.

With minor modifications, most of these results also hold for a near-linear-time computable variant of PML.
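As background for the results above, PML operates on a sample's profile: the multiset of multiplicities with which symbols appear, with the symbol labels discarded. The PML estimator outputs a distribution that maximizes the probability of observing that profile. The sketch below is only for intuition and is not the paper's algorithm: the finite candidate list and the exhaustive enumeration over all k^n sequences are illustrative assumptions that only work for tiny inputs (exact PML computation is hard in general, and the paper's near-linear-time variant is not shown here).

```python
import itertools
from collections import Counter

def profile(sample):
    """Profile of a sample: the sorted multiset of symbol multiplicities.
    E.g. 'abba' and 'cddc' both have profile (2, 2)."""
    return tuple(sorted(Counter(sample).values()))

def profile_probability(p, phi, n):
    """Exact probability that an i.i.d. size-n sample from p has profile phi,
    computed by brute-force enumeration of all len(p)**n sequences
    (feasible only for tiny alphabets and sample sizes)."""
    total = 0.0
    for seq in itertools.product(range(len(p)), repeat=n):
        if profile(seq) == phi:
            prob = 1.0
            for symbol in seq:
                prob *= p[symbol]
            total += prob
    return total

def pml_estimate(sample, candidates):
    """Brute-force PML: among the candidate distributions, return the one
    maximizing the probability of the observed profile."""
    phi = profile(sample)
    return max(candidates, key=lambda p: profile_probability(p, phi, len(sample)))

# Illustrative usage: the profile of [0, 0, 1, 2] is (1, 1, 2); the candidate
# grid below is a made-up stand-in for optimizing over all distributions.
sample = [0, 0, 1, 2]
candidates = [
    (1/3, 1/3, 1/3),
    (1/2, 1/4, 1/4),
    (0.6, 0.3, 0.1),
]
p_hat = pml_estimate(sample, candidates)
```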
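The plug-in estimators referenced in the abstract then simply evaluate the property of interest at the PML distribution. A minimal sketch, reusing the hypothetical p_hat from above and the standard α-Rényi entropy formula H_α(p) = log(Σ_i p_i^α)/(1 - α):

```python
import math

def renyi_entropy(p, alpha):
    """α-Rényi entropy of p (in nats); the α → 1 limit is Shannon entropy."""
    if alpha == 1:
        return -sum(q * math.log(q) for q in p if q > 0)
    return math.log(sum(q ** alpha for q in p)) / (1 - alpha)

# Plug-in estimate of the collision (α = 2) entropy via the PML distribution.
h_hat = renyi_entropy(p_hat, alpha=2)
```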
