Monte Carlo approximation certificates for k-means clustering

10/03/2017
by Dustin G. Mixon, et al.

Efficient algorithms for k-means clustering frequently converge to suboptimal partitions, and given a partition, it is difficult to detect k-means optimality. In this paper, we develop an a posteriori certifier of approximate optimality for k-means clustering. The certifier is a sub-linear Monte Carlo algorithm based on Peng and Wei's semidefinite relaxation of k-means. In particular, solving the relaxation for small random samples of the dataset produces a high-confidence lower bound on the k-means objective, and being sub-linear, our algorithm is faster than k-means++ when the number of data points is large. We illustrate the performance of our algorithm with both numerical experiments and a performance guarantee: if the data points are drawn independently from any mixture of two Gaussians over R^m with identity covariance, then with probability 1-O(1/m), our poly(m)-time algorithm produces a 3-approximation certificate with 99% confidence.
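To make the subsampling idea concrete, below is a minimal Python sketch of how a lower bound of this kind could be assembled: solve the Peng-Wei SDP relaxation on small random subsamples, rescale, and combine the values conservatively. The relaxation is modeled with cvxpy (an SDP-capable solver such as SCS is assumed to be installed). The sample size, number of Monte Carlo trials, rescaling factor, and the quantile used to combine the per-sample values are illustrative choices, not the constants or the concentration argument used in the paper.

```python
# Sketch only: subsample-based lower bound via the Peng-Wei SDP relaxation.
# The way per-sample values are scaled and combined is an illustrative
# simplification, not the paper's exact certificate construction.
import numpy as np
import cvxpy as cp

def kmeans_sdp_value(points, k):
    """Optimal value of the Peng-Wei SDP relaxation of k-means on `points`."""
    n = len(points)
    # D[i, j] = squared Euclidean distance between points i and j.
    sq_norms = np.sum(points ** 2, axis=1)
    D = sq_norms[:, None] + sq_norms[None, :] - 2 * points @ points.T
    Z = cp.Variable((n, n), PSD=True)
    constraints = [
        Z >= 0,                      # entrywise nonnegativity
        cp.sum(Z, axis=1) == 1,      # each row sums to one
        cp.trace(Z) == k,            # trace equals the number of clusters
    ]
    prob = cp.Problem(cp.Minimize(0.5 * cp.trace(D @ Z)), constraints)
    prob.solve()
    return prob.value

def monte_carlo_lower_bound(data, k, sample_size=50, trials=30, seed=None):
    """Heuristic lower-bound estimate for the full k-means objective,
    built from SDP values of random subsamples scaled up to n points."""
    rng = np.random.default_rng(seed)
    n = len(data)
    vals = []
    for _ in range(trials):
        idx = rng.choice(n, size=sample_size, replace=False)
        vals.append(kmeans_sdp_value(data[idx], k) * n / sample_size)
    # Take a low quantile of the scaled values as a conservative estimate.
    return np.quantile(vals, 0.05)
```

As a usage note, if a partition produced by k-means++ has objective value V and the routine above returns a lower bound L, then V/L bounds the approximation ratio of that partition; a ratio of at most 3 would correspond to a 3-approximation certificate in the sense of the abstract, subject to the probabilistic guarantee established in the paper rather than the heuristic quantile used here.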
