Abstract

Consider an instance of Euclidean k-means or k-medians clustering. We prove that a dimension reduction projection onto a space of dimension d ~ log k preserves the cost of the optimal clustering within a factor of 1 + epsilon w.h.p. Crucially, the dimension d does not depend on the total number of points n in the instance. The result also applies to other variants of the k-clustering problem. The result strengthens the result by Cohen, Elder, Musco, Musco, and Persu, who showed that the value of k-means is approximately preserved when d ~ k. No bounds on d were previously known for k-medians. Joint work with Konstantin Makarychev and Ilya Razenshteyn.

Video Recording