Laura Balzano (University of Michigan)
We discuss a novel approach to the subspace clustering problem that leverages ensembles of the K-subspaces (KSS) algorithm with random initializations. We extend existing results in subspace clustering to show that correct clustering can be achieved by any algorithm that obtains (possibly perturbed) measurements of any monotonic function of the inner product between points -- so it is possible to cluster with missing data, compressed data, or in other scenarios where only an approximation of inner products is available. We then show that a single iteration of KSS with random initialization produces such a measurement, and hence our algorithm can provably recover subspaces under the same conditions as state-of-the-art algorithms. The finding is, to the best of our knowledge, the first recovery guarantee for evidence accumulation clustering and for KSS variants. It also provides insight into random initializations of subspace arrangements. We show on synthetic data that our method performs well in the traditionally challenging settings of subspaces with large intersection, subspaces with small principal angles, and noisy data. Finally, we show our algorithm achieves state-of-the-art performance across several benchmark datasets, including a resulting error for the COIL-20 database that is less than half that achieved by existing algorithms.