Abstract

Given a dataset of vectors and a kernel function, we give a data structure for density queries: given a vector, output an approximation to the sum of pairwise kernel evaluations. Our main result is a new way to combine discrepancy theory and LSH to build sparse coresets and speed up the best known data structures for smooth kernels. Our work combines two lines of works, by Backurs, Indyk, Charikar, Siminelakis [2018] and Phillips and Tai [2019] and shows how to obtain the best of both. Joint work with Moses Charikar and Michael Kapralov: https://arxiv.org/abs/2401.02562.