
All scheduled dates:
Upcoming
TITLE: Locally Uniform Hashing
Hashing is a common technique used in data processing, with a strong impact on the time and resources spent on computation. Hashing also affects the applicability of theoretical results that often assume access to (unrealistic) uniform/fully-random hash functions. In this paper, we are concerned with designing hash functions that are practical and come with strong theoretical guarantees on their performance.
To this end, we present tornado tabulation hashing, which is simple, fast, and exhibits a certain full, local randomness property that provably makes diverse algorithms perform almost as if (abstract) fully-random hashing were used. For example, this includes classic linear probing, the widely used HyperLogLog algorithm of Flajolet, Fusy, Gandouet, and Meunier [AOFA'07] for counting distinct elements, and the one-permutation hashing of Li, Owen, and Zhang [NIPS'12] for large-scale machine learning. We also provide a very efficient solution to the classical problem of obtaining fully-random hashing on a fixed (but unknown to the hash function) set of n keys using O(n) space. As a consequence, we get more efficient implementations of the splitting trick of Dietzfelbinger and Rink [ICALP'09] and the succinct space uniform hashing of Pagh and Pagh [SICOMP'08].
Tornado tabulation hashing is based on a simple method to systematically break dependencies in tabulation-based hashing techniques.
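For readers unfamiliar with tabulation-based hashing, below is a minimal sketch of plain simple tabulation hashing, the classical scheme that tornado tabulation builds on: the key is split into characters, each character indexes its own table of random words, and the looked-up words are XORed together. The character count, table sizes, and the extra derived characters of the actual tornado construction are not shown; treat this purely as an illustration of the tabulation idea.

```python
import random

random.seed(1)

CHARS = 4  # view a 32-bit key as 4 eight-bit characters (illustrative choice)
# One lookup table of random 32-bit words per character position.
TABLE = [[random.getrandbits(32) for _ in range(256)] for _ in range(CHARS)]

def tab_hash(key: int) -> int:
    """Simple tabulation: XOR one random table entry per character of the key."""
    h = 0
    for i in range(CHARS):
        c = (key >> (8 * i)) & 0xFF   # extract the i-th character of the key
        h ^= TABLE[i][c]
    return h

print(hex(tab_hash(0xDEADBEEF)))
```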
Past
TITLE: Interior Point Methods with a Gradient Oracle
Gradient methods are popular optimization primitives, since they rely entirely on first-order information, which is usually easy to obtain. Nevertheless, many interesting problems, such as those appearing in conic optimization, involve hard constraints, which require more sophisticated techniques (e.g., cutting planes or interior point methods). These typically exploit higher-order information about the objective or the geometry of the domain, such as the Hessian matrix.
In this whiteboard talk, I will show how to overcome this difficulty by efficiently executing an interior point method that has no direct access to second-order derivatives and relies solely on gradient information.
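(This is not the construction from the talk, which goes through the machinery discussed next, but a familiar illustration of the general principle that second-order information can be extracted from gradient queries alone: a Hessian-vector product can be approximated by finite differences of the gradient, which is already enough to drive iterative solvers for Newton-type steps.)

```python
import numpy as np

def hess_vec(grad, x, v, eps=1e-6):
    """Approximate H(x) @ v using two gradient queries (central differences)."""
    return (grad(x + eps * v) - grad(x - eps * v)) / (2.0 * eps)

# Example: f(x) = 0.5 * ||x||^2 has gradient x and Hessian I, so H @ v == v.
grad = lambda x: x
x = np.array([1.0, 2.0, 3.0])
v = np.array([0.5, -1.0, 2.0])
print(hess_vec(grad, x, v))   # approximately [0.5, -1.0, 2.0]
```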
The key ingredient of this method is a beautiful idea of [Dunagan-Harvey '07], which shows how to solve linear systems of equations while simultaneously constructing a preconditioner. This approach can be leveraged in the context of interior point methods, where one needs to solve a sequence of linear systems with a slowly changing matrix.
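As a toy illustration of why a good preconditioner pays off across such a slowly changing sequence (this is emphatically not the Dunagan-Harvey routine, which builds the preconditioner while solving), the sketch below factors the first matrix once and then reuses it as a preconditioner for conjugate gradient on all nearby matrices in the sequence.

```python
import numpy as np

def pcg(matvec, b, precond, iters=200, tol=1e-10):
    """Preconditioned conjugate gradient for an SPD operator given as a matvec."""
    x = np.zeros_like(b)
    r = b.copy()
    z = precond(r)
    p = z.copy()
    for _ in range(iters):
        Ap = matvec(p)
        alpha = (r @ z) / (p @ Ap)
        x += alpha * p
        r_new = r - alpha * Ap
        if np.linalg.norm(r_new) < tol:
            break
        z_new = precond(r_new)
        beta = (r_new @ z_new) / (r @ z)
        r, z = r_new, z_new
        p = z + beta * p
    return x

rng = np.random.default_rng(0)
M = rng.standard_normal((50, 50))
A0 = M @ M.T + 50 * np.eye(50)           # SPD base matrix
L = np.linalg.cholesky(A0)               # preconditioner built once from A0

def precond(r):                           # apply A0^{-1} via the factorization
    return np.linalg.solve(L.T, np.linalg.solve(L, r))

for k in range(5):                        # slowly changing sequence A_k
    Ak = A0 + 0.01 * k * np.eye(50)
    xk = pcg(lambda v: Ak @ v, rng.standard_normal(50), precond)
```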
As a consequence, we can compute high-precision solutions to linear optimization problems over (reasonably well-conditioned) n-dimensional convex bodies using Õ(n) gradient queries to an appropriate barrier function, plus an extra cubic runtime overhead.
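For concreteness, the standard logarithmic barrier for a polytope {x : Ax <= b} is the kind of function whose gradient such a method would query; the sketch below only evaluates this barrier and its gradient, and the specific well-conditioned barrier assumed in the talk may differ.

```python
import numpy as np

def log_barrier(A, b, x):
    """Logarithmic barrier for the polytope {x : A x <= b}."""
    s = b - A @ x                       # constraint slacks, must stay positive
    return -np.sum(np.log(s))

def log_barrier_grad(A, b, x):
    """Gradient of the barrier: sum_i a_i / (b_i - a_i^T x) = A^T (1 / s)."""
    s = b - A @ x
    return A.T @ (1.0 / s)

# A path-following method for min c^T x over {A x <= b} would minimize
# t * c^T x + log_barrier(A, b, x) for increasing values of t.
```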