Abstract

We revisit Nisan's classical pseudorandom generator (PRG) for space-bounded computation (STOC 1990) and its applications in streaming algorithms. We describe a new generator, HashPRG, that can be thought of as a symmetric version of Nisan's generator over larger alphabets. Our generator allows a trade-off between seed length and the time needed to compute a given block of the generator's output. HashPRG can be used to obtain derandomizations with much better update time and \emph{without sacrificing space} for a large number of data stream algorithms, such as Fp estimation in the parameter regimes p>2 and 0<p<2 and CountSketch with tight estimation guarantees as analyzed by Minton and Price (SODA 2014) which assumed access to a random oracle. We also show a recent analysis of Private CountSketch can be derandomized using our techniques.

For a d-dimensional vector x being updated in a turnstile stream, we show that ‖x‖∞ can be estimated up to an additive error of ε‖x‖2 using O(ε−2log(1/ε)logd) bits of space. Additionally, the update time of this algorithm is O(log1/ε) in the Word RAM model. We show that the space complexity of this algorithm is optimal up to constant factors. However, for vectors x with ‖x‖∞=Θ(‖x‖2), we show that the lower bound can be broken by giving an algorithm that uses O(ε−2logd) bits of space which approximates ‖x‖∞ up to an additive error of ε‖x‖2. We use our aforementioned derandomization of the CountSketch data structure to obtain this algorithm, and using the time-space trade off of HashPRG, we show that the update time of this algorithm is also O(log1/ε) in the Word RAM model.

Video Recording