Minimax Rates for Batched Stochastic Optimization

Abstract

We study the stochastic batched convex optimization problem, in which many parallel observations are used to optimize a convex function within a limited number of rounds of interaction. In each of M rounds, an algorithm may query for information at n points; only after issuing all n queries does it receive unbiased noisy function and/or (sub)gradient evaluations at those points. After M such rounds, the algorithm must output an estimator. We provide lower and upper bounds on the performance of such batched convex optimization algorithms in zeroth- and first-order settings for the classes of Lipschitz convex and smooth strongly convex functions. The rates we provide exhibit two interesting phenomena: (1) in terms of the batch size n, the rate of convergence of batched algorithms (nearly) achieves the conventional fully sequential rate once M = O(d log log n), where d is the dimension of the domain, while (2) the rate may degrade exponentially as the dimension d increases, in contrast to fully sequential settings.
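To make the interaction protocol concrete, the following is a minimal sketch of the first-order batched setting. It is not the algorithm analyzed in the paper: the objective f(x) = 0.5 * ||x||^2, the Gaussian noise model, the step size, and the names `noisy_gradient` and `run_batched` are all assumptions chosen for illustration. The sketch only shows the structure described above: n queries are fixed before any feedback arrives, feedback is noisy and unbiased, and an estimator is returned after M rounds.

```python
import random


def noisy_gradient(x, rng, sigma=1.0):
    # Assumed objective: f(x) = 0.5 * ||x||^2, whose gradient is x.
    # Independent zero-mean Gaussian noise keeps the evaluation unbiased.
    return [xi + rng.gauss(0.0, sigma) for xi in x]


def run_batched(d, n, M, step=0.5, seed=0):
    """Run M rounds with n parallel queries per round (illustrative only)."""
    rng = random.Random(seed)
    x = [1.0] * d  # hypothetical starting point
    for _ in range(M):
        # All n query points are chosen before any feedback is seen.
        # Here they are simply n copies of the current iterate.
        queries = [list(x) for _ in range(n)]
        # Feedback for the whole batch arrives at once.
        grads = [noisy_gradient(q, rng) for q in queries]
        # Averaging the n noisy gradients reduces the noise variance by 1/n.
        avg = [sum(g[i] for g in grads) / n for i in range(d)]
        x = [xi - step * gi for xi, gi in zip(x, avg)]
    return x  # the estimator output after M rounds


if __name__ == "__main__":
    estimate = run_batched(d=3, n=1000, M=5)
    print(sum(v * v for v in estimate))
```

Querying n copies of the same point is the simplest possible batching strategy and is used here only to keep the sketch short; the protocol itself allows the n points within a round to differ.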
