We consider the following scenario motivated by the rise of cloud computing. A client needs to process a massive stream of data, arriving online, but is unable to store even a small fraction of the data. It outsources the processing to a commercial cloud computing service, but is unwilling to blindly trust answers returned by this service. How can we design a suitable communication protocol that allows the server to not just provide the desired answer but prove to the client that the answer is correct?
This general question has been the subject of several recent research works. In this talk we shall focus on one particular problem of great practical interest: Nearest Neighbour Search. Our main protocol allows a client to find, with proof (in the above sense) the nearest neighbour to a query point in a stream of data points, while storing barely more than a constant number of points. In fact, the protocol we design is considerably more general, allowing us to solve many other data streaming problems, and leading to a rich theory within communication complexity, known as Arthur-Merlin communication. The talk will outline this theory broadly.
Based on joint work with Cormode, McGregor, Thaler, and Venkatasubramanian.