Verifiable Data Science via Interactive Proofs: What's Possible and What Isn't

Abstract

When and how can we guarantee that the conclusions arrived at by a complicated and expensive data analysis are correct? A sequence of recent works explores the possibility of constructing interactive proof systems that can verify the conclusions using less data and computation than would be needed to replicate the analysis. We would also like the resources needed to construct the proof to be as small as possible (ideally close to the cost of simply performing the analysis).

I will survey this line of work, highlighting two recent results:
- On the positive side, we construct protocols that allow efficient verification for a rich class of analyses
- On the negative side, we show that constructing the proof can require significantly more data than would be needed to simply run the analysis, showing that this is true for several natural data analysis tasks

Based on joint works with Tal Herman

Verifiable Data Science via Interactive Proofs: What's Possible and What Isn't

Abstract

Video Recording