Data Science involves complex processing over large-scale data for decision support, and much of this processing is done by black boxes such as Data Cleaning Modules, Database Management Systems, and Machine Learning modules. Decision support should be transparent, but the combination of complex computation and large-scale data poses many challenges in this respect. Interpretability has been extensively studied in both the data management and the machine learning communities, but the problem is far from solved. I will present a holistic approach to the problem based on two facets, namely counterfactual explanations and attribution-based explanations. I will demonstrate the conceptual and computational challenges, as well as some of the main results we have achieved in this context.
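To give a flavor of attribution-based explanations, here is a minimal sketch of Shapley-value attribution over database facts for a Boolean query answer. The toy facts and the reachability query are illustrative assumptions, not taken from the papers below; the sketch uses exact enumeration over all orderings, whereas computing Shapley values at scale is precisely where the computational challenges arise.

```python
from itertools import permutations

# Toy database of facts (assumed for illustration only).
facts = {"edge(a,b)", "edge(b,c)", "edge(a,c)"}

def query(subset):
    """Boolean query: is there a path a -> c using only the given facts?"""
    direct = "edge(a,c)" in subset
    two_hop = "edge(a,b)" in subset and "edge(b,c)" in subset
    return 1 if (direct or two_hop) else 0

def shapley(fact):
    """Exact Shapley value of `fact`: its average marginal contribution
    to the query answer over all orderings of the facts."""
    perms = list(permutations(facts))
    total = 0.0
    for order in perms:
        seen = set()
        for f in order:
            if f == fact:
                total += query(seen | {fact}) - query(seen)
                break
            seen.add(f)
    return total / len(perms)

for f in sorted(facts):
    print(f, shapley(f))
```

Here the direct edge edge(a,c) receives attribution 2/3 while each of edge(a,b) and edge(b,c) receives 1/6, and the values sum to the query answer, illustrating why Shapley values are an appealing attribution measure for facts in query answering.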

Relevant papers:

Daniel Deutch, Amir Gilad, Tova Milo, Amit Mualem, Amit Somech:
FEDEX: An Explainability Framework for Data Exploration Steps. Proc. VLDB Endow. 15(13): 3854-3868 (2022)

Daniel Deutch, Nave Frost, Benny Kimelfeld, Mikaël Monet:
Computing the Shapley Value of Facts in Query Answering. SIGMOD 2022

Daniel Deutch, Nave Frost:
Constraints-Based Explanations of Classifications. ICDE 2019

Video Recording