Abstract

Data Science involves complex processing over large-scale data for decision support, and much of this processing is done by black boxes such as Data Cleaning Modules, Database Management Systems, and Machine Learning modules. Decision support should be transparent, but the combination of complex computation and large-scale data poses many challenges in this respect. Interpretability has been extensively studied in both the data management and machine learning communities, but the problem is far from solved. I will present a holistic approach to the problem that is based on two facets, namely counterfactual explanations and attribution-based explanations. I will demonstrate the conceptual and computational challenges, as well as some of the main results we have achieved in this context.

Relevant papers:

Daniel Deutch, Amir Gilad, Tova Milo, Amit Mualem, Amit Somech:
FEDEX: An Explainability Framework for Data Exploration Steps. Proc. VLDB Endow. 15(13): 3854-3868 (2022)

Daniel Deutch, Nave Frost, Benny Kimelfeld, Mikaël Monet:
Computing the Shapley Value of Facts in Query Answering. SIGMOD 2022

Daniel Deutch, Nave Frost:
Constraints-Based Explanations of Classifications. ICDE 2019
