Abstract

In this talk, I'd like to discuss two projects on interpreting DNNs. The first project proposes the DeepTune framework as a way to elicit interpretations of DNN-based models of single neurons in the challenging primate visual cortex area V4. Using DNN-based features, DeepTune images combine 18 accurately predictive regression models through a stability criterion. They provide characterizations of 71 V4 neurons and data-driven stimuli for closed-loop experiments.

The second project introduces agglomerative contextual decomposition (ACD), a method for producing hierarchical interpretations of DNN predictions. Using examples from the Stanford Sentiment Treebank and ImageNet, we show that ACD is effective at diagnosing incorrect predictions and identifying dataset bias. We also find that ACD's hierarchy is largely robust to adversarial perturbations, implying that it captures fundamental aspects of the input while ignoring spurious noise.
