Videos

Can we design deep learning models that are inherently interpretable?

Presenter
April 11, 2021
Abstract
Black box deep learning models are difficult to troubleshoot. In practice, it can be difficult to tell whether their reasoning process is correct, and "explanations" have repeatedly been shown to be ineffective. In this talk I will discuss two possible approaches to creating deep learning methods that are inherently interpretable. The first is to use case-based reasoning, through a neural architecture called ProtoPNet, where an extra "prototype" layer in the network allows it to reason about an image based on how similar it looks to other images (the network says "this looks like that"). Second, I will describe "concept whitening," a method for disentangling the latent space of a neural network by decorrelating concepts in the latent space and aligning them along the axes.

This Looks Like That: Deep Learning for Interpretable Image Recognition. NeurIPS (spotlight), 2019. https://arxiv.org/abs/1806.10574
Concept Whitening for Interpretable Image Recognition. Nature Machine Intelligence, 2020. https://rdcu.be/cbOKj
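
The sketch below is a rough illustration of the prototype-layer idea described in the abstract: convolutional feature patches are compared to learned prototype vectors, and the resulting similarity scores are turned into class logits. It is a simplified PyTorch sketch, not the authors' ProtoPNet implementation; the class name, shapes (1x1 prototypes), and the exact similarity transform are assumptions for illustration.

```python
# Minimal sketch of a prototype layer, assuming a conv backbone that
# produces feature maps of shape (batch, channels, H, W).
import torch
import torch.nn as nn
import torch.nn.functional as F

class PrototypeLayer(nn.Module):
    """Scores an image by how close its feature patches are to learned prototypes."""

    def __init__(self, num_prototypes: int, channels: int, num_classes: int):
        super().__init__()
        # Each prototype is a small latent patch (1x1 spatially in this sketch).
        self.prototypes = nn.Parameter(torch.randn(num_prototypes, channels, 1, 1))
        # Linear layer maps prototype similarities to class logits.
        self.classifier = nn.Linear(num_prototypes, num_classes, bias=False)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # Squared L2 distance between every patch and every prototype,
        # via ||x - p||^2 = ||x||^2 - 2 x.p + ||p||^2.
        x_sq = (features ** 2).sum(dim=1, keepdim=True)                  # (B, 1, H, W)
        p_sq = (self.prototypes ** 2).sum(dim=(1, 2, 3)).view(1, -1, 1, 1)
        xp = F.conv2d(features, self.prototypes)                         # (B, P, H, W)
        dists = F.relu(x_sq - 2 * xp + p_sq)
        # Keep the closest patch per prototype ("this part looks like that part").
        min_dist = -F.max_pool2d(-dists, kernel_size=dists.shape[2:]).flatten(1)
        # Monotonically decreasing transform: small distance -> high similarity.
        similarity = torch.log((min_dist + 1) / (min_dist + 1e-4))
        return self.classifier(similarity)
```

At prediction time, the patch that most activates each prototype can be shown next to the training image the prototype came from, which is what produces the "this looks like that" style of explanation.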
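
For the second method, the toy sketch below shows the two ingredients behind the concept whitening idea in a static, post-hoc form: whiten a batch of latent activations (zero mean, identity covariance) and then align chosen concept directions with individual axes. In the paper the module replaces batch normalization and the rotation is learned during training, so the function names and this fixed QR-based alignment are assumptions for illustration only.

```python
# Sketch of whitening + axis alignment on latent activations z of shape (n, d),
# with concept_dirs of shape (k, d) estimated from concept example images (k <= d).
import torch

def zca_whiten(z: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Decorrelate latent activations: zero mean, (approximately) identity covariance."""
    z = z - z.mean(dim=0, keepdim=True)
    cov = z.T @ z / (z.shape[0] - 1)
    eigvals, eigvecs = torch.linalg.eigh(cov)
    w = eigvecs @ torch.diag((eigvals + eps).rsqrt()) @ eigvecs.T   # ZCA whitening matrix
    return z @ w

def align_concepts(z_white: torch.Tensor, concept_dirs: torch.Tensor) -> torch.Tensor:
    """Project whitened activations onto orthonormalized concept directions,
    so coordinate j of the output measures how strongly concept j is present."""
    q, _ = torch.linalg.qr(concept_dirs.T)   # (d, k), orthonormal columns
    return z_white @ q                       # (n, k) concept-aligned coordinates
```

Because the activations are decorrelated before alignment, moving along one concept axis changes that concept's score without dragging the others along, which is the disentanglement property the abstract refers to.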