
Reliability, Robustness and Minipatch Learning

April 11, 2021
Many have noted and lamented a reproducibility crisis in science, and more recent discussion has turned to the reproducibility and reliability of data science and machine learning techniques. In this talk, I will introduce the Four R's, a tiered framework for discussing and assessing the reproducibility, replicability, reliability, and robustness of a data science or machine learning pipeline. I will then introduce a new minipatch learning framework that helps improve the reliability and robustness of machine learning procedures. Inspired by stability approaches from high-dimensional statistics, random forests, and dropout training in deep learning, minipatch learning is an ensemble approach in which we train on tiny, randomly or adaptively chosen subsets of both observations and features or parameters. Beyond the obvious advantages in computational and memory efficiency, we show that minipatch learning also yields more reliable and robust solutions by providing implicit regularization.
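
To make the idea of training on tiny subsets of both observations and features concrete, here is a minimal sketch of a minipatch ensemble. It is an illustration under stated assumptions, not the method presented in the talk: the base learner (ordinary least squares), the uniform random sampling of rows and columns, the averaging rule, and every name and default value (`fit_minipatch_ensemble`, `predict_minipatch_ensemble`, `n_patches`, `n_rows`, `n_cols`) are hypothetical choices made for this example. Adaptive sampling, which the abstract also mentions, is not shown.

```python
import numpy as np


def fit_minipatch_ensemble(X, y, n_patches=200, n_rows=20, n_cols=3, seed=0):
    """Fit one small least-squares model per random minipatch.

    Assumption for this sketch: rows and columns are drawn uniformly
    at random, without replacement within a patch.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    patches = []
    for _ in range(n_patches):
        rows = rng.choice(n, size=n_rows, replace=False)
        cols = rng.choice(p, size=n_cols, replace=False)
        # Design matrix for this patch: intercept plus the sampled features.
        A = np.column_stack([np.ones(n_rows), X[np.ix_(rows, cols)]])
        coef, *_ = np.linalg.lstsq(A, y[rows], rcond=None)
        patches.append((cols, coef))
    return patches


def predict_minipatch_ensemble(patches, X):
    """Average the predictions of all patch-level models."""
    preds = np.zeros(X.shape[0])
    for cols, coef in patches:
        preds += coef[0] + X[:, cols] @ coef[1:]
    return preds / len(patches)


if __name__ == "__main__":
    # Synthetic data: only the first two of ten features carry signal.
    rng = np.random.default_rng(1)
    X = rng.standard_normal((500, 10))
    y = 2.0 * X[:, 0] - X[:, 1] + 0.1 * rng.standard_normal(500)

    patches = fit_minipatch_ensemble(X, y)
    y_hat = predict_minipatch_ensemble(patches, X)
    print("correlation with y:", np.corrcoef(y, y_hat)[0, 1])
```

Because each model sees only `n_rows` observations and `n_cols` features, every fit is cheap and the patches can be processed independently. In this sketch, a feature contributes to the averaged prediction only through the patches that happened to include it, so its effect is shrunk relative to a full-data fit; that is one simple way to see how subsampling features can act like regularization.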