MathInstitutes Dev

Scalable deep learning: using parallel algorithms and HPC systems to train large models on big data set

Presenter

Brian Van Essen

September 17, 2018

Abstract

Brian Van Essen, Lawrence Livermore National Laboratory

This talk will present the major approaches to parallelizing deep learning training and how they are applied in current deep learning toolkits. We will introduce the Livermore Big Artificial Neural Network toolkit (LBANN), which is specifically optimized to combine multiple levels of parallelism on HPC systems. Additionally, we will discuss how these techniques scale on HPC systems and how hardware architectures optimized for deep learning workloads affect these approaches. Finally, we will discuss some of the major challenges around communication and I/O.
