Factored Gradient Descent: On Dropping Convexity for Faster Optimization

Presenter
May 16, 2016
Keywords:
  • Non-convex optimization, matrix estimation, gradient descent
Abstract
Local algorithms like gradient descent are widely used in non-convex optimization, typically with few performance guarantees. In this talk we consider the class of problems min_{U,V} f(UV') where f is a convex function on the space of matrices. The problem is non-convex, but "only" because we adopt the bilinear factored representation UV', with tall matrices U and V. We delve into the geometry of the problem and establish a local O(1/t) convergence rate when f is smooth, and local linear convergence when f is strongly convex. We show that the constants depend not only on the shape of f (as in the convex case) but also on the condition number of the optimal point. Factored gradient descent is widely used in matrix estimation; our work provides a generic (as opposed to application-specific) view of its performance. (Joint work with Srinadh Bhojanapalli, Anastasios Kyrillidis, and Dohyung Park.)
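
To make the setup concrete, here is a minimal NumPy sketch of factored gradient descent on min_{U,V} f(UV') for the smooth, strongly convex choice f(X) = 0.5 * ||X - M||_F^2, i.e., low-rank approximation of a matrix M. By the chain rule, the gradient with respect to U is grad f(UV') V and with respect to V is grad f(UV')' U. The dimensions, step size, and iteration count below are illustrative assumptions, not values from the talk, and the talk's rates additionally presume a suitably good initialization.

import numpy as np

# Sketch of factored gradient descent for min_{U,V} f(U V'),
# with the smooth, strongly convex choice f(X) = 0.5 * ||X - M||_F^2.
# Problem sizes and the step size are illustrative assumptions.

rng = np.random.default_rng(0)
n, m, r = 50, 40, 5

# Ground-truth low-rank matrix M = U* V*'
U_star = rng.standard_normal((n, r))
V_star = rng.standard_normal((m, r))
M = U_star @ V_star.T

def grad_f(X):
    # Gradient of f(X) = 0.5 * ||X - M||_F^2 with respect to X
    return X - M

# Random initialization of the tall factors U (n x r) and V (m x r)
U = rng.standard_normal((n, r))
V = rng.standard_normal((m, r))

eta = 1e-3  # fixed step size; the theory ties admissible step sizes to the
            # smoothness of f and the condition number of the optimum

for t in range(2000):
    G = grad_f(U @ V.T)                          # gradient of f at X = U V'
    U, V = U - eta * G @ V, V - eta * G.T @ U    # simultaneous factor updates

print("relative error:", np.linalg.norm(U @ V.T - M) / np.linalg.norm(M))

Because this f is strongly convex on the space of matrices, the talk's analysis predicts linear (geometric) local convergence of UV' to M, which the printed relative error reflects after a few thousand iterations.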