Videos

Robust and Efficient Spectral Clustering of Large Graphs

Presenter
October 24, 2011
Keywords:
  • Clustering
MSC:
  • 91C20
Abstract
Despite the empirical success of spectral clustering, its performance under noise and incomplete data is not well understood. This talk will provide a precise characterization of the misclustering rate of spectral clustering for large graphs. Using minimax theory, we establish information theoretic lower bounds on the amount of noise any clustering algorithm can tolerate and demonstrate that spectral clustering is near-optimal. The gap relative to the optimal performance is the statistical price that needs to be paid for computational efficiency. Using this analysis, we propose a novel spectral method which optimizes the number of edge similarities that need to be measured to guarantee consistent recovery of the clusters. In high dimensional settings, the clusters can be very small relative to the size of the graph and are often characterized by a few relevant features. To enable simultaneous extraction of such a cluster and the relevant features, we consider a sparsity penalized spectral biclustering method and compare its performance to minimax limits.