Dissertation Defense

Scalable Algorithms Using Optimization on Orthogonal Matrix Manifolds

Kyle Gilman
GM Conference Room, Lurie Engineering Center (4th floor)Map

Modern machine learning in signal processing, business, and science often seeks to extract interpretable representations from large amounts of raw messy data. However, many traditional unsupervised learning algorithms cannot handle or scale to data with missing entries, heterogeneous quality, or streaming observations. This dissertation addresses these challenges and proposes scalable, efficient algorithms using low-rank modeling and disciplined optimization.

The first half of this talk proposes heteroscedastic probabilistic principal component analysis (HPPCA) to learn the maximum likelihood estimates of the low-rank factors and unknown noise variances of data groups with heteroscedastic noise. Our model achieves superior performance compared to PCA and gracefully leverages additional noisy samples. This leads us to naturally derive an online algorithm for streaming data containing missing entries, for which we show subspace estimation and tracking improvement compared to state-of-the-art streaming PCA algorithms.

The second half studies a nonconvex problem on the Stiefel manifold originating from the HPPCA log-likelihood uniquely distinct from PCA and not readily solved by the singular value decomposition. We derive a novel semidefinite relaxation of the problem, show a dual certificate of our SDP to certify global optimality of local solutions, and finally we prove the existence of tight SDP problems for HPPCA.


Chair: Professor Laura Balzano