Dissertation Defense

Coding Theory and Sketching for Distributed Optimization

Neophytos Charalambides

PASSWORD: sketching


    With the advent of massive datasets, distributed techniques for processing information and carrying out computations are expected to enable exceptional possibilities for engineering, data intensive sciences, and better decision making. Unfortunately, distributive computational networks are prone to straggling servers. In recent years, utilizing coding-theoretic techniques has proven to be a very powerful tool for recovering either exact or approximate computations in the presence of stragglers. Furthermore, existing algorithms for mathematical programming, which are the core component of techniques for processing large datasets, often prove ineffective for scaling to the extent of all available data. Over the past two decades, randomized dimensionality reduction has proven to be a very powerful tool for approximate computations over large datasets.

These two approaches to dealing with large datasets are referred to as “Coded Computing’’, and “Randomized Numerical Linear Algebra’’ or “Sketching’’. In this thesis, we introduce new methods for both these areas, extend existing methods, combine these two techniques, and show how sketching algorithms can be utilized to devise coded computing schemes. Additionally, in certain schemes, we provide security through sketching and polynomial codes. The applications we focus on are distributed steepest descent and stochastic steepest descent, matrix-matrix multiplication, graph sparsification, linear regression, matrix inversion and pseudo-inversion, which are vital components to optimization, machine learning, and arguably all engineering disciplines.


Co-Chairs: Professor Alfred Hero, Assistant Professor Mert Pilanci (Stanford University)