Classification via Multiple Hyperplanes: Loss functions, Overparametrization, and Interpolation
This event is free and open to the public.
Many well-established classification algorithms, such as support vector machines (SVMs), were originally proposed as large-margin classifiers based on a single hyperplane. This dissertation is divided into two halves, each studying classification from the perspective of using multiple hyperplanes.
The first half introduces a new framework for multiclass loss functions, called permutation-equivariant and relative-margin-based (PERM) losses, inspired by multiclass classification with multiple hyperplanes. Using this framework, we establish statistical and optimization results for Weston-Watkins multiclass SVMs. Furthermore, we provide sufficient conditions for the classification-calibration of a general family of PERM losses. These sufficient conditions subsume all previously known classification-calibration results and establish new ones.
The second half focuses on hyperplane arrangement classifiers (HACs). When implemented as neural networks, we show that HACs can be overparameterized yet still have small VC dimension, and can further achieve minimax optimality (assuming empirical risk minimization can be solved to optimality). Using an ensemble of randomly initialized HACs, we demonstrate for the first time an interpolating ensemble method that is consistent for a broad class of distributions in arbitrary dimensions. We discuss the significance of these results in the context of recent advances in the theory of overparameterized learning.
Chair: Professor Clay Scott