Communications and Signal Processing Seminar

How Hessian Structure Explains Mysteries in Sharpness Regularization

Hossein Mobahi, Senior Research Scientist, Google Research
3427 EECS Building

ABSTRACT: Recent work has shown that first-order methods like Sharpness-Aware Minimization (SAM), which implicitly penalize second-order information, can improve generalization in deep learning. Seemingly similar methods like weight noise and gradient penalties often fail to provide such benefits. We show that these differences can be explained by the structure of the Hessian of the loss. First, we show that a common decomposition of the Hessian can be quantitatively interpreted as separating feature exploitation from feature exploration. The feature exploration, which can be described by the Nonlinear Modeling Error (NME) matrix, is commonly neglected in the literature since it vanishes at interpolation. Our work shows that the NME is in fact important, as it can explain why gradient penalties underperform for certain architectures. Furthermore, we provide evidence that challenges the long-held equivalence of weight noise and gradient penalties. This equivalence relies on the assumption that the NME can be ignored, which we find does not hold for modern networks since they involve significant feature learning. Intriguingly, we find that regularizing feature exploitation but not feature exploration yields performance comparable to SAM. This suggests that properly controlling regularization on the two parts of the Hessian is important for the success of many second-order methods.
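
NOTE: The abstract does not write out the "common decomposition of the Hessian" it refers to. A plausible reading, assuming it is the standard Gauss-Newton split obtained from the chain rule, is sketched below. The symbols (theta for the parameters, z for the network outputs, ell for the per-example loss, J for the Jacobian of z with respect to theta) are illustrative assumptions and do not come from the announcement.

```latex
% Assumed notation (not from the abstract):
%   \theta : model parameters
%   z(\theta) : network outputs (e.g. logits) for one example
%   \ell(z, y) : per-example loss as a function of the outputs
%   J = \partial z / \partial \theta : Jacobian of outputs w.r.t. parameters
\nabla_\theta^2 \ell
  = \underbrace{J^\top \left( \nabla_z^2 \ell \right) J}_{\text{Gauss-Newton term}}
  + \underbrace{\sum_k \frac{\partial \ell}{\partial z_k} \, \nabla_\theta^2 z_k}_{\text{NME term}}
```

Under this reading, the second term is weighted by the output gradients of the loss, which are zero when the model fits the training data exactly. That is consistent with the abstract's remark that the NME vanishes at interpolation. Identifying the first term with "feature exploitation" and the second with "feature exploration" is inferred from the abstract's wording rather than stated in it.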

BIO: Hossein Mobahi is a senior research scientist at Google. His current interests revolve around the interplay between optimization and generalization in deep neural networks. Prior to joining Google in 2016, he was a postdoctoral researcher at CSAIL of MIT. He obtained his PhD in Computer Science from the University of Illinois at Urbana-Champaign (UIUC).

*** This event will take place in a hybrid format. The location for in-person attendance will be room 3427 EECS. Attendance will also be available via Zoom.

Join Zoom Meeting https:

Meeting ID: 991 0245 1525

Passcode: XXXXXX (Will be sent via e-mail to attendees)

Zoom Passcode information is also available upon request to Sher Nickrand ([email protected]).

This seminar will be recorded and posted to the CSP Seminar website.

Please see full video by Sr Research Scientist, Hossein Mobahi

Faculty Host

Qing Qu, Assistant Professor, University of Michigan, Electrical and Computer Engineering