Dissertation Defense

Energy-Efficient Neural Network Architectures

Hsi-Shou Wu

Abstract:

As the AI era unfolds, dedicated chip architectures are expected to play a major role in adding intelligence to the devices we interact with while satisfying power and performance constraints. Among the numerous machine learning techniques proposed, neural networks have become one of the fundamental building blocks of modern AI systems. However, due to the rapid growth in network size and depth, deep neural networks introduce significant power and performance overheads.

This dissertation focuses on energy-efficient neural network architectures. First, a deep-learning processor for ultra-low-power operation is presented. Using a heterogeneous architecture that pairs a low-power always-on front-end with a selectively-enabled high-performance back-end, the processor dynamically adjusts its computational resources at runtime to support conditional execution in neural networks with increased energy efficiency. Featuring a reconfigurable datapath and a memory architecture optimized for energy efficiency, the processor supports multilevel dynamic activation of neural network segments, performing object detection with 5.3x lower energy consumption than a static baseline design. Fabricated in 40 nm CMOS, the test chip dissipates 0.23 mW at 5.3 fps.
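To make the conditional-execution idea concrete, the following minimal Python sketch illustrates how a low-power always-on front-end can gate a selectively-enabled high-performance back-end. All names, models, and thresholds here are illustrative assumptions, not taken from the dissertation.

import numpy as np

def tiny_frontend(frame: np.ndarray) -> float:
    """Low-cost always-on model: returns a wake-up score in [0, 1]."""
    # Placeholder for a small screening network (hypothetical).
    return float(np.clip(frame.mean(), 0.0, 1.0))

def large_backend(frame: np.ndarray) -> str:
    """High-performance model, enabled only on demand."""
    # Placeholder for the full object-detection network (hypothetical).
    return "object" if frame.std() > 0.1 else "background"

def process(frame: np.ndarray, wake_threshold: float = 0.5) -> str:
    score = tiny_frontend(frame)      # always-on, low power
    if score < wake_threshold:
        return "idle"                 # back-end stays disabled
    return large_backend(frame)       # selectively enabled for accuracy

In hardware, this per-frame decision determines whether the back-end's compute and memory resources are activated at all, which is where the energy savings over a statically provisioned design come from.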

To further improve the energy efficiency of the presented heterogeneous architecture, zero-short-circuit-current (ZSCC) logic is proposed to reduce the power consumption of the always-on front-end. Through dedicated circuit topologies, ZSCC achieves order-of-magnitude power savings at relatively low clock frequencies. The efficiency and applicability of ZSCC are demonstrated through an ANSI S1.11 1/3-octave filter bank chip for binaural hearing aids. The 65 nm test chip consumes 13.8 µW at a 1.75 MHz clock rate, achieving a 9.7x power reduction in comparison with a 40 nm state-of-the-art design. The ability of ZSCC to further increase the energy efficiency of the heterogeneous neural network architecture is demonstrated through the design and evaluation of a ZSCC-based front-end. Simulation results show a 17x power reduction compared with a conventional static CMOS implementation of the same architecture.
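For background, the power of a CMOS circuit is commonly decomposed as in the standard textbook relation below (not an equation from the abstract). ZSCC, as its name indicates, eliminates the short-circuit term by preventing simultaneous conduction of the pull-up and pull-down paths; the reported 9.7x and 17x reductions reflect the complete circuit-level design, not this term alone.

\[
P_{\text{total}}
= \underbrace{\alpha\, C_L\, V_{DD}^{2}\, f}_{\text{switching}}
+ \underbrace{V_{DD}\, I_{\text{sc}}}_{\text{short-circuit}}
+ \underbrace{V_{DD}\, I_{\text{leak}}}_{\text{leakage}}
\]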

Sponsored by

Professors Marios C. Papaefthymiou and Zhengya Zhang