Dissertation Defense

Design and Implementation of Domain-Specific Programmable Spatial Accelerators

Kuan-Yu Chen
3316 EECS Building

Prompted by the demise of Dennard scaling, computer architects aim to bridge the gap between increasing computational demands and stagnating transistor budgets using hardware accelerators. Among various designs, spatial architectures offer data-level parallelism and high data reuse by connecting processing elements through on-chip networks. However, the application space for customized designs targeting specific functions is limited, and the gain from stacking accelerators will eventually be constrained by on-chip resources, leading to the ‘accelerator wall.’ Making accelerators programmable, so that they balance efficiency, performance, and flexibility, is therefore crucial. This has led to the development of domain-specific programmable spatial accelerators, which offer reconfigurability and adaptability.

This thesis explores the design and implementation of domain-specific programmable spatial accelerators through three distinct works. The first work, FlexTPU, is a Tensor Processing Unit-like accelerator capable of sparse matrix-vector multiplication while retaining general matrix multiplication capabilities. The second work, DAP, introduces a domain-adaptive processor with a custom instruction set architecture targeting wireless communication and linear algebra kernels. A prototype DAP chip has been fabricated in 12 nm FinFET technology and measured. The final work, Canalis, presents a framework that includes a programmable spatial accelerator and a co-designed, accessible software stack optimized for stream processing in wireless communication.


Chair: Professor David T. Blaauw