Dissertation Defense

Compute- and Process-in-Memory Modules for Machine Learning and Neuromorphic Computing

Yuting Wu
1200 EECS Building

Data movement is a major performance bottleneck in modern computing systems, which separate memory from processing units. This thesis explores efficient computing architectures that perform part of the computation in or near memory to minimize data movement.

I will start with a memristor-based Compute-in-Memory (CIM) system for the inference and training of convolutional neural networks (CNNs). Co-optimization across the device, algorithm, and system levels mitigates the accuracy loss caused by hardware non-idealities. Both on-chip measurements and hardware-aware simulations show that the system achieves accuracy comparable to software.

The second part of the talk will introduce a DRAM-based Process-in-Memory (PIM) system with a custom ASIC that accelerates GPT inference. By exploiting parallelism and data locality, the system achieves speedups of 41x to 137x over a GPU and 639x to 1074x over a CPU across 8 GPT models.

In the third part, I will exploit the internal dynamics of memristors for neuromorphic computing. I propose a second-order memristor-based system for detecting neural functional connectivity, followed by a reservoir computing system, built on memristors with short-term memory, for decoding neural behavioral states. The dynamic memristors in both systems capture the hidden temporal features of neural spikes and maintain good performance despite network variations and device non-idealities.


Chair: Professor Wei D. Lu