Dissertation Defense

Compute- and Processing-in-Memory Modules for Machine Learning on Plaintext and Ciphertext

Yongmo Park
1005 EECS Building

Conventional computing systems based on the von Neumann architecture encounter a significant performance bottleneck arising from off-chip memory transfers between the processing and memory units. Compute-in-memory (CiM) architectures based on resistive random-access memory (RRAM) can perform vector-matrix multiplication (VMM) operations efficiently within the RRAM arrays themselves, eliminating the need for off-chip data transfer. DRAM-based processing-in-memory (PiM) is another architecture well suited to memory-intensive applications, placing processing units near the DRAM banks to leverage the higher internal bandwidth.
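The in-array VMM described above follows from Ohm's and Kirchhoff's laws: each cell stores a weight as a conductance, and applying input voltages along the rows produces column currents equal to the vector-matrix product. A minimal numerical sketch (the conductance and voltage ranges are illustrative assumptions, not values from the thesis):

```python
import numpy as np

# Illustrative crossbar model: weights are stored as conductances G[i, j].
# Driving row voltages V produces column currents I = V @ G in a single
# analog step, so the matrix never leaves the array.
rng = np.random.default_rng(0)
G = rng.uniform(1e-6, 1e-4, size=(4, 3))   # cell conductances (S), assumed range
V = rng.uniform(0.0, 0.2, size=4)          # row input voltages (V), assumed range
I = V @ G                                  # column currents: the VMM result
print(I.shape)                             # one output current per column
```

The key point is that the O(n*m) multiply-accumulate work happens in the physics of the array, not in a processor that must first fetch the weights.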

This thesis discusses CiM- and PiM-related works tailored to their applications. On the CiM side, the thesis begins with the first demonstrated high-entropy oxide (HEO) based memristor. It then presents an RRAM-based CiM architecture that accelerates number theoretic transform (NTT) operations using the VMM approach, yielding significant latency improvements. We also introduce a CiM-informed neural architecture search framework that jointly explores neural network and CiM architecture parameters for multi-objective optimization, including model accuracy and throughput under an area constraint.
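The NTT-as-VMM idea rests on the fact that the NTT is a linear transform over Z_q, so the whole transform can be expressed as one matrix-vector product and mapped onto a CiM array. A small worked sketch (the parameters q = 17, n = 4, omega = 4 are illustrative choices, not from the thesis; omega is a primitive n-th root of unity mod q):

```python
import numpy as np

# The NTT of a length-n vector a over Z_q is A[k] = sum_j a[j] * omega^(j*k) mod q,
# i.e. a multiplication by the fixed transform matrix W[k][j] = omega^(j*k) mod q.
q, n, omega = 17, 4, 4  # illustrative: 4 is a primitive 4th root of unity mod 17
W = np.array([[pow(omega, j * k, q) for j in range(n)] for k in range(n)])
a = np.array([1, 2, 3, 4])
A = (W @ a) % q          # a single VMM computes the entire NTT
print(A)                 # [10  7 15  6]
```

Because W is fixed for a given (q, n, omega), it can be programmed into the array once and reused, which is what makes the VMM formulation attractive for in-memory acceleration.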

Lastly, this thesis explores a DRAM-based PiM architecture for fully homomorphic encryption (FHE). We developed a mapping algorithm that allows the NTT to be computed within a fixed permutation network. This PiM architecture can support all FHE primitives (e.g., bootstrapping) and end-to-end applications.


CHAIR: Professor Wei Lu