Dissertation Defense

Adaptive Acoustic Beamforming and Speech Processing Front-end

Taewook Kang

Passcode: 1122


Beamforming with multiple microphones is essential for Automatic Speech Recognition (ASR) in earbuds, cell phones, and smart speakers. However, the characteristics of speech signals make the ASR frontend challenging for both the ADCs and the digital beamformer. First, the audio frontend ADC requires a high SNR (>80 dB) for high-quality audio processing and speech recognition. Second, the beamformer requires complex calculations because speech is a wideband signal ranging from 30 Hz to 8 kHz. Third, battery-powered operation strictly limits power consumption.

This work presents three ICs. The first and second chips take a synergistic approach to ADC, beamforming, and feature extraction that reduces processing complexity and die area while delivering the high SNR required for reliable speech recognition. The third chip introduces a multi-mode continuous noise-shaping SAR with a newly proposed adaptive beamformer, so the ASR frontend can optimize its power consumption according to the target signal and the noise conditions. Furthermore, the proposed beamformer replaces the conventional blocking matrix with a new greedy blocking matrix, which reduces input signal distortion and compensates for input steering angle errors at low power consumption.

Chair: Professor Michael P. Flynn