Dissertation Defense

Datacenter Design for Future Cloud Radio Access Network

Qi Zheng

Cloud radio access network (C-RAN), an emerging cloud service that combines the traditional radio access network (RAN) with cloud computing technology, has been proposed as a way to curb the excessive energy consumption and cost of the traditional RAN. By aggregating baseband units (BBUs) in a centralized cloud datacenter, C-RAN reduces energy and cost, and improves wireless throughput and quality of service. However, the design of a datacenter for C-RAN has not been studied. In this dissertation, I investigate how a datacenter for C-RAN BBUs should be built on commodity servers.

First, I design WiBench, a suite containing the key signal processing kernels of mainstream wireless protocols, and study the characteristics of these kernels. The characterization study shows that they offer abundant data-level and thread-level parallelism. Based on this, I develop high-performance software implementations of C-RAN BBUs in C++ and CUDA for both CPUs and GPUs. I also generalize the GPU parallelization techniques of the Turbo decoder to trellis algorithms, a family of algorithms widely used in data compression and error correction.
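To make the term "trellis algorithm" concrete, the following is a minimal sketch of one member of the family: a Viterbi-style minimum-cost path search over a small trellis. It is an illustration only, not the dissertation's Turbo decoder or its GPU implementation. The function name `viterbi_min_path` and the `step_costs` layout are assumptions made for this example. The inner loop over destination states has no dependencies between iterations, which is one commonly exploited source of data-level parallelism in such algorithms.

```python
import math


def viterbi_min_path(step_costs, start_state=0):
    """Find the cheapest path through a trellis (illustrative sketch).

    step_costs[t][p][s] is the assumed cost of moving from state p to
    state s at step t; math.inf marks a transition that is not allowed.
    Returns (best_cost, path), where path lists the visited states.
    """
    num_states = len(step_costs[0])
    metric = [math.inf] * num_states
    metric[start_state] = 0.0
    backptr = []

    for cost in step_costs:
        new_metric = [math.inf] * num_states
        choice = [0] * num_states
        # Each destination state s is updated independently of the others,
        # so this loop could be mapped to parallel threads.
        for s in range(num_states):
            for p in range(num_states):
                cand = metric[p] + cost[p][s]
                if cand < new_metric[s]:
                    new_metric[s] = cand
                    choice[s] = p
        metric = new_metric
        backptr.append(choice)

    # Trace back from the cheapest final state to recover the path.
    state = min(range(num_states), key=lambda s: metric[s])
    best_cost = metric[state]
    path = [state]
    for choice in reversed(backptr):
        state = choice[state]
        path.append(state)
    path.reverse()
    return best_cost, path


# Hypothetical two-state, two-step trellis.
INF = math.inf
example_costs = [
    [[1, 4], [INF, INF]],
    [[5, 1], [0, 3]],
]
best_cost, best_path = viterbi_min_path(example_costs)
```

The steps themselves remain sequential, since each step depends on the path metrics of the previous one; only the per-state work within a step is independent in this sketch.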

Second, I evaluate the performance of commodity CPU and GPU servers. The study shows that a datacenter with GPU servers meets the LTE standard throughput with 16× fewer machines, 21× less energy, and 6× less cost than one with CPU servers. I therefore propose that the C-RAN datacenter be built using GPUs as the server platform. I then study resource management techniques to handle the temporal and spatial traffic imbalance in a C-RAN datacenter. I propose a "hill-climbing" power management technique that combines powering off GPUs with dynamic voltage and frequency scaling (DVFS) to match the temporal C-RAN traffic pattern. Under a practical traffic model, this technique saves 40% of the energy in a GPU-based C-RAN datacenter. For spatial traffic imbalance, I propose three workload distribution techniques to improve load balance. Of the three, pipelining packets yields the largest throughput improvement: 10% for balanced loads and 16% for unbalanced loads.
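As a rough illustration of how a hill-climbing search can combine powering off GPUs with DVFS, the sketch below starts with every GPU on at the highest frequency and repeatedly takes whichever single step (one fewer GPU, or one lower frequency level) saves the most power while still meeting demand. The `capacity` and `power` models, their constants, and all names are hypothetical placeholders chosen for the example; they are not taken from the dissertation, whose actual algorithm and models may differ.

```python
def capacity(gpus, freq):
    """Assumed throughput model: linear in GPU count and frequency."""
    return gpus * freq


def power(gpus, freq):
    """Assumed power model: static power plus cubic dynamic power per GPU."""
    return gpus * (50.0 + 100.0 * freq ** 3)


def hill_climb(demand, max_gpus, freqs):
    """Greedy local search over (active GPUs, frequency level).

    freqs must be sorted in ascending order. Returns a tuple
    (gpus, freq, power) for the configuration where the search stops.
    """
    g, f = max_gpus, len(freqs) - 1
    if capacity(g, freqs[f]) < demand:
        raise ValueError("demand exceeds peak capacity")

    while True:
        current = power(g, freqs[f])
        best = None
        # Neighbors: power off one GPU, or step down one frequency level.
        for ng, nf in ((g - 1, f), (g, f - 1)):
            if ng < 1 or nf < 0:
                continue
            if capacity(ng, freqs[nf]) < demand:
                continue  # this neighbor cannot serve the traffic
            p = power(ng, freqs[nf])
            if p < current and (best is None or p < best[0]):
                best = (p, ng, nf)
        if best is None:
            return g, freqs[f], current  # no neighbor saves power
        _, g, f = best


# Hypothetical setup: 4 GPUs, three normalized frequency levels.
config_busy = hill_climb(demand=2.0, max_gpus=4, freqs=[0.5, 0.75, 1.0])
config_idle = hill_climb(demand=1.0, max_gpus=4, freqs=[0.5, 0.75, 1.0])
```

Like any hill-climbing method, this search can stop at a local optimum; whether it does depends on the shape of the power and capacity models, which are assumptions here.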

Sponsored by

Trevor Mudge and Ronald Dreslinski