High Dimensional Correlation Networks And Their Applications
Analysis of interactions between variables in a large data set has recently attracted special attention in the context of high dimensional multivariate statistical analysis. Variable interactions play a role in many inference tasks, such as classification, clustering, estimation, and prediction. In this defense I focus on the discovery of correlation and partial correlation structures, as well as their applications in high dimensional data analysis and inference. I consider problems of screening correlation and partial correlation networks by thresholding the sample correlation matrix or the sample partial correlation matrix. The selection of the threshold is guided by our high dimensional asymptotic theory for screening such networks. Scalable methods of edge and hub screening are developed for applications in spatio-temporal analysis of time series, variable selection for linear prediction, and support recovery. The proposed methods are specifically designed for very high dimensional data with a limited number of samples. Moreover, the proposed correlation screening theory provides high dimensional family-wise error rates on false discoveries.
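To make the thresholding idea concrete, here is a minimal Python sketch of edge and hub screening on a sample correlation matrix. It is an illustration only, not the method presented in the defense: the function names `screen_correlation_network` and `hub_screen`, and the parameters `threshold` and `min_degree`, are hypothetical, and the threshold is simply supplied by the caller rather than chosen from the asymptotic theory mentioned above.

```python
import numpy as np


def screen_correlation_network(X, threshold):
    """Edge screening by thresholding the sample correlation matrix.

    X is an (n_samples, p_variables) array. `threshold` is a user-chosen
    value in (0, 1); picking it from theory is outside this sketch.
    Returns the list of retained edges (i, j) with i < j, and the degree
    of every variable in the resulting network.
    """
    R = np.corrcoef(X, rowvar=False)      # p x p sample correlation matrix
    np.fill_diagonal(R, 0.0)              # ignore self-correlations
    A = np.abs(R) >= threshold            # adjacency of the screened network
    degrees = A.sum(axis=1)               # number of retained edges per variable
    rows, cols = np.nonzero(np.triu(A, k=1))
    edges = list(zip(rows.tolist(), cols.tolist()))
    return edges, degrees


def hub_screen(degrees, min_degree):
    """Hub screening: indices of variables with at least `min_degree` edges."""
    return np.flatnonzero(degrees >= min_degree).tolist()
```

Called on an n-by-p data matrix with few rows and many columns, `screen_correlation_network` returns the variable pairs whose sample correlation magnitude reaches the threshold, and `hub_screen` then picks out the variables connected to many others. The same pattern would apply to a sample partial correlation matrix in place of `R`.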