Learning correlations in large scale data
One of the principal questions in data science is how to learn patterns of correlation, a task also known as correlation mining, when the number of variables is massive and the number of samples is small. Correlation mining can be framed as the mathematical problem of reliably reconstructing different attributes of the correlation matrix, or its inverse, from the sample covariance matrix constructed from the data. Reconstructing some attributes requires relatively few samples, e.g., screening for the presence of variables that are hubs of high correlation in a sparsely correlated population, while others require many more samples, e.g., accurately estimating all entries of the inverse covariance matrix in a densely correlated population. We will present the correlation learning problem in the context of multi-dimensional signal processing applications.
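As a rough illustration of the hub-screening task mentioned above, the following Python sketch thresholds the sample correlation matrix and flags variables that are strongly correlated with several others. It is only a minimal sketch and is not taken from the talk: the function names `sample_correlation`, `hub_degrees`, and `screen_hubs`, the correlation threshold `rho`, and the minimum degree `delta` are all assumptions introduced here for illustration.

```python
import numpy as np


def sample_correlation(X):
    """Sample correlation matrix (p x p) of an n x p data matrix X."""
    return np.corrcoef(X, rowvar=False)


def hub_degrees(R, rho):
    """For each variable, count the other variables with |correlation| >= rho."""
    strong = np.abs(R) >= rho
    np.fill_diagonal(strong, False)  # a variable is not its own neighbor
    return strong.sum(axis=1)


def screen_hubs(X, rho, delta):
    """Indices of variables whose thresholded degree is at least delta.

    rho and delta are illustrative tuning parameters (assumptions),
    not values prescribed by the abstract.
    """
    R = sample_correlation(X)
    return np.flatnonzero(hub_degrees(R, rho) >= delta)


# Toy data: three mutually uncorrelated variables a, b, c, plus a
# "hub" variable a + b + c that is correlated with each of them.
a = np.array([1.0, 1.0, -1.0, -1.0])
b = np.array([1.0, -1.0, 1.0, -1.0])
c = np.array([1.0, -1.0, -1.0, 1.0])
X = np.column_stack([a + b + c, a, b, c])  # n = 4 samples, p = 4 variables

hubs = screen_hubs(X, rho=0.5, delta=2)
print(hubs)  # [0]
```

In this toy example the hub (column 0) has sample correlation of about 0.58 with each of the other three columns, while those columns are uncorrelated with one another, so only column 0 reaches degree 2. In the regime the abstract describes, with many variables and few samples, spurious large sample correlations become likely, and choosing `rho` and `delta` reliably is the hard part; this sketch does not address that.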