Faculty Candidate Seminar
Community Detection in Networks: Algorithms, Complexity, and Information Limits
Add to Google Calendar
Many datasets can be viewed as networks where nodes represent objects and edges encode pairwise interactions between objects. An interesting problem is to identify communities consisting of similar objects based on the network topology. The problem is popularly known as community detection, and has found numerous applications such as recommending friends in online social networks, predicting user preferences in recommender systems, and discovering gene or protein functions in biology.
Learning communities in a large-scale network is both statistically and computationally challenging, and many different algorithms have been proposed over the years. Nevertheless, it remains unclear when it is computationally feasible to infer the communities. In this talk, we introduce two computationally efficient methods: belief propagation (BP) and semidefinite programming re- laxation (SDP), and derive a sharp characterization of their performance for estimating a single community under the stochastic block model. Surprisingly, if the community size is above a certain threshold depending on the network size, the performance limits of BP and SDP exactly match what can be achieved information-theoretically; however, if the community size is below the thresh- old, their performance limits are far from the information limit. Building on average case reductions, we show that there exists a significant gap between the information limit and what can be achieved by computationally efficient procedures, conditioned on the widely-believed planted clique hardness assumption. Finally, we demonstrate the effectiveness of SDP in real-world network datasets, and discuss extensions to multiple communities and general models.
This work is at the intersection of information theory, statistics, and computation. Based on joint work, available at arXiv 1406.6625, 1412.6156, 1502.07738, 1509.07859, and 1510.02786, with Bruce Hajek and Yihong Wu from UIUC.
Jiaming Xu received the B.E. degree from Tsinghua University in 2009, the M.S. degree from UT-Austin in 2011, and the Ph.D. degree from UIUC in 2014 under the supervision of Prof. Bruce Hajek, all in electrical and computer engineering. He is a research fellow in the Simons Institute for the Theory of Computing at UC Berkeley, supported by a Simons-Berkeley Research Fellowship, in Spring 2016. He was a postdoctoral fellow with Department of Statistics, The Wharton School, University of Pennsylvania, in 2015. His research interests are in network science, high-dimensional statistics, information theory, optimization, machine learning, game theory, communications, and networking.