Computer Vision Seminar

Graph Clustering, Convex Optimization and Crowdsourcing

Ramya Korlakai VinayakPhD CandidateCalifornia Institute of Technology

Clustering is arguably the most commonly used unsupervised learning method. The datasets to be clustered often have a graph structure, e.g., social networks, protein-protein interaction networks etc., and are usually partially observed since it is too expensive to observe the entire graph. We therefore consider the problem of finding clusters in graphs that are partially observed. We propose and analyze two convex optimization based graph clustering algorithms and provide explicit conditions in terms of the size and sparsity of clusters, and the number of observations required to guarantee successful recovery of the clusters.

We apply this setting to crowdsourced clustering, viz. the problem of clustering unlabeled items using answers from non-expert crowd workers. Workers are often not able to label the items directly, however, they can compare them and judge whether they are of the same category. As the workers are not experts, the answers obtained are noisy. It is crucial to design queries that can reduce the noise levels in the responses. We compare two types of random queries: edge queries, where a pair of items is revealed, and triangle queries, where a triple is. Under natural modeling assumptions, we show that the triangle queries provide more reliable data. We complement our theoretical results with experiments on real datasets on Amazon Mechanical Turk. Our experiments suggest that the triangle queries not only provide more reliable edges but also reveal many more of them for a fixed budget and using convex algorithms to denoise the observed data substantially reduces the number of errors in clustering.

Ramya Korlakai Vinayak is a final year Ph.D. candidate at Caltech working with Prof. Babak Hassibi. Her research interests include Machine Learning, Convex Optimization and Human Computation. She is the recipient of the Schlumberger Foundation Faculty for the Future Fellowship 2013-15 and was an invited participant at the Rising Stars in EECS Workshop in MIT 2015. Before Caltech, she obtained her bachelors degree in Electrical Engineering from IIT Madras, India.

Sponsored by

ECE - Systems

Faculty Host

Laura Balzano