Dissertation Defense

Eliciting and Leveraging Input Diversity in Crowd-Powered Intelligent Systems

Jean Y. Song
3316 EECS Building


Collecting high-quality annotations plays a crucial role in supporting machine learning algorithms and, thus, in the creation of intelligent systems. Over the past decade, crowdsourcing has become a widely adopted means of manually creating annotations for various intelligent tasks, ranging from object boundary detection in images to sentiment understanding in text. This thesis presents new crowdsourcing workflows and answer aggregation algorithms that effectively and efficiently improve the collective quality of annotations from crowd workers. While conventional microtask crowdsourcing approaches generally focus on improving annotation quality by promoting consensus among workers, this thesis proposes a novel diversity-driven approach. We show that leveraging diversity in workers’ responses is effective in improving the accuracy of aggregate annotations because it compensates for biases or uncertainty caused by the system, the tool, or the data. We then present techniques that elicit diversity in workers’ responses. These techniques are orthogonal to existing quality control methods, such as filtering or training, which means they can be used in combination with those methods. The crowd-powered intelligent systems presented in this thesis are evaluated on visual perception tasks to demonstrate the effectiveness of the proposed approach. The advantage of our approach is an improvement in collective quality even in settings where worker skill may vary widely, potentially lowering barriers to entry for novice workers and making it easier for requesters to find workers who can make productive contributions. This thesis demonstrates that crowd workers’ input diversity can be a useful property that yields better aggregate performance than any homogeneous set of inputs.

Chair: Professor Walter S. Lasecki