Communications and Signal Processing Seminar
Learning Compositional Sparse Models of Bimodal Percepts
Various perceptual domains have underlying compositional semantics that are rarely captured in current models. We suspect this is because directly learning the compositional structure has evaded these models. Yet the compositional structure of a given domain can be grounded in a separate domain, thereby simplifying its learning. To that end, we propose a new approach to modeling bimodal percepts that explicitly relates distinct projections across each modality and then jointly learns a bimodal sparse representation. We jointly consider vision and speech modalities. The resulting model enables compositionality across these distinct projections and hence can generalize to unobserved percepts spanned by this compositional basis. For example, our model can be trained on images and utterances of "red triangles" and "blue squares" yet will implicitly also have learned "red squares" and "blue triangles." The structure of the projections, and hence the compositional basis, is learned automatically for a given language model. This work hence takes a stark turn from the standard supervised machine learning approach to modeling by grounding the percepts in language rather than predefined categorical classes. To test our model, we have acquired a new bimodal dataset comprising images and spoken utterances of colored shapes in a tabletop setup. Our experiments demonstrate the benefits of explicitly leveraging compositionality in both quantitative and human evaluation studies.
Joint work with Suren Kumar and Vikas Dhiman
Corso is an associate professor of Electrical Engineering and Computer Science at the University of Michigan. He received his PhD and MSE degrees from The Johns Hopkins University in 2005 and 2002, respectively, and his BS degree with honors from Loyola College in Maryland in 2000, all in Computer Science. He spent two years as a post-doctoral fellow at the University of California, Los Angeles. From 2007 to 2014 he was a member of the Computer Science and Engineering faculty at SUNY Buffalo. He is a recipient of the Army Research Office Young Investigator Award (2010), the NSF CAREER Award (2009), the SUNY Buffalo Young Investigator Award (2011), and the Link Foundation Fellowship in Advanced Simulation and Training (2003), and was a member of the 2009 DARPA Computer Science Study Group. Corso has authored more than ninety peer-reviewed papers on his research interests, including computer vision, robot perception, data science, and medical imaging. He is a member of the AAAI, the IEEE, and the ACM.