Dissertation Defense

Unified Models for Recovering Semantics and Geometry from Scenes

Byung-soo Kim

Scene understanding is an important yet very challenging problem in computer vision. In the last few years, substantially different approaches have been adopted for understanding "things" (object categories that have a well-defined shape, such as people and cars), "stuff" (object categories that have an amorphous spatial extent, such as grass and sky), and the "geometry" of scenes.

In this thesis, we propose coherent models to recognize things, stuff, and geometry simultaneously. The key contribution is to model the relationships among them in a single framework that can be solved efficiently. Through extensive experimental validation, we show that each task improves when it is solved together with the others. The proposed models can handle different types of input, such as RGB, RGB-D, or hierarchically organized images, to improve the system's performance.

Sponsored by

Prof. Silvio Savarese, Prof. Honglak Lee