Dissertation Defense

3D Scene Understanding with Deep Learning

Junming Zhang

Password: goblue


3D scene understanding is crucial for robotics, augmented reality, and autonomous vehicles. In these applications, the 3D structure can be computed using stereo cameras or depth sensors, and the outputs are often in the form of depth images or point clouds. The recently improved accessibility of these 3D measurements creates a need for algorithms to interpret them. Therefore, this dissertation seeks to develop deep learning algorithms for 3D scene understanding from depth images and point clouds.

The first portion describes an algorithm to estimate accurate depth maps from stereo images. In particular, the semantic embedding learned from semantic segmentation is used to guide disparity estimation, especially in smooth, reflective, and occluded regions. The second portion addresses the challenges of processing point cloud data, which may be arbitrarily rotated and partially occluded. This dissertation proposes a 3D representation of point clouds that is designed to be rotationally invariant and introduces a novel neural network architecture to utilize this representation. For partial point cloud analysis, the embedding features from the encoder are investigated to effectively infer the information contained in a complete shape.
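As a rough illustration of what "rotationally invariant" means for a point cloud representation, the sketch below compares raw coordinates with a simple invariant descriptor before and after a rotation. The descriptor used here (the sorted list of pairwise distances) is only an assumed stand-in chosen for brevity; it is not the representation proposed in the dissertation, and the helper names `rotate_z` and `pairwise_distances` are hypothetical.

```python
import math

def rotate_z(points, angle):
    """Rotate a list of (x, y, z) points about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

def pairwise_distances(points):
    """Sorted Euclidean distances between all point pairs.

    Illustrative descriptor only (an assumption, not the dissertation's
    representation): rotations preserve distances, so it does not change
    when the cloud is rotated.
    """
    dists = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dists.append(math.dist(points[i], points[j]))
    return sorted(dists)

# A tiny toy point cloud and a rotated copy of it.
cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
rotated = rotate_z(cloud, math.radians(37.0))

# Raw coordinates differ after rotation...
raw_changed = any(
    not math.isclose(a, b, abs_tol=1e-9)
    for p, q in zip(cloud, rotated)
    for a, b in zip(p, q)
)

# ...while the distance-based descriptor stays the same up to rounding.
invariant = all(
    math.isclose(a, b, abs_tol=1e-9)
    for a, b in zip(pairwise_distances(cloud), pairwise_distances(rotated))
)

print("raw coordinates changed:", raw_changed)
print("descriptor unchanged:", invariant)
```

A network that consumes raw coordinates sees the two clouds as different inputs, whereas one that consumes an invariant descriptor sees them as identical, which is the motivation for designing the representation itself to be rotation invariant.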


Co-chair: Professor Matthew Johnson-Roberson

Co-chair: Professor Andrew Owens