eCommons

 

3D Scene Understanding: From Segments To Volumes

Abstract

Segmentation is one of the fundamental problems in computer vision and has been studied for many years. In this thesis, we present algorithms for RGB-D image segmentation and, more importantly, for the additional information that can be inferred from segmentations: depth ordering, 3D surfaces, occlusion boundaries, and object volumes. These cues lead to a more comprehensive 3D understanding of the scene as well as a higher-level RGB-D interpretation. In return, some of these cues provide useful feedback that improves the final scene segmentation. We start by performing 3D depth interpretation from 2D color images only. We find that segment shapes enable us to learn the depth ordering of objects. Specifically, from the initial segmentation we develop features that encode the information captured in boundaries and junctions. After a supervised learning procedure, our algorithm is able to produce a 3D depth ordering map from a single 2D color image. Second, we proceed to 3D scene understanding using RGB-D images. The recent development of depth sensors has markedly improved the performance of traditional computer vision algorithms. Therefore, rather than relying on a single color image alone, we incorporate depth information along with it and parse the scene based on a 3D interpretation. We target applications such as 3D point interpolation, boundary detection, and scene segmentation. In detail, we propose an algorithm for 3D surface segmentation and show that combining this 3D surface information with the 2D color image achieves better performance for 3D interpolation. After that, we use both the 2D color and 3D depth channels to detect occlusion boundaries and connected boundaries in an RGB-D scene. This serves as an extended 3D scene interpretation with a better understanding of occlusions between objects. Finally, we perform 3D volumetric reasoning on RGB-D images, taking support and stability into account. Objects occupy physical space and obey physical laws.
To truly understand a scene, we must reason about the space that the objects in it occupy and how each object is stably supported by the others. In other words, we seek to understand which objects would, if moved, cause other objects to fall. This 3D volumetric reasoning is important for many scene understanding tasks, ranging from segmentation of objects to perception of rich, physically well-founded 3D interpretations of the scene. In this thesis, we propose a new algorithm that parses RGB-D images into 3D block units while jointly reasoning about segments, volumes, supporting relationships, and object stability. Our algorithm is based on the intuition that a good 3D representation of the scene is one that fits the depth data well and is a stable, self-supporting arrangement of objects (i.e., one that does not topple). We design an energy function that measures the quality of the block representation based on these properties. Our algorithm fits 3D blocks to the depth values corresponding to image segments and iteratively optimizes the energy function. Our proposed algorithm is the first to consider the stability of objects in complex arrangements when reasoning about the underlying structure of the scene. Experimental results show that our stability-reasoning framework improves both RGB-D segmentation and the volumetric representation of the scene.
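The intuition behind the energy function can be illustrated with a minimal sketch. This is not the thesis's actual formulation: it assumes axis-aligned boxes, a z-up coordinate frame, uniform density, and a simple center-of-mass stability test, and the names `fit_box`, `data_cost`, `instability`, `energy`, and the `weight` parameter are all hypothetical.

```python
import numpy as np

def fit_box(points):
    """Fit an axis-aligned box (an assumed simplification) to an (N, 3) point array."""
    pts = np.asarray(points, dtype=float)
    return pts.min(axis=0), pts.max(axis=0)

def data_cost(points, box):
    """Mean distance from the points to the box surface (0 if all lie on it)."""
    lo, hi = box
    pts = np.asarray(points, dtype=float)
    # Per-axis overshoot beyond the box; nonzero only for points outside it.
    outside = np.maximum(np.maximum(lo - pts, pts - hi), 0.0)
    d_out = np.linalg.norm(outside, axis=1)
    # For points inside the box: distance to the nearest face.
    d_in = np.minimum(pts - lo, hi - pts).min(axis=1)
    return float(np.where(d_out > 0, d_out, np.maximum(d_in, 0.0)).mean())

def instability(box, support_box):
    """How far the box's center of mass (uniform density assumed) falls outside
    its supporter's footprint in the x-y plane. `None` means the ground."""
    if support_box is None:
        return 0.0
    lo, hi = box
    s_lo, s_hi = support_box
    center_xy = (lo[:2] + hi[:2]) / 2.0
    excess = np.maximum(np.maximum(s_lo[:2] - center_xy, center_xy - s_hi[:2]), 0.0)
    return float(np.linalg.norm(excess))

def energy(segments, supports, weight=1.0):
    """Data-fit term plus a weighted stability penalty, summed over segments.
    `supports[i]` is the index of the segment supporting segment i, or None."""
    boxes = [fit_box(p) for p in segments]
    total = 0.0
    for i, pts in enumerate(segments):
        supporter = None if supports[i] is None else boxes[supports[i]]
        total += data_cost(pts, boxes[i]) + weight * instability(boxes[i], supporter)
    return total
```

Under these assumptions, an iterative optimizer could compare `energy` across candidate segmentations (for example, before and after merging two segments) and keep the lower-energy arrangement.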

Date Issued

2014-01-27

Keywords

Scene Understanding; RGB-D Segmentation; Computer Vision

Committee Chair

Chen, Tsuhan

Committee Member

Snavely, Keith Noah
Saxena, Ashutosh
Reeves, Anthony P
Chang, Yao-Jen

Degree Discipline

Electrical Engineering

Degree Name

Ph. D., Electrical Engineering

Degree Level

Doctor of Philosophy

Types

dissertation or thesis
