Large Scale Dense 3D Reconstruction via Sparse Representations

Wei Dong

PhD Thesis

Abstract

Dense 3D scene reconstruction is in high demand today for view synthesis, navigation, and autonomous driving. A practical reconstruction system takes as input multi-view scans of the target from RGB-D cameras, LiDARs, or monocular cameras, computes sensor poses, and outputs scene reconstructions. These algorithms are computationally expensive and memory-intensive because they operate on 3D data. It is therefore essential to exploit sparsity properly to reduce memory footprint, increase efficiency, and improve accuracy.

In this thesis, I will develop practical systems for fast and high-quality scene reconstruction. First, I will introduce a highly efficient hierarchical reconstruction system that serves as a foundational pipeline for integrating diverse pose estimation and scene reconstruction modules. Next, I will focus on the global registration of point clouds by learning deep features and their matches. Equipped with sparse convolutional networks, these registration methods define the state of the art at the scene scale in both supervised and self-supervised setups. They are applied to reconstruction systems to produce globally consistent poses.

I will then shift to the topic of scene representation and reconstruction, introducing a modern engine, ASH, for parallel spatial hashing in the era of tensors and automatic differentiation. I will elaborate on the details of building this efficient and user-friendly engine from the ground up and discuss a series of downstream applications. These applications include real-time dense RGB-D SLAM, large-scale surface reconstruction from LiDAR scans, and fast scene reconstruction from monocular data. While matching or exceeding the accuracy of state-of-the-art methods, these applications run 2 to 10 times faster with less development effort.

Citation

@phdthesis{Dong-2023-136767,
author = {Wei Dong},
title = {Large Scale Dense 3D Reconstruction via Sparse Representations},
year = {2023},
month = {May},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-23-29},
keywords = {SLAM, Spatial Hashing, 3D reconstruction, Differentiable Rendering},
}