3D Data Representations

Learning about voxelization, point clouds, 3D meshes and implicit surfaces

October 23, 2020 • Pratulya Bubna

(left-right) Voxel • Point Cloud • Mesh • Implicit Surface (credits Mescheder, Lars et al.: "Occupancy Networks")

Introduction
Voxels
- Voxelization
Point Clouds
Meshes
Implicit Surfaces

Introduction

Data Representations: 2D vs 3D (credits Litany, Or: "Geometric Deep Learning: Introduction")

Unlike 2D, where images (arrays of pixels on a regular grid) are the well-defined choice of representation, there is no such consensus for 3D data. Different 3D data representations have different geometric structure and properties. In this post, we’ll cover some commonly used 3D representations and discuss their properties.

Needless to say, different data representations entail different datasets, different deep learning architectures and sometimes different tasks as well.

Voxels

Voxels in 3D are analogous to pixels in 2D. Just as pixels are the basic elements of a regular 2D grid, voxels are the volumetric elements that make up a volume in 3D space.

Imagine a voxel as a cube that represents a single data point on a regularly-spaced 3D grid. Many such voxels would approximate a continuous 3D surface.

Voxels do not store their own coordinates; a voxel's position is inferred from its index in the grid, relative to the other voxels. Each voxel can hold one or more values such as density, color, or opacity.
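To make the "position is inferred" point concrete, here is a minimal sketch of recovering a voxel's world-space center from its grid index. The names `voxel_center`, `origin` and `voxel_size` are hypothetical, and the grid is assumed to be axis-aligned with cubic voxels.

```python
import numpy as np

def voxel_center(index, origin, voxel_size):
    """World-space center of the voxel at integer grid index (i, j, k).

    Assumes an axis-aligned grid whose corner sits at `origin` and whose
    voxels are cubes with edge length `voxel_size`.
    """
    index = np.asarray(index, dtype=float)
    origin = np.asarray(origin, dtype=float)
    return origin + (index + 0.5) * voxel_size

# A 4 x 4 x 4 grid that stores one scalar (e.g. density) per voxel.
# Note that no coordinates are stored anywhere in the array itself.
density = np.zeros((4, 4, 4), dtype=np.float32)
density[1, 2, 3] = 0.8

center = voxel_center((1, 2, 3), origin=(0.0, 0.0, 0.0), voxel_size=0.25)
print(center)  # [0.375 0.625 0.875]
```

Storing several values per voxel (color, opacity and so on) would simply add a trailing channel axis to the array.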

(image credits: https://en.wikipedia.org/wiki/Voxel)

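It is important to realise that voxels are discrete units (similar to pixels) and hence only approximate continuous surfaces, so a voxel representation is bound to suffer from discretization artifacts.
The usual principle of resolution follows: the higher the resolution, the better the approximation of the surface.
Voxel grids have a cubic memory footprint (doubling the resolution along each axis multiplies the number of voxels by eight), and they are a sparse representation, since most voxels do not contain any surface.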
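The trade-off between resolution, memory and sparsity can be checked numerically. The sketch below is an illustration under stated assumptions: the shape is a sphere of radius 0.4 centered in the unit cube, and a voxel counts as occupied when its center lies within half a voxel diagonal of the surface. The function name `sphere_shell_occupancy` is hypothetical.

```python
import numpy as np

def sphere_shell_occupancy(resolution, radius=0.4):
    """Fraction of voxels in a resolution^3 grid over [0, 1]^3 that touch a sphere surface."""
    # Voxel centers along one axis of the unit cube.
    coords = (np.arange(resolution) + 0.5) / resolution
    x, y, z = np.meshgrid(coords, coords, coords, indexing="ij")
    dist = np.sqrt((x - 0.5) ** 2 + (y - 0.5) ** 2 + (z - 0.5) ** 2)
    # Conservative occupancy test: center within half a voxel diagonal of the surface.
    half_diag = 0.5 * np.sqrt(3.0) / resolution
    occupied = np.abs(dist - radius) <= half_diag
    return float(occupied.mean())

for resolution in (16, 32, 64):
    total_voxels = resolution ** 3  # grows cubically with resolution
    occupancy = sphere_shell_occupancy(resolution)
    print(resolution, total_voxels, round(occupancy, 4))
```

The total voxel count grows with the cube of the resolution while the number of surface voxels grows roughly with its square, so the occupied fraction shrinks as the grid gets finer.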

(left-right) increasing resolution, increasing sparsity, decreasing occupancy (image credits Su, Hao et al.: "A Tutorial on 3D Deep Learning", CVPR 2017 (Stanford))

Voxelization

Voxelization is the process of converting a geometric object from its continuous geometric representation into a set of voxels that approximate it.

Note that in a binary occupancy grid, every occupied voxel (one that contains geometry) has value 1 and every other voxel has value 0.
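As a rough illustration of a binary occupancy grid, the sketch below voxelizes points sampled from a unit sphere. Using sampled points as a stand-in for the continuous geometry is an assumption made for brevity, and `voxelize_points` is a hypothetical helper, not a library function.

```python
import numpy as np

def voxelize_points(points, resolution):
    """Binary occupancy grid: 1 where at least one point falls in a voxel, else 0."""
    points = np.asarray(points, dtype=float)
    lo = points.min(axis=0)
    # Use the longest bounding-box side so that voxels stay cubic.
    extent = (points.max(axis=0) - lo).max()
    extent = extent if extent > 0 else 1.0
    idx = np.floor((points - lo) / extent * resolution).astype(int)
    # Points on the upper boundary would land at index == resolution; clamp them.
    idx = np.clip(idx, 0, resolution - 1)
    grid = np.zeros((resolution,) * 3, dtype=np.uint8)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1
    return grid

# Sample points on a unit sphere by normalizing Gaussian samples.
rng = np.random.default_rng(0)
samples = rng.normal(size=(2000, 3))
samples /= np.linalg.norm(samples, axis=1, keepdims=True)

grid = voxelize_points(samples, resolution=16)
print(grid.shape, int(grid.sum()), "occupied of", grid.size)
```

Only voxels near the sphere surface end up set to 1; the interior and the corners of the grid stay 0.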

Procedure (in layman’s terms)

Take a huge cube and span it across the whole model you wish to approximate.
Evenly divide it into smaller cubes, then repeatedly subdivide every cell that contains geometry.
Stop when you’re satisfied with the level of detail of the approximation.
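The procedure above can be sketched as a recursive, octree-style subdivision. This is a minimal sketch under assumptions: the geometry is represented by sampled points, a cell "contains geometry" if at least one point falls inside it, and the stopping criterion is a fixed `max_depth`. All names are hypothetical.

```python
import numpy as np

def subdivide(points, center, half_size, depth, max_depth, leaves):
    """Split a cube into 8 children, recursing only into cells that contain points."""
    if len(points) == 0:
        return  # empty cell: nothing to refine
    if depth == max_depth:
        leaves.append((center, half_size))  # occupied leaf voxel
        return
    # Octant code 0..7 per point: bit 0 = x side, bit 1 = y side, bit 2 = z side.
    octant = ((points >= center) * np.array([1, 2, 4])).sum(axis=1)
    child_half = half_size / 2.0
    for code in range(8):
        sign = np.array([(code >> bit) & 1 for bit in range(3)]) * 2 - 1
        child_center = center + sign * child_half
        subdivide(points[octant == code], child_center, child_half,
                  depth + 1, max_depth, leaves)

# Geometry stand-in: points sampled on a unit sphere.
rng = np.random.default_rng(0)
pts = rng.normal(size=(500, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)

# Step 1: one big cube spanning the whole model.
lo, hi = pts.min(axis=0), pts.max(axis=0)
root_center = (lo + hi) / 2.0
root_half = (hi - lo).max() / 2.0

# Steps 2 and 3: subdivide occupied cells until the chosen depth is reached.
leaf_counts = {}
for max_depth in (1, 2, 3, 4):
    leaves = []
    subdivide(pts, root_center, root_half, 0, max_depth, leaves)
    leaf_counts[max_depth] = len(leaves)
    print(max_depth, len(leaves), "occupied leaves out of", 8 ** max_depth)
```

Because empty cells are never refined, the number of stored cells stays far below the `8 ** max_depth` cells of a dense grid at the same depth.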

Visualization of Octree Voxelization using a 2D case (QuadTree) (credits: Animated sparse voxel octrees)

Point Cloud

A point cloud is an unordered set of points in a space that approximates the geometry of 3D objects.

point cloud (left) sampled from the original surface (right)

Because the set is unordered, a point cloud with N points can be stored in N! different orders, all of which describe the same shape.
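A small sketch of what this means in practice: shuffling the rows of the array changes the stored representation but not the underlying set. The order-independent reductions shown at the end (per-axis maximum and mean) are only illustrative assumptions about how one might summarize a cloud without depending on point order.

```python
import numpy as np
from math import factorial

points = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

# Number of distinct orderings of N = 4 points.
print(factorial(len(points)))  # 24

# A shuffled copy: a different array, but the same point cloud.
rng = np.random.default_rng(0)
shuffled = points[rng.permutation(len(points))]
same_set = {tuple(p) for p in points} == {tuple(p) for p in shuffled}
print(same_set)  # True

# Order-independent summaries give the same answer for both orderings.
same_max = np.allclose(points.max(axis=0), shuffled.max(axis=0))
same_mean = np.allclose(points.mean(axis=0), shuffled.mean(axis=0))
print(same_max, same_mean)  # True True
```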

It does not have any inherent structure for representation and has no connectivity information. This creates difficulty in learning from point clouds directly as there is inherent ambiguity about the surface information.

Point clouds are a simple representation and are easy to capture using readily available technologies such as Kinect and LiDAR scanners. However, the data acquired from the environment is not always perfect (see next image).

image credits: Lindenbaum, Michael: "3D... Workshop, Institut Henri Poincare"

Characteristics of point cloud data (image credits: Alliez, Pierre: "Surface Reconstruction, SGP 2017 (UCL)")

3D Mesh

A 3D mesh, or polygonal mesh, approximates surfaces via a set of 2D polygons in 3D space. A mesh consists of vertices (coordinates in 3D space), edges (a connectivity list that describes how the vertices are connected with each other), and faces (the polygons bounded by those edges).
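A minimal sketch of this structure for a triangle mesh, assuming the common layout of a vertex array plus an array of faces that index into it. The tetrahedron and the helper `unique_edges` are illustrative choices, not part of any particular library.

```python
import numpy as np

# Vertices: one row of (x, y, z) coordinates per vertex.
vertices = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

# Faces: each row lists the indices of the three vertices of one triangle.
faces = np.array([
    [0, 2, 1],
    [0, 1, 3],
    [0, 3, 2],
    [1, 2, 3],
])

def unique_edges(faces):
    """Derive the undirected edge list from triangle faces."""
    # Every triangle contributes three edges.
    edges = np.concatenate(
        [faces[:, [0, 1]], faces[:, [1, 2]], faces[:, [2, 0]]], axis=0
    )
    # Sort each pair so (a, b) and (b, a) count as the same edge.
    edges = np.sort(edges, axis=1)
    return np.unique(edges, axis=0)

edges = unique_edges(faces)
print(len(vertices), len(edges), len(faces))  # 4 6 4
```

For a closed tetrahedron the counts satisfy Euler's formula, V - E + F = 2, which is a handy sanity check on the connectivity.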

(left) mesh (right) point cloud (credits: Litany, Or: "Geometric Deep Learning: Introduction")

A mesh provides an efficient, non-uniform representation of a shape: a small number of large polygons (coarser) can cover large, simple surfaces, while many small polygons (finer) can faithfully represent intricate, detailed geometry.

Triangular meshes are the most common choice: a triangle is always planar, and triangle meshes are memory-efficient and fast to render.

Implicit Surface

(image credits: Mescheder, Lars et al.: "Occupancy Networks")

Implicit representations, e.g. level sets, represent the surface as a continuous function. They are more expressive and can capture more geometric information about 3D shapes.

Level sets, for instance, are equipped with mathematical formulations that permit the inclusion of geometric quantities such as surface orientation, smoothness and volume.

Park et al. define the implicit surface as a Signed Distance Function (SDF).

An SDF is a continuous function that, for a given spatial point, outputs the point’s distance to the closest surface, whose sign encodes whether the point is inside (negative) or outside (positive) of the watertight surface.

Implicit surface defined as an SDF
SDF < 0 (inside), SDF > 0 (outside), SDF = 0 (on the surface)
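As a concrete instance of the sign convention, here is a minimal sketch of an analytic SDF for a sphere. This is a hand-written formula for illustration, not the learned SDF of Park et al.; the function name `sphere_sdf` and its parameters are hypothetical.

```python
import numpy as np

def sphere_sdf(points, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance from each point to a sphere: negative inside, positive outside."""
    points = np.atleast_2d(np.asarray(points, dtype=float))
    center = np.asarray(center, dtype=float)
    return np.linalg.norm(points - center, axis=1) - radius

queries = np.array([
    [0.0, 0.0, 0.0],  # center of the sphere: inside
    [1.0, 0.0, 0.0],  # exactly on the surface
    [2.0, 0.0, 0.0],  # one unit outside
])
print(sphere_sdf(queries))  # [-1.  0.  1.]
```

The surface itself is never stored; it is the zero level set of the function, so it can be queried at any point and at any resolution.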
