9.1. A 3-sensor rig exampleΒΆ

This is an example using rig_calibrator (Section 16.59) on images acquired in a lab with cameras mounted on the Astrobee robot. See Section 9 for more examples.

An illustration is in Fig. 16.27. The dataset for this example is available for download.

This robot has three cameras: nav_cam (wide field of view, using the fisheye distortion model), sci_cam (narrow field of view, using the radtan distortion model), and haz_cam (has depth measurements, with one depth xyz value per pixel, narrow field of view, using the radtan distortion model).

We assume the intrinsics of each sensor are reasonably well-known (but can be optimized later). Those are set in the rig configuration (Section 16.59.4). The images are organized as as in Section 16.59.2.

The first step is solving for the camera poses, for which we use theia_sfm (Section 16.70):

theia_sfm --rig_config rig_input/rig_config.txt \
  --images 'rig_input/nav_cam/*.tif
            rig_input/sci_cam/*.tif'            \
  --out_dir rig_theia

This tool will use the Theia flags file from share/theia_flags.txt in the software distribution, which can be copied to a new name, edited, and passed to this program via --theia_fags.

For this example, it is suggested to edit that flags file and set --feature_density=DENSE, as the default number of features found by Theia may be too low for the images in this dataset.

The created cameras can be visualized as:

view_reconstruction --reconstruction rig_theia/reconstruction-0

See an illustration in Section 16.72.

The solved camera poses are exported to rig_theia/cameras.nvm. The images and interest point matches can be visualized in a pairwise manner using stereo_gui (Section as:

stereo_gui rig_theia/cameras.nvm

The images in the nvm file will be in random order. This will be rectified by rig_calibrator.

Next, we run rig_calibrator:

float_intr="" # not floating intrinsics
rig_calibrator                                        \
    --rig_config rig_input/rig_config.txt             \
    --nvm rig_theia/cameras.nvm                       \
    --camera_poses_to_float "nav_cam sci_cam haz_cam" \
    --intrinsics_to_float "$float_intr"               \
    --depth_to_image_transforms_to_float "haz_cam"    \
    --float_scale                                     \
    --bracket_len 1.0                                 \
    --bracket_single_image                            \
    --num_iterations 100                              \
    --calibrator_num_passes 2                         \
    --registration                                    \
    --hugin_file control_points.pto                   \
    --xyz_file xyz.txt                                \
    --export_to_voxblox                               \
    --out_dir rig_out

The previously found camera poses are read in. They are registered to world coordinates (this is optional). For that, the four corners of a square with known dimensions visible in a couple of images were picked at control points in Hugin (https://hugin.sourceforge.io/) and saved to control_points.pto, and the corresponding measurements of their coordinates were saved in xyz.txt. See Section 16.59.11 for more details.

The nav_cam camera is chosen to be the reference sensor in the rig configuration. Its poses are allowed to float, that is, to be optimized (--camera_poses_to_float), and the rig transforms from this one to the other ones are floated as well, when passed in via the same option. The scale of depth clouds is floated as well (--float_scale).

Here we chose to optimize the rig while keeping the intrinsics fixed. Floating the intrinsics, especially the distortion parameters, requires many interest point matches, especially towards image boundary, and can make the problem less stable. If desired to float them, one can replace float_intr="" with:

float_intr="nav_cam:${intr} haz_cam:${intr} sci_cam:${intr}"

which will be passed above to the option --intrinsics_to_float.

In this particular case, the real-world scale (but not orientation) would have been solved for correctly even without registration, as it would be inferred from the depth clouds.

Since the nav_cam camera has a wide field of view, the values in distorted_crop_size in the rig configuration are smaller than actual image dimensions to reduce the worst effects of peripheral distortion.

One could pass in --num_overlaps 10 to get more interest point matches than what Theia finds, but this is usually not necessary. This number better be kept small, especially if the features are poor, as it may result in many outliers among images that do not match well.

The value of --bracket_len should be a little larger than the differences (in seconds) between the image times for which it is desired to do pose interpolation in time. The option --bracket_single_image is used to ensure that just a single non-reference image is used for each time bracket (useful if there are too many images).

The options --save_pinhole_cameras and --save_matches can be employed to save the pinhole cameras and the interest point matches in formats understood by bundle_adjust (Section 16.5) and stereo_gui (Section, respectively.

See Section 16.59.16 for the full list of options.

The obtained point clouds can be fused into a mesh using voxblox_mesh (Section 16.73), using the command:

voxblox_mesh --index rig_out/voxblox/haz_cam/index.txt \
  --output_mesh rig_out/fused_mesh.ply                 \
  --min_ray_length 0.1 --max_ray_length 4.0            \
  --voxel_size 0.01

This assumes that depth sensors were present. Otherwise, can needs to create point clouds with stereo, see Section 16.42.

The output mesh is fused_mesh.ply, points no further than 2 meters from each camera center are used, and the mesh is obtained after binning the points into voxels of 1 cm in size.

Full-resolution textured meshes can be obtained by projecting and fusing the images for each sensor with texrecon (Section 16.69):

for cam in nav_cam sci_cam; do
  texrecon --rig_config rig_out/rig_config.txt \
    --camera_poses rig_out/cameras.txt         \
    --mesh rig_out/fused_mesh.ply              \
    --rig_sensor ${cam}                        \
    --undistorted_crop_win '1000 800'          \
    --out_dir rig_out/texture

The obtained textured meshes can be inspected for disagreements, by loading them in MeshLab, as:

meshlab rig_out/fused_mesh.ply        \
  rig_out/texture/nav_cam/texture.obj \

See an illustration in Fig. 16.27. See a larger example in Section 9.2, using two rigs.