9.3. MSL navcam example

This is an example of using the ASP tools to process images taken by the Mars Science Laboratory (MSL) rover Curiosity. See Section 9 for other examples.

This approach uses only the images to create a self-consistent solution, without placing it at the correct location on the Mars surface. Section 8.12.5 discusses using the known camera poses for MSL.

9.3.1. Illustration

MSL Kimberly mesh
MSL Kimberly photo

Fig. 9.2 A mesh created with 22 MSL navcam images acquired on SOL 597 (top), and several representative images from this set (bottom).

9.3.2. Sensor information

Curiosity has two navcam sensors (left and right) mounted on a stereo rig. Each records images at a resolution of 1024 x 1024 pixels. The field of view is 45 degrees.

9.3.3. Challenges

The navcam images are used to plan the path of the rover. They are not acquired specifically for mapping.

While images taken at the same time with the stereo rig have good overlap and perspective difference, this may not hold for images acquired at different times. Moreover, after the rover changes position, there is usually a large perspective difference and little overlap with earlier images.

A very useful reference on processing MSL images is [CLMouelicM+20]. It uses the commercial Agisoft Photoscan software.

To help with matching the images, that paper uses the global position and orientation of each image and projects these onto the ground. Such data is not fully present in the .LBL files in PDS, as those contain only local coordinates, so obtaining it would require querying the SPICE database. The paper also incorporates lower-resolution “TRAV” images to tie the data together.

In the current example only a small set of data from a single day is used.

9.3.4. Data preparation

The images are fetched from PDS. For example, to get the data for day (SOL) 597 on Mars, set the shell variable dir to the path of the SOL00597 directory on the PDS server, then use the command:

wget -r -nH --cut-dirs=4 --no-parent    \
  --reject="index.html*"                \
  https://pds-imaging.jpl.nasa.gov/$dir \
  --include $dir

This will create the directory SOL00597 containing .IMG data files and .LBL metadata. Using the ISIS pds2isis program (see Section 2.1.1 for installation), these can be converted to .cub files as:

pds2isis from = SOL00597/image.LBL to = SOL00597/image.cub

A .cub file obtained with the left navcam sensor will have a name like:


while for the right sensor the prefix will instead be NRB. The full-resolution images have _F as part of their name, as above.

We will convert the .cub files to the PNG format so that they can be read by image-processing programs. We follow the rig_calibrator convention of storing each sensor’s data in its own subdirectory (Section 16.57.2). We will name the left and right navcam sensors lnav and rnav. Then, the conversion commands are along the lines of:

mkdir -p SOL00597/lnav
isis2std from = SOL00597/left_image.cub \
  to = SOL00597/lnav/left_image.png
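
The commands above convert one image at a time. A minimal sketch of looping both steps over a directory follows. It assumes that pds2isis and isis2std are in the path, and that the files of a given sensor can be selected by a file name prefix. The function name convert_sensor is hypothetical and not part of ASP or ISIS.

```shell
# Convert every PDS label in a directory whose name starts with a given
# prefix to .cub, then to .png in a per-sensor subdirectory.
# Usage: convert_sensor <data_dir> <file_prefix> <sensor_name>
# (convert_sensor is an illustrative helper, not an ASP or ISIS tool.)
convert_sensor() {
  data_dir=$1; prefix=$2; sensor=$3
  mkdir -p "$data_dir/$sensor"
  for lbl in "$data_dir/$prefix"*.LBL; do
    [ -e "$lbl" ] || continue          # no match: skip the literal pattern
    base=$(basename "$lbl" .LBL)
    cub="$data_dir/$base.cub"
    pds2isis from="$lbl" to="$cub" || return 1
    isis2std from="$cub" to="$data_dir/$sensor/$base.png" || return 1
  done
}
```

For instance, convert_sensor SOL00597 NRB rnav would process the right navcam images. The left sensor would be handled the same way, with its own prefix and the lnav subdirectory.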

9.3.5. Image selection

A subset of 22 images was selected for SOL 597 (half for each of the left and right navcam sensors). Images were chosen based on visual inspection. A fully automatic approach may be challenging (Section 9.3.3).

This dataset is available for download.

9.3.6. Setting up the initial rig

Given the earlier sensor information, the focal length can be found using the formula:

\[f = \frac{w}{2\tan(\frac{\theta}{2})}\]

where \(w\) is the sensor width in pixels and \(\theta\) is the field of view. With \(w = 1024\) and \(\theta = 45\) degrees, the focal length is about 1236.0773 pixels. We will start by assuming that the optical center is at the image center and that there is no distortion. Hence, the initial rig configuration (Section 16.57.5) will look like:

ref_sensor_name: lnav

sensor_name: lnav
focal_length:  1236.0773
optical_center: 512 512
distortion_type: no_distortion
image_size: 1024 1024
distorted_crop_size: 1024 1024
undistorted_image_size: 1024 1024
ref_to_sensor_transform: 1 0 0 0 1 0 0 0 1 0 0 0
depth_to_image_transform: 1 0 0 0 1 0 0 0 1 0 0 0
ref_to_sensor_timestamp_offset: 0

with an additional identical block for the rnav sensor (without ref_sensor_name).
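
The focal length and the two-sensor configuration above can be reproduced with a short script. This is only a sketch: it copies the fields exactly as listed in this section, and the helper name sensor_block is hypothetical. Since awk has no tangent function, the tangent is computed as sine over cosine.

```shell
# Focal length in pixels from the image width and the field of view:
# f = w / (2 tan(theta / 2)).
focal_length=$(awk -v w=1024 -v fov_deg=45 'BEGIN {
  pi = atan2(0, -1)
  half = fov_deg * pi / 360        # half of the field of view, in radians
  printf "%.4f", w / (2 * sin(half) / cos(half))
}')

# Print one sensor block with the initial values used in this section.
# (sensor_block is an illustrative helper, not an ASP tool.)
sensor_block() {
  cat <<EOF
sensor_name: $1
focal_length: $focal_length
optical_center: 512 512
distortion_type: no_distortion
image_size: 1024 1024
distorted_crop_size: 1024 1024
undistorted_image_size: 1024 1024
ref_to_sensor_transform: 1 0 0 0 1 0 0 0 1 0 0 0
depth_to_image_transform: 1 0 0 0 1 0 0 0 1 0 0 0
ref_to_sensor_timestamp_offset: 0
EOF
}

# The reference sensor is declared once, before the first sensor block.
{
  echo "ref_sensor_name: lnav"
  echo
  sensor_block lnav
  echo
  sensor_block rnav
} > rig_config.txt
```

Both sensors start from the same intrinsics and an identity transform. The rig calibration described later is what is expected to separate them.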

9.3.7. SfM map creation

Given the data and rig configuration, the image names in .png format were put in a list, with one entry per line. The theia_sfm program (Section 16.68) was run to find initial camera poses:

theia_sfm                     \
  --rig_config rig_config.txt \
  --image_list list.txt       \
  --out_dir theia_rig
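
The image list passed to theia_sfm above is a plain text file with one image path per line. One way to build it is sketched below, assuming that each per-sensor subdirectory contains only the selected PNG images. The function name make_image_list is hypothetical.

```shell
# Write the paths of all PNG images of the given sensors, one per line.
# Usage: make_image_list <data_dir> <out_list> <sensor>...
# (make_image_list is an illustrative helper, not an ASP tool.)
make_image_list() {
  data_dir=$1; out_list=$2; shift 2
  : > "$out_list"                      # start from an empty list
  for sensor in "$@"; do
    for img in "$data_dir/$sensor"/*.png; do
      [ -e "$img" ] || continue        # no match: skip the literal pattern
      echo "$img" >> "$out_list"
    done
  done
}
```

A call such as make_image_list SOL00597 list.txt lnav rnav lists the left images first, then the right ones, each group in alphabetical order.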

Next, rig_calibrator (Section 16.57) is used to enforce the rig constraint between the left and right navcam sensors and to refine the intrinsics. The shell variable params is assumed to hold the list of intrinsics to optimize for each sensor:

float="lnav:${params} rnav:${params}"

rig_calibrator                        \
  --rig_config rig_config.txt         \
  --nvm theia_rig/cameras.nvm         \
  --camera_poses_to_float "lnav rnav" \
  --intrinsics_to_float "$float"      \
  --num_iterations 30                 \
  --calibrator_num_passes 2           \
  --num_overlaps 5                    \
  --robust_threshold 3                \
  --out_dir rig_out

Here, --robust_threshold was increased from the default value of 0.5 to focus more on larger errors. To optimize the distortion, one can adjust the rig configuration by setting initial distortion values and type:

distortion_coeffs: 1e-10 1e-10 1e-10 1e-10 1e-10
distortion_type: radtan

and then defining the list of parameters to optimize as:


For this example, plausible solutions were obtained both with and without distortion modeling. Handling distortion is, however, likely important when creating textured meshes that are registered at the pixel level.

The produced pairwise matches in rig_out/cameras.nvm can be inspected with stereo_gui (Section

9.3.8. Mesh creation

Here, a point cloud is created from every stereo pair consisting of a left sensor image and the corresponding right image, and those are fused into a mesh. Some parameters are set up first. The shell variables maxDistanceFromCamera and outDir used below must be defined beforehand.

Stereo options (Section 17):

stereo_opts="
  --stereo-algorithm asp_mgm
  --alignment-method affineepipolar
  --ip-per-image 10000
  --min-triangulation-angle 0.1
  --global-alignment-threshold 5
  --session nadirpinhole
  --corr-seed-mode 1
  --corr-tile-size 5000
  --max-disp-spread 300
  --ip-inlier-factor 0.4
  --nodata-value 0"

Point cloud filter options (Section 16.52):

pc_filter_opts="
  --max-camera-ray-to-surface-normal-angle 85
  --max-valid-triangulation-error 10.0
  --max-distance-from-camera $maxDistanceFromCamera
  --blending-dist 50 --blending-power 1"

Mesh generation options (Section 16.70):

mesh_gen_opts="
  --min_ray_length 0.1
  --max_ray_length $maxDistanceFromCamera
  --voxel_size 0.05"

Set up the pairs to run stereo on:

mkdir -p ${outDir}
grep lnav list.txt > ${outDir}/left.txt
grep rnav list.txt > ${outDir}/right.txt
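
Since each left image is expected to have a corresponding right image, a quick sanity check on the two lists can catch a missing file before the stereo step. The sketch below only compares line counts. It assumes that corresponding images appear in the same order in both lists, and the function name check_pair_lists is hypothetical.

```shell
# Succeed only if both lists are readable, non-empty, and have the same
# number of lines.
# Usage: check_pair_lists <left_list> <right_list>
# (check_pair_lists is an illustrative helper, not an ASP tool.)
check_pair_lists() {
  if [ ! -r "$1" ] || [ ! -r "$2" ]; then
    echo "Cannot read the image lists" >&2
    return 1
  fi
  n_left=$(wc -l < "$1" | tr -d ' ')
  n_right=$(wc -l < "$2" | tr -d ' ')
  if [ "$n_left" -eq 0 ] || [ "$n_left" -ne "$n_right" ]; then
    echo "Mismatched lists: $n_left left, $n_right right images" >&2
    return 1
  fi
  echo "Found $n_left stereo pairs"
}
```

For this example it would be invoked as check_pair_lists ${outDir}/left.txt ${outDir}/right.txt.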

The optimized rig, in rig_out/rig_config.txt, and optimized cameras, in rig_out/cameras.txt, are passed to multi_stereo (Section 16.40):

multi_stereo                              \
  --rig_config rig_out/rig_config.txt     \
  --camera_poses rig_out/cameras.txt      \
  --undistorted_crop_win '1100 1100'      \
  --rig_sensor "lnav rnav"                \
  --first_step stereo                     \
  --last_step mesh_gen                    \
  --stereo_options "$stereo_opts"         \
  --pc_filter_options "$pc_filter_opts"   \
  --mesh_gen_options "$mesh_gen_opts"     \
  --left ${outDir}/left.txt               \
  --right ${outDir}/right.txt             \
  --out_dir ${outDir}

This created:


See the produced mesh in Section 9.3.1.

9.3.9. Notes

  • No ground registration was done, so neither the scale nor the pose of the produced mesh is accurate. The mesh is, however, self-consistent.

  • The voxel size for binning and meshing the point cloud was chosen manually. An automated approach for choosing a representative voxel size is to be implemented.

  • The multi_stereo tool does not use the interest points found during SfM map construction. Reusing them would likely give a notable speedup. The tool also does not run the stereo pairs in parallel.