ns-process-data#

Note

Make sure to have COLMAP and FFmpeg installed.
You may also want to install hloc (optional) for more feature detector and matcher options.

usage: ns-process-data [-h]
                       {images,video,polycam,metashape,realitycapture,record3d,odm}

subcommands#

{images,video,polycam,metashape,realitycapture,record3d,odm}

Possible choices: images, video, polycam, metashape, realitycapture, record3d, odm

Sub-commands:#

images#

Process images into a nerfstudio dataset. This script does the following:

  1. Scales images to a specified size.

  2. Calculates the camera poses for each image using COLMAP.

ns-process-data images [-h] --data PATH --output-dir PATH
                       [--eval-data {None}|PATH] [--verbose | --no-verbose]
                       [--camera-type {perspective,fisheye,equirectangular}]
                       [--matching-method {exhaustive,sequential,vocab_tree}]
                       [--sfm-tool {any,colmap,hloc}]
                       [--refine-pixsfm | --no-refine-pixsfm]
                       [--refine-intrinsics | --no-refine-intrinsics]
                       [--feature-type {any,sift,superpoint,superpoint_aachen,superpoint_max,superpoint_inloc,r2d2,d2net-ss,sosnet,disk}]
                       [--matcher-type {any,NN,superglue,superglue-fast,NN-superpoint,NN-ratio,NN-mutual,adalam}]
                       [--num-downscales INT]
                       [--skip-colmap | --no-skip-colmap]
                       [--skip-image-processing | --no-skip-image-processing]
                       [--colmap-model-path PATH] [--colmap-cmd STR]
                       [--images-per-equirect {8,14}]
                       [--crop-factor FLOAT FLOAT FLOAT FLOAT]
                       [--crop-bottom FLOAT] [--gpu | --no-gpu]
                       [--use-sfm-depth | --no-use-sfm-depth]
                       [--include-depth-debug | --no-include-depth-debug]
                       [--same-dimensions | --no-same-dimensions]
                       [--percent-radius-crop FLOAT]

arguments#

--data

Path to the data, either a video file or a directory of images. (required)

--output-dir

Path to the output directory. (required)

--eval-data

Path to the eval data, either a video file or a directory of images. If set to None, the data from --data is used for both training and eval. (default: None)

--verbose, --no-verbose

If True, print extra logging. (default: False)

--camera-type

Possible choices: perspective, fisheye, equirectangular

Camera model to use. (default: perspective)

--matching-method

Possible choices: exhaustive, vocab_tree, sequential

Feature matching method to use. Vocab tree is recommended for a balance of speed and accuracy. Exhaustive is slower but more accurate. Sequential is faster but should only be used for videos. (default: vocab_tree)

--sfm-tool

Possible choices: hloc, colmap, any

Structure-from-motion tool to use. COLMAP uses SIFT features; hloc can use modern methods such as SuperPoint features and the SuperGlue matcher. (default: any)

--refine-pixsfm, --no-refine-pixsfm

If True, runs refinement using Pixel-Perfect SfM. Only works with the hloc sfm_tool. (default: False)

--refine-intrinsics, --no-refine-intrinsics

If True, do bundle adjustment to refine intrinsics. Only works with colmap sfm_tool (default: True)

--feature-type

Possible choices: superpoint_inloc, r2d2, sift, disk, sosnet, superpoint_aachen, d2net-ss, superpoint_max, any, superpoint

Type of feature to use. (default: any)

--matcher-type

Possible choices: adalam, superglue-fast, NN, NN-mutual, NN-superpoint, superglue, any, NN-ratio

Matching algorithm. (default: any)

--num-downscales

Number of times to downscale the images. Downscales by 2 each time. For example a value of 3 will downscale the images by 2x, 4x, and 8x. (default: 3)

--skip-colmap, --no-skip-colmap

If True, skips COLMAP and generates transforms.json if possible. (default: False)

--skip-image-processing, --no-skip-image-processing

If True, skips copying and downscaling of images and only runs COLMAP, if possible and enabled. (default: False)

--colmap-model-path

Optionally sets the path of the COLMAP model. Used only when --skip-colmap is set to True. The path is relative to the output directory. (default: colmap/sparse/0)

--colmap-cmd

How to call the COLMAP executable. (default: colmap)

--images-per-equirect

Possible choices: 14, 8

Number of samples per image to take from each equirectangular image. Used only when camera-type is equirectangular. (default: 8)

--crop-factor

Portion of the image to crop. All values should be in [0,1]. (top, bottom, left, right) (default: 0.0 0.0 0.0 0.0)

--crop-bottom

Portion of the image to crop from the bottom. Equivalent to --crop-factor 0.0 [num] 0.0 0.0. Should be in [0,1]. (default: 0.0)

--gpu, --no-gpu

If True, use GPU. (default: True)

--use-sfm-depth, --no-use-sfm-depth

If True, export and use depth maps induced from SfM points. (default: False)

--include-depth-debug, --no-include-depth-debug

If --use-sfm-depth and this flag are both True, also export debug images showing SfM points overlaid upon input images. (default: False)

--same-dimensions, --no-same-dimensions

Whether to assume all images have the same dimensions, allowing fast downscaling with no auto-rotation. (default: True)

--percent-radius-crop

Create circle crop mask. The radius is the percent of the image diagonal. (default: 1.0)
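For example, processing a folder of photos with the defaults might look like the following (the paths are illustrative):

```shell
# Run SfM on a directory of images and write a nerfstudio dataset.
# ./my-photos and ./my-dataset are placeholder paths.
ns-process-data images \
    --data ./my-photos \
    --output-dir ./my-dataset \
    --matching-method vocab_tree \
    --num-downscales 3
```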

video#

Process videos into a nerfstudio dataset. This script does the following:

  1. Converts the video into images and downscales them.

  2. Calculates the camera poses for each image using COLMAP.

ns-process-data video [-h] --data PATH --output-dir PATH
                      [--eval-data {None}|PATH] [--verbose | --no-verbose]
                      [--camera-type {perspective,fisheye,equirectangular}]
                      [--matching-method {exhaustive,sequential,vocab_tree}]
                      [--sfm-tool {any,colmap,hloc}]
                      [--refine-pixsfm | --no-refine-pixsfm]
                      [--refine-intrinsics | --no-refine-intrinsics]
                      [--feature-type {any,sift,superpoint,superpoint_aachen,superpoint_max,superpoint_inloc,r2d2,d2net-ss,sosnet,disk}]
                      [--matcher-type {any,NN,superglue,superglue-fast,NN-superpoint,NN-ratio,NN-mutual,adalam}]
                      [--num-downscales INT]
                      [--skip-colmap | --no-skip-colmap]
                      [--skip-image-processing | --no-skip-image-processing]
                      [--colmap-model-path PATH] [--colmap-cmd STR]
                      [--images-per-equirect {8,14}]
                      [--crop-factor FLOAT FLOAT FLOAT FLOAT]
                      [--crop-bottom FLOAT] [--gpu | --no-gpu]
                      [--use-sfm-depth | --no-use-sfm-depth]
                      [--include-depth-debug | --no-include-depth-debug]
                      [--same-dimensions | --no-same-dimensions]
                      [--num-frames-target INT] [--percent-radius-crop FLOAT]

arguments#

--data

Path to the data, either a video file or a directory of images. (required)

--output-dir

Path to the output directory. (required)

--eval-data

Path to the eval data, either a video file or a directory of images. If set to None, the data from --data is used for both training and eval. (default: None)

--verbose, --no-verbose

If True, print extra logging. (default: False)

--camera-type

Possible choices: perspective, fisheye, equirectangular

Camera model to use. (default: perspective)

--matching-method

Possible choices: exhaustive, vocab_tree, sequential

Feature matching method to use. Vocab tree is recommended for a balance of speed and accuracy. Exhaustive is slower but more accurate. Sequential is faster but should only be used for videos. (default: vocab_tree)

--sfm-tool

Possible choices: hloc, colmap, any

Structure-from-motion tool to use. COLMAP uses SIFT features; hloc can use modern methods such as SuperPoint features and the SuperGlue matcher. (default: any)

--refine-pixsfm, --no-refine-pixsfm

If True, runs refinement using Pixel-Perfect SfM. Only works with the hloc sfm_tool. (default: False)

--refine-intrinsics, --no-refine-intrinsics

If True, do bundle adjustment to refine intrinsics. Only works with colmap sfm_tool (default: True)

--feature-type

Possible choices: superpoint_inloc, r2d2, sift, disk, sosnet, superpoint_aachen, d2net-ss, superpoint_max, any, superpoint

Type of feature to use. (default: any)

--matcher-type

Possible choices: adalam, superglue-fast, NN, NN-mutual, NN-superpoint, superglue, any, NN-ratio

Matching algorithm. (default: any)

--num-downscales

Number of times to downscale the images. Downscales by 2 each time. For example a value of 3 will downscale the images by 2x, 4x, and 8x. (default: 3)

--skip-colmap, --no-skip-colmap

If True, skips COLMAP and generates transforms.json if possible. (default: False)

--skip-image-processing, --no-skip-image-processing

If True, skips copying and downscaling of images and only runs COLMAP, if possible and enabled. (default: False)

--colmap-model-path

Optionally sets the path of the COLMAP model. Used only when --skip-colmap is set to True. The path is relative to the output directory. (default: colmap/sparse/0)

--colmap-cmd

How to call the COLMAP executable. (default: colmap)

--images-per-equirect

Possible choices: 14, 8

Number of samples per image to take from each equirectangular image. Used only when camera-type is equirectangular. (default: 8)

--crop-factor

Portion of the image to crop. All values should be in [0,1]. (top, bottom, left, right) (default: 0.0 0.0 0.0 0.0)

--crop-bottom

Portion of the image to crop from the bottom. Equivalent to --crop-factor 0.0 [num] 0.0 0.0. Should be in [0,1]. (default: 0.0)

--gpu, --no-gpu

If True, use GPU. (default: True)

--use-sfm-depth, --no-use-sfm-depth

If True, export and use depth maps induced from SfM points. (default: False)

--include-depth-debug, --no-include-depth-debug

If --use-sfm-depth and this flag are both True, also export debug images showing SfM points overlaid upon input images. (default: False)

--same-dimensions, --no-same-dimensions

Whether to assume all images have the same dimensions, allowing fast downscaling with no auto-rotation. (default: True)

--num-frames-target

Target number of frames to use per video, results may not be exact. (default: 300)

--percent-radius-crop

Create circle crop mask. The radius is the percent of the image diagonal. (default: 1.0)
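A typical run on a capture video might look like the following (the paths and frame count are illustrative):

```shell
# Extract ~300 frames from a video, then run COLMAP on them.
# ./capture.mp4 and ./my-dataset are placeholder paths.
ns-process-data video \
    --data ./capture.mp4 \
    --output-dir ./my-dataset \
    --num-frames-target 300 \
    --matching-method sequential
```

Sequential matching is a reasonable choice here because consecutive video frames overlap heavily.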

polycam#

Process Polycam data into a nerfstudio dataset. To capture data, use the Polycam app on an iPhone or iPad with LiDAR. The capture must be in LiDAR or ROOM mode. Developer mode must be enabled in the app settings; this adds a raw data export option to the export menus. The exported data folder is used as the input to this script.

This script does the following:

  1. Scales images to a specified size.

  2. Converts Polycam poses into the nerfstudio format.

ns-process-data polycam [-h] --data PATH --output-dir PATH
                        [--eval-data {None}|PATH] [--verbose | --no-verbose]
                        [--num-downscales INT]
                        [--use-uncorrected-images | --no-use-uncorrected-images]
                        [--max-dataset-size INT] [--min-blur-score FLOAT]
                        [--crop-border-pixels INT]
                        [--use-depth | --no-use-depth]

arguments#

--data

Path to the data, either a video file or a directory of images. (required)

--output-dir

Path to the output directory. (required)

--eval-data

Path to the eval data, either a video file or a directory of images. If set to None, the data from --data is used for both training and eval. (default: None)

--verbose, --no-verbose

If True, print extra logging. (default: False)

--num-downscales

Number of times to downscale the images. Downscales by 2 each time. For example a value of 3 will downscale the images by 2x, 4x, and 8x. (default: 3)

--use-uncorrected-images, --no-use-uncorrected-images

If True, use the raw images from the polycam export. If False, use the corrected images. (default: False)

--max-dataset-size

Max number of images to train on. If the dataset has more, images will be sampled approximately evenly. If -1, use all images. (default: 600)

--min-blur-score

Minimum blur score to use an image. If the blur score is below this value, the image will be skipped. (default: 25)

--crop-border-pixels

Number of pixels to crop from each border of the image. Useful as borders may be black due to undistortion. (default: 15)

--use-depth, --no-use-depth

If True, processes the generated depth maps from Polycam (default: False)
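For example, converting a raw Polycam export with its depth maps might look like this (the paths are illustrative):

```shell
# Convert a Polycam raw export folder into a nerfstudio dataset,
# keeping the Polycam depth maps. ./polycam-export is a placeholder path.
ns-process-data polycam \
    --data ./polycam-export \
    --output-dir ./my-dataset \
    --use-depth
```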

metashape#

Process Metashape data into a nerfstudio dataset. This script assumes that cameras have been aligned using Metashape. After alignment, it is necessary to export the camera poses as a .xml file. This option can be found under File > Export > Export Cameras.

This script does the following:

  1. Scales images to a specified size.

  2. Converts Metashape poses into the nerfstudio format.

ns-process-data metashape [-h] --xml PATH --data PATH --output-dir PATH
                          [--eval-data {None}|PATH] [--verbose | --no-verbose]
                          [--num-downscales INT] [--max-dataset-size INT]

arguments#

--xml

Path to the Metashape xml file. (required)

--data

Path to the data, either a video file or a directory of images. (required)

--output-dir

Path to the output directory. (required)

--eval-data

Path to the eval data, either a video file or a directory of images. If set to None, the data from --data is used for both training and eval. (default: None)

--verbose, --no-verbose

If True, print extra logging. (default: False)

--num-downscales

Number of times to downscale the images. Downscales by 2 each time. For example a value of 3 will downscale the images by 2x, 4x, and 8x. (default: 3)

--max-dataset-size

Max number of images to train on. If the dataset has more, images will be sampled approximately evenly. If -1, use all images. (default: 600)
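After exporting cameras from Metashape, an invocation might look like this (the paths are illustrative):

```shell
# Use a Metashape camera export (File > Export > Export Cameras)
# plus the source images. All paths below are placeholders.
ns-process-data metashape \
    --xml ./cameras.xml \
    --data ./source-images \
    --output-dir ./my-dataset
```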

realitycapture#

Process RealityCapture data into a nerfstudio dataset. This script assumes that cameras have been aligned using RealityCapture. After alignment, it is necessary to export the camera poses as a .csv file using the Internal/External camera parameters option.

This script does the following:

  1. Scales images to a specified size.

  2. Converts RealityCapture poses into the nerfstudio format.

ns-process-data realitycapture [-h] --csv PATH --data PATH --output-dir PATH
                               [--eval-data {None}|PATH]
                               [--verbose | --no-verbose]
                               [--num-downscales INT] [--max-dataset-size INT]

arguments#

--csv

Path to the RealityCapture cameras CSV file. (required)

--data

Path to the data, either a video file or a directory of images. (required)

--output-dir

Path to the output directory. (required)

--eval-data

Path to the eval data, either a video file or a directory of images. If set to None, the data from --data is used for both training and eval. (default: None)

--verbose, --no-verbose

If True, print extra logging. (default: False)

--num-downscales

Number of times to downscale the images. Downscales by 2 each time. For example a value of 3 will downscale the images by 2x, 4x, and 8x. (default: 3)

--max-dataset-size

Max number of images to train on. If the dataset has more, images will be sampled approximately evenly. If -1, use all images. (default: 600)
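With a RealityCapture Internal/External parameters export, an invocation might look like this (the paths are illustrative):

```shell
# Use a RealityCapture camera parameters CSV plus the source images.
# All paths below are placeholders.
ns-process-data realitycapture \
    --csv ./cameras.csv \
    --data ./source-images \
    --output-dir ./my-dataset
```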

record3d#

Process Record3D data into a nerfstudio dataset. This script does the following:

  1. Scales images to a specified size.

  2. Converts Record3D poses into the nerfstudio format.

ns-process-data record3d [-h] --data PATH --output-dir PATH
                         [--eval-data {None}|PATH] [--verbose | --no-verbose]
                         [--num-downscales INT] [--max-dataset-size INT]

arguments#

--data

Path to the data, either a video file or a directory of images. (required)

--output-dir

Path to the output directory. (required)

--eval-data

Path to the eval data, either a video file or a directory of images. If set to None, the data from --data is used for both training and eval. (default: None)

--verbose, --no-verbose

If True, print extra logging. (default: False)

--num-downscales

Number of times to downscale the images. Downscales by 2 each time. For example a value of 3 will downscale the images by 2x, 4x, and 8x. (default: 3)

--max-dataset-size

Max number of images to train on. If the dataset has more, images will be sampled approximately evenly. If -1, use all images. (default: 300)
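A minimal Record3D invocation might look like this (the paths are illustrative):

```shell
# Convert a Record3D export into a nerfstudio dataset, capping the
# number of frames used. ./record3d-export is a placeholder path.
ns-process-data record3d \
    --data ./record3d-export \
    --output-dir ./my-dataset \
    --max-dataset-size 300
```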

odm#

Process ODM data into a nerfstudio dataset. This script does the following:

  1. Scales images to a specified size.

  2. Converts ODM poses into the nerfstudio format.

ns-process-data odm [-h] --data PATH --output-dir PATH
                    [--eval-data {None}|PATH] [--verbose | --no-verbose]
                    [--num-downscales INT] [--max-dataset-size INT]

arguments#

--data

Path to the data, either a video file or a directory of images. (required)

--output-dir

Path to the output directory. (required)

--eval-data

Path to the eval data, either a video file or a directory of images. If set to None, the data from --data is used for both training and eval. (default: None)

--verbose, --no-verbose

If True, print extra logging. (default: False)

--num-downscales

Number of times to downscale the images. Downscales by 2 each time. For example a value of 3 will downscale the images by 2x, 4x, and 8x. (default: 3)

--max-dataset-size

Max number of images to train on. If the dataset has more, images will be sampled approximately evenly. If -1, use all images. (default: 600)
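A minimal ODM invocation might look like this (the path is illustrative):

```shell
# Convert an ODM project into a nerfstudio dataset.
# ./odm-project is a placeholder path.
ns-process-data odm \
    --data ./odm-project \
    --output-dir ./my-dataset
```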