ns-process-data#
Note
Make sure to have COLMAP and FFmpeg installed.
You may also want to install hloc (optional) for more feature detector and matcher options.
usage: ns-process-data [-h] {images,video,polycam,metashape,realitycapture,record3d,odm}
subcommands#
- {images,video,polycam,metashape,realitycapture,record3d,odm}
Possible choices: images, video, polycam, metashape, realitycapture, record3d, odm
Sub-commands:#
images#
Process images into a nerfstudio dataset. This script does the following:
1. Scales images to a specified size.
2. Calculates the camera poses for each image using COLMAP.
ns-process-data images [-h] --data PATH --output-dir PATH
[--eval-data {None}|PATH] [--verbose | --no-verbose]
[--camera-type {perspective,fisheye,equirectangular}]
[--matching-method {exhaustive,sequential,vocab_tree}]
[--sfm-tool {any,colmap,hloc}]
[--refine-pixsfm | --no-refine-pixsfm]
[--refine-intrinsics | --no-refine-intrinsics]
[--feature-type {any,sift,superpoint,superpoint_aachen,superpoint_max,superpoint_inloc,r2d2,d2net-ss,sosnet,disk}]
[--matcher-type
{any,NN,superglue,superglue-fast,NN-superpoint,NN-ratio,NN-mutual,adalam}]
[--num-downscales INT]
[--skip-colmap | --no-skip-colmap]
[--skip-image-processing | --no-skip-image-processing]
[--colmap-model-path PATH] [--colmap-cmd STR]
[--images-per-equirect {8,14}]
[--crop-factor FLOAT FLOAT FLOAT FLOAT]
[--crop-bottom FLOAT] [--gpu | --no-gpu]
[--use-sfm-depth | --no-use-sfm-depth]
[--include-depth-debug | --no-include-depth-debug]
[--same-dimensions | --no-same-dimensions]
[--percent-radius-crop FLOAT]
arguments#
- --data
Path to the data, either a video file or a directory of images. (required)
- --output-dir
Path to the output directory. (required)
- --eval-data
Path to the eval data, either a video file or a directory of images. If set to None, the data from --data is used for both training and eval. (default: None)
- --verbose, --no-verbose
If True, print extra logging. (default: False)
- --camera-type
Possible choices: perspective, fisheye, equirectangular
Camera model to use. (default: perspective)
- --matching-method
Possible choices: exhaustive, vocab_tree, sequential
Feature matching method to use. Vocab tree is recommended for a balance of speed and accuracy. Exhaustive is slower but more accurate. Sequential is faster but should only be used for videos. (default: vocab_tree)
- --sfm-tool
Possible choices: hloc, colmap, any
Structure-from-motion tool to use. COLMAP will use SIFT features; hloc can use many modern methods such as SuperPoint features and the SuperGlue matcher. (default: any)
- --refine-pixsfm, --no-refine-pixsfm
If True, runs refinement using Pixel-Perfect SfM. Only works with the hloc sfm-tool. (default: False)
- --refine-intrinsics, --no-refine-intrinsics
If True, do bundle adjustment to refine intrinsics. Only works with the colmap sfm-tool. (default: True)
- --feature-type
Possible choices: superpoint_inloc, r2d2, sift, disk, sosnet, superpoint_aachen, d2net-ss, superpoint_max, any, superpoint
Type of feature to use. (default: any)
- --matcher-type
Possible choices: adalam, superglue-fast, NN, NN-mutual, NN-superpoint, superglue, any, NN-ratio
Matching algorithm. (default: any)
- --num-downscales
Number of times to downscale the images. Downscales by 2 each time. For example a value of 3 will downscale the images by 2x, 4x, and 8x. (default: 3)
- --skip-colmap, --no-skip-colmap
If True, skips COLMAP and generates transforms.json if possible. (default: False)
- --skip-image-processing, --no-skip-image-processing
If True, skips copying and downscaling of images and only runs COLMAP if possible and enabled. (default: False)
- --colmap-model-path
Optionally sets the path of the COLMAP model. Used only when --skip-colmap is set to True. The path is relative to the output directory. (default: colmap/sparse/0)
- --colmap-cmd
How to call the COLMAP executable. (default: colmap)
- --images-per-equirect
Possible choices: 14, 8
Number of samples per image to take from each equirectangular image. Used only when camera-type is equirectangular. (default: 8)
- --crop-factor
Portion of the image to crop. All values should be in [0,1]. (top, bottom, left, right) (default: 0.0 0.0 0.0 0.0)
- --crop-bottom
Portion of the image to crop from the bottom. Equivalent to --crop-factor 0.0 [num] 0.0 0.0. Should be in [0,1]. (default: 0.0)
- --gpu, --no-gpu
If True, use GPU. (default: True)
- --use-sfm-depth, --no-use-sfm-depth
If True, export and use depth maps induced from SfM points. (default: False)
- --include-depth-debug, --no-include-depth-debug
If --use-sfm-depth and this flag is True, also export debug images showing SfM points overlaid upon input images. (default: False)
- --same-dimensions, --no-same-dimensions
Whether to assume all images have the same dimensions, enabling fast downscaling with no autorotation. (default: True)
- --percent-radius-crop
Create circle crop mask. The radius is the percent of the image diagonal. (default: 1.0)
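A minimal invocation might look like the following; the paths are placeholders for your own data:

```shell
# Process a directory of photos with the defaults: COLMAP (or hloc) is
# selected automatically, vocab_tree matching is used, and images are
# downscaled 3 times (2x, 4x, 8x).
ns-process-data images --data data/my-photos --output-dir outputs/my-photos
```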
video#
Process videos into a nerfstudio dataset. This script does the following:
1. Converts the video into images and downscales them.
2. Calculates the camera poses for each image using COLMAP.
ns-process-data video [-h] --data PATH --output-dir PATH
[--eval-data {None}|PATH] [--verbose | --no-verbose]
[--camera-type {perspective,fisheye,equirectangular}]
[--matching-method {exhaustive,sequential,vocab_tree}]
[--sfm-tool {any,colmap,hloc}]
[--refine-pixsfm | --no-refine-pixsfm]
[--refine-intrinsics | --no-refine-intrinsics]
[--feature-type {any,sift,superpoint,superpoint_aachen,superpoint_max,superpoint_inloc,r2d2,d2net-ss,sosnet,disk}]
[--matcher-type
{any,NN,superglue,superglue-fast,NN-superpoint,NN-ratio,NN-mutual,adalam}]
[--num-downscales INT]
[--skip-colmap | --no-skip-colmap]
[--skip-image-processing | --no-skip-image-processing]
[--colmap-model-path PATH] [--colmap-cmd STR]
[--images-per-equirect {8,14}]
[--crop-factor FLOAT FLOAT FLOAT FLOAT]
[--crop-bottom FLOAT] [--gpu | --no-gpu]
[--use-sfm-depth | --no-use-sfm-depth]
[--include-depth-debug | --no-include-depth-debug]
[--same-dimensions | --no-same-dimensions]
[--num-frames-target INT] [--percent-radius-crop FLOAT]
arguments#
- --data
Path to the data, either a video file or a directory of images. (required)
- --output-dir
Path to the output directory. (required)
- --eval-data
Path to the eval data, either a video file or a directory of images. If set to None, the data from --data is used for both training and eval. (default: None)
- --verbose, --no-verbose
If True, print extra logging. (default: False)
- --camera-type
Possible choices: perspective, fisheye, equirectangular
Camera model to use. (default: perspective)
- --matching-method
Possible choices: exhaustive, vocab_tree, sequential
Feature matching method to use. Vocab tree is recommended for a balance of speed and accuracy. Exhaustive is slower but more accurate. Sequential is faster but should only be used for videos. (default: vocab_tree)
- --sfm-tool
Possible choices: hloc, colmap, any
Structure-from-motion tool to use. COLMAP will use SIFT features; hloc can use many modern methods such as SuperPoint features and the SuperGlue matcher. (default: any)
- --refine-pixsfm, --no-refine-pixsfm
If True, runs refinement using Pixel-Perfect SfM. Only works with the hloc sfm-tool. (default: False)
- --refine-intrinsics, --no-refine-intrinsics
If True, do bundle adjustment to refine intrinsics. Only works with the colmap sfm-tool. (default: True)
- --feature-type
Possible choices: superpoint_inloc, r2d2, sift, disk, sosnet, superpoint_aachen, d2net-ss, superpoint_max, any, superpoint
Type of feature to use. (default: any)
- --matcher-type
Possible choices: adalam, superglue-fast, NN, NN-mutual, NN-superpoint, superglue, any, NN-ratio
Matching algorithm. (default: any)
- --num-downscales
Number of times to downscale the images. Downscales by 2 each time. For example a value of 3 will downscale the images by 2x, 4x, and 8x. (default: 3)
- --skip-colmap, --no-skip-colmap
If True, skips COLMAP and generates transforms.json if possible. (default: False)
- --skip-image-processing, --no-skip-image-processing
If True, skips copying and downscaling of images and only runs COLMAP if possible and enabled. (default: False)
- --colmap-model-path
Optionally sets the path of the COLMAP model. Used only when --skip-colmap is set to True. The path is relative to the output directory. (default: colmap/sparse/0)
- --colmap-cmd
How to call the COLMAP executable. (default: colmap)
- --images-per-equirect
Possible choices: 14, 8
Number of samples per image to take from each equirectangular image. Used only when camera-type is equirectangular. (default: 8)
- --crop-factor
Portion of the image to crop. All values should be in [0,1]. (top, bottom, left, right) (default: 0.0 0.0 0.0 0.0)
- --crop-bottom
Portion of the image to crop from the bottom. Equivalent to --crop-factor 0.0 [num] 0.0 0.0. Should be in [0,1]. (default: 0.0)
- --gpu, --no-gpu
If True, use GPU. (default: True)
- --use-sfm-depth, --no-use-sfm-depth
If True, export and use depth maps induced from SfM points. (default: False)
- --include-depth-debug, --no-include-depth-debug
If --use-sfm-depth and this flag is True, also export debug images showing SfM points overlaid upon input images. (default: False)
- --same-dimensions, --no-same-dimensions
Whether to assume all images have the same dimensions, enabling fast downscaling with no autorotation. (default: True)
- --num-frames-target
Target number of frames to use per video, results may not be exact. (default: 300)
- --percent-radius-crop
Create circle crop mask. The radius is the percent of the image diagonal. (default: 1.0)
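A typical invocation might look like the following; the paths are placeholders for your own capture:

```shell
# Extract roughly 300 frames from a video, downscale them, and run COLMAP.
ns-process-data video --data data/capture.mp4 --output-dir outputs/capture \
    --num-frames-target 300
```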
polycam#
Process Polycam data into a nerfstudio dataset. To capture data, use the Polycam app on an iPhone or iPad with LiDAR. The capture must be in LiDAR or ROOM mode. Developer mode must be enabled in the app settings; this will enable a raw data export option in the export menus. The exported data folder is used as the input to this script.
This script does the following:
1. Scales images to a specified size.
2. Converts Polycam poses into the nerfstudio format.
ns-process-data polycam [-h] --data PATH --output-dir PATH
[--eval-data {None}|PATH] [--verbose | --no-verbose]
[--num-downscales INT]
[--use-uncorrected-images |
--no-use-uncorrected-images]
[--max-dataset-size INT] [--min-blur-score FLOAT]
[--crop-border-pixels INT]
[--use-depth | --no-use-depth]
arguments#
- --data
Path to the data, either a video file or a directory of images. (required)
- --output-dir
Path to the output directory. (required)
- --eval-data
Path to the eval data, either a video file or a directory of images. If set to None, the data from --data is used for both training and eval. (default: None)
- --verbose, --no-verbose
If True, print extra logging. (default: False)
- --num-downscales
Number of times to downscale the images. Downscales by 2 each time. For example a value of 3 will downscale the images by 2x, 4x, and 8x. (default: 3)
- --use-uncorrected-images, --no-use-uncorrected-images
If True, use the raw images from the Polycam export. If False, use the corrected images. (default: False)
- --max-dataset-size
Max number of images to train on. If the dataset has more, images will be sampled approximately evenly. If -1, use all images. (default: 600)
- --min-blur-score
Minimum blur score to use an image. If the blur score is below this value, the image will be skipped. (default: 25)
- --crop-border-pixels
Number of pixels to crop from each border of the image. Useful as borders may be black due to undistortion. (default: 15)
- --use-depth, --no-use-depth
If True, processes the generated depth maps from Polycam (default: False)
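An example invocation, with a placeholder path standing in for the folder produced by Polycam's raw data export:

```shell
# Process a raw Polycam export; --use-depth additionally processes the
# depth maps generated by the app.
ns-process-data polycam --data data/polycam-export --output-dir outputs/polycam \
    --use-depth
```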
metashape#
Process Metashape data into a nerfstudio dataset. This script assumes that cameras have been aligned using Metashape. After alignment, it is necessary to export the camera poses as a .xml file. This option can be found under File > Export > Export Cameras.
This script does the following:
1. Scales images to a specified size.
2. Converts Metashape poses into the nerfstudio format.
ns-process-data metashape [-h] --xml PATH --data PATH --output-dir PATH
[--eval-data {None}|PATH] [--verbose | --no-verbose]
[--num-downscales INT] [--max-dataset-size INT]
arguments#
- --xml
Path to the Metashape xml file. (required)
- --data
Path to the data, either a video file or a directory of images. (required)
- --output-dir
Path to the output directory. (required)
- --eval-data
Path to the eval data, either a video file or a directory of images. If set to None, the data from --data is used for both training and eval. (default: None)
- --verbose, --no-verbose
If True, print extra logging. (default: False)
- --num-downscales
Number of times to downscale the images. Downscales by 2 each time. For example a value of 3 will downscale the images by 2x, 4x, and 8x. (default: 3)
- --max-dataset-size
Max number of images to train on. If the dataset has more, images will be sampled approximately evenly. If -1, use all images. (default: 600)
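An example invocation with placeholder paths; `cameras.xml` stands in for the file exported via File > Export > Export Cameras in Metashape:

```shell
# Convert an aligned Metashape project (camera XML export + source images)
# into a nerfstudio dataset.
ns-process-data metashape --data data/images --xml data/cameras.xml \
    --output-dir outputs/metashape
```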
realitycapture#
Process RealityCapture data into a nerfstudio dataset. This script assumes that cameras have been aligned using RealityCapture. After alignment, it is necessary to export the camera poses as a .csv file using the Internal/External camera parameters option.
This script does the following:
1. Scales images to a specified size.
2. Converts RealityCapture poses into the nerfstudio format.
ns-process-data realitycapture [-h] --csv PATH --data PATH --output-dir PATH
[--eval-data {None}|PATH]
[--verbose | --no-verbose]
[--num-downscales INT] [--max-dataset-size INT]
arguments#
- --csv
Path to the RealityCapture cameras CSV file. (required)
- --data
Path to the data, either a video file or a directory of images. (required)
- --output-dir
Path to the output directory. (required)
- --eval-data
Path to the eval data, either a video file or a directory of images. If set to None, the data from --data is used for both training and eval. (default: None)
- --verbose, --no-verbose
If True, print extra logging. (default: False)
- --num-downscales
Number of times to downscale the images. Downscales by 2 each time. For example a value of 3 will downscale the images by 2x, 4x, and 8x. (default: 3)
- --max-dataset-size
Max number of images to train on. If the dataset has more, images will be sampled approximately evenly. If -1, use all images. (default: 600)
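An example invocation with placeholder paths; `cameras.csv` stands in for the file exported from RealityCapture using the Internal/External camera parameters option:

```shell
# Convert an aligned RealityCapture project (camera CSV export + source
# images) into a nerfstudio dataset.
ns-process-data realitycapture --data data/images --csv data/cameras.csv \
    --output-dir outputs/realitycapture
```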
record3d#
Process Record3D data into a nerfstudio dataset. This script does the following:
1. Scales images to a specified size.
2. Converts Record3D poses into the nerfstudio format.
ns-process-data record3d [-h] --data PATH --output-dir PATH
[--eval-data {None}|PATH] [--verbose | --no-verbose]
[--num-downscales INT] [--max-dataset-size INT]
arguments#
- --data
Path to the data, either a video file or a directory of images. (required)
- --output-dir
Path to the output directory. (required)
- --eval-data
Path to the eval data, either a video file or a directory of images. If set to None, the data from --data is used for both training and eval. (default: None)
- --verbose, --no-verbose
If True, print extra logging. (default: False)
- --num-downscales
Number of times to downscale the images. Downscales by 2 each time. For example a value of 3 will downscale the images by 2x, 4x, and 8x. (default: 3)
- --max-dataset-size
Max number of images to train on. If the dataset has more, images will be sampled approximately evenly. If -1, use all images. (default: 300)
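An example invocation, with a placeholder path standing in for your Record3D export:

```shell
# Convert a Record3D export into a nerfstudio dataset, capping the
# dataset at the default 300 images.
ns-process-data record3d --data data/record3d-export --output-dir outputs/record3d
```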
odm#
Process ODM data into a nerfstudio dataset. This script does the following:
1. Scales images to a specified size.
2. Converts ODM poses into the nerfstudio format.
ns-process-data odm [-h] --data PATH --output-dir PATH
[--eval-data {None}|PATH] [--verbose | --no-verbose]
[--num-downscales INT] [--max-dataset-size INT]
arguments#
- --data
Path to the data, either a video file or a directory of images. (required)
- --output-dir
Path to the output directory. (required)
- --eval-data
Path to the eval data, either a video file or a directory of images. If set to None, the data from --data is used for both training and eval. (default: None)
- --verbose, --no-verbose
If True, print extra logging. (default: False)
- --num-downscales
Number of times to downscale the images. Downscales by 2 each time. For example a value of 3 will downscale the images by 2x, 4x, and 8x. (default: 3)
- --max-dataset-size
Max number of images to train on. If the dataset has more, images will be sampled approximately evenly. If -1, use all images. (default: 600)
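An example invocation, with a placeholder path standing in for a reconstructed ODM project directory:

```shell
# Convert an ODM reconstruction into a nerfstudio dataset.
ns-process-data odm --data data/odm-project --output-dir outputs/odm
```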