Utils#

Base#

This file was copied with small modifications from:

https://github.com/colmap/colmap/blob/1a4d0bad2e90aa65ce997c9d1779518eaed998d5/scripts/python/read_write_model.py

TODO(1480) Delete this file when moving to pycolmap.

nerfstudio.data.utils.colmap_parsing_utils.BaseImage#: alias of Image

class nerfstudio.data.utils.colmap_parsing_utils.Camera(id, model, width, height, params)#

height#: Alias for field number 3

id#: Alias for field number 0

model#: Alias for field number 1

params#: Alias for field number 4

width#: Alias for field number 2
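The "Alias for field number N" entries indicate that Camera behaves like a named tuple. The sketch below rebuilds a stand-in with the same field order using `collections.namedtuple`; the values are illustrative only, and the stand-in is an assumption rather than the class shipped in the module.

```python
from collections import namedtuple

# Stand-in with the same field order as the documented Camera class.
Camera = namedtuple("Camera", ["id", "model", "width", "height", "params"])

# Illustrative values only (assumed PINHOLE layout: fx, fy, cx, cy).
cam = Camera(id=1, model="PINHOLE", width=640, height=480,
             params=(500.0, 500.0, 320.0, 240.0))

# Attribute access and positional access refer to the same field.
assert cam.height == cam[3]
assert cam.params == cam[4]

# Named tuples are immutable; _replace returns a modified copy.
half_res = cam._replace(width=cam.width // 2, height=cam.height // 2)
```

Because the tuple is immutable, `cam` keeps its original resolution after `_replace`.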

class nerfstudio.data.utils.colmap_parsing_utils.CameraModel(model_id, model_name, num_params)#

model_id#: Alias for field number 0

model_name#: Alias for field number 1

num_params#: Alias for field number 2
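A CameraModel record ties a numeric model id to a model name and a parameter count. The following sketch shows how such records might be indexed by id and by name. The four entries are quoted from COLMAP's camera model list as an assumption, since this page does not enumerate them, and `BY_ID` / `BY_NAME` are hypothetical helper names.

```python
from collections import namedtuple

CameraModel = namedtuple("CameraModel", ["model_id", "model_name", "num_params"])

# Partial table (assumption: ids and parameter counts as used by COLMAP).
CAMERA_MODELS = [
    CameraModel(model_id=0, model_name="SIMPLE_PINHOLE", num_params=3),
    CameraModel(model_id=1, model_name="PINHOLE", num_params=4),
    CameraModel(model_id=2, model_name="SIMPLE_RADIAL", num_params=4),
    CameraModel(model_id=4, model_name="OPENCV", num_params=8),
]

# Hypothetical lookup tables: binary files store ids, text files store names.
BY_ID = {m.model_id: m for m in CAMERA_MODELS}
BY_NAME = {m.model_name: m for m in CAMERA_MODELS}
```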

class nerfstudio.data.utils.colmap_parsing_utils.Image(id, qvec, tvec, camera_id, name, xys, point3D_ids)[source]#

class nerfstudio.data.utils.colmap_parsing_utils.Point3D(id, xyz, rgb, error, image_ids, point2D_idxs)#

error#: Alias for field number 3

id#: Alias for field number 0

image_ids#: Alias for field number 4

point2D_idxs#: Alias for field number 5

rgb#: Alias for field number 2

xyz#: Alias for field number 1

nerfstudio.data.utils.colmap_parsing_utils.read_cameras_binary(path_to_model_file)[source]#

see: src/base/reconstruction.cc: void Reconstruction::WriteCamerasBinary(const std::string& path) void Reconstruction::ReadCamerasBinary(const std::string& path)

nerfstudio.data.utils.colmap_parsing_utils.read_cameras_text(path)[source]#

see: src/base/reconstruction.cc: void Reconstruction::WriteCamerasText(const std::string& path) void Reconstruction::ReadCamerasText(const std::string& path)
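As a rough illustration of what reading a cameras text file involves, the sketch below parses lines of the form `CAMERA_ID MODEL WIDTH HEIGHT PARAMS...` and skips comment lines. `parse_cameras_text` is a hypothetical helper written for this example, not the module's function; it takes an iterable of lines instead of a path and stores the parameters as a plain tuple.

```python
import io
from collections import namedtuple

Camera = namedtuple("Camera", ["id", "model", "width", "height", "params"])


def parse_cameras_text(lines):
    """Parse COLMAP-style camera lines into a dict keyed by camera id."""
    cameras = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        elems = line.split()
        camera_id = int(elems[0])
        cameras[camera_id] = Camera(
            id=camera_id,
            model=elems[1],
            width=int(elems[2]),
            height=int(elems[3]),
            params=tuple(float(p) for p in elems[4:]),
        )
    return cameras


# Illustrative file content, not taken from a real reconstruction.
SAMPLE = """# Camera list with one line of data per camera:
#   CAMERA_ID, MODEL, WIDTH, HEIGHT, PARAMS[]
1 PINHOLE 640 480 500.0 500.0 320.0 240.0
"""
cameras = parse_cameras_text(io.StringIO(SAMPLE))
```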

nerfstudio.data.utils.colmap_parsing_utils.read_images_binary(path_to_model_file)[source]#

see: src/base/reconstruction.cc: void Reconstruction::ReadImagesBinary(const std::string& path) void Reconstruction::WriteImagesBinary(const std::string& path)

nerfstudio.data.utils.colmap_parsing_utils.read_images_text(path)[source]#

see: src/base/reconstruction.cc: void Reconstruction::ReadImagesText(const std::string& path) void Reconstruction::WriteImagesText(const std::string& path)

nerfstudio.data.utils.colmap_parsing_utils.read_next_bytes(fid, num_bytes, format_char_sequence, endian_character='<')[source]#

Read and unpack the next bytes from a binary file.

Parameters:

fid – Open binary file object to read from.
num_bytes – Sum of combination of {2, 4, 8}, e.g. 2, 6, 16, 30, etc.
format_char_sequence – List of {c, e, f, d, h, H, i, I, l, L, q, Q}.
endian_character – Any of {@, =, <, >, !}.

Returns:

Tuple of read and unpacked values.
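The format characters and endian characters listed here are those of Python's `struct` module. The sketch below shows one way a function with this signature could be written and used. `read_next_bytes_sketch` is a stand-in written for this example and is assumed, not confirmed, to match the module's behaviour; the record layout `iiQQ` is likewise only an illustration.

```python
import io
import struct


def read_next_bytes_sketch(fid, num_bytes, format_char_sequence, endian_character="<"):
    """Read num_bytes from fid and unpack them according to the format string."""
    data = fid.read(num_bytes)
    return struct.unpack(endian_character + format_char_sequence, data)


# 24 bytes: two 4-byte signed ints followed by two 8-byte unsigned ints.
buffer = io.BytesIO(struct.pack("<iiQQ", 1, 2, 640, 480))
record = read_next_bytes_sketch(buffer, num_bytes=24, format_char_sequence="iiQQ")
```

Note that `num_bytes` must equal the size implied by the format string (`struct.calcsize("<iiQQ")` is 24), otherwise `struct.unpack` raises `struct.error`.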

nerfstudio.data.utils.colmap_parsing_utils.read_points3D_binary(path_to_model_file)[source]#

see: src/base/reconstruction.cc: void Reconstruction::ReadPoints3DBinary(const std::string& path) void Reconstruction::WritePoints3DBinary(const std::string& path)

nerfstudio.data.utils.colmap_parsing_utils.read_points3D_text(path)[source]#

see: src/base/reconstruction.cc: void Reconstruction::ReadPoints3DText(const std::string& path) void Reconstruction::WritePoints3DText(const std::string& path)

nerfstudio.data.utils.colmap_parsing_utils.write_cameras_binary(cameras, path_to_model_file)[source]#

see: src/base/reconstruction.cc: void Reconstruction::WriteCamerasBinary(const std::string& path) void Reconstruction::ReadCamerasBinary(const std::string& path)

nerfstudio.data.utils.colmap_parsing_utils.write_cameras_text(cameras, path)[source]#

see: src/base/reconstruction.cc: void Reconstruction::WriteCamerasText(const std::string& path) void Reconstruction::ReadCamerasText(const std::string& path)

nerfstudio.data.utils.colmap_parsing_utils.write_images_binary(images, path_to_model_file)[source]#

see: src/base/reconstruction.cc: void Reconstruction::ReadImagesBinary(const std::string& path) void Reconstruction::WriteImagesBinary(const std::string& path)

nerfstudio.data.utils.colmap_parsing_utils.write_images_text(images, path)[source]#

see: src/base/reconstruction.cc: void Reconstruction::ReadImagesText(const std::string& path) void Reconstruction::WriteImagesText(const std::string& path)

nerfstudio.data.utils.colmap_parsing_utils.write_next_bytes(fid, data, format_char_sequence, endian_character='<')[source]#

Pack and write to a binary file.

Parameters:

fid – Open binary file object to write to.
data – Data to write. If multiple elements are written at the same time, they should be encapsulated in either a list or a tuple.
format_char_sequence – List of {c, e, f, d, h, H, i, I, l, L, q, Q}. Should be the same length as the data list or tuple.
endian_character – Any of {@, =, <, >, !}.
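The requirement that the format string match the length of the data is easiest to see with `struct.pack`. The sketch below is a stand-in under the assumption that the function packs a scalar directly and unpacks a list or tuple into positional arguments; `write_next_bytes_sketch` is a hypothetical name, not the module's function.

```python
import io
import struct


def write_next_bytes_sketch(fid, data, format_char_sequence, endian_character="<"):
    """Pack data according to the format string and write it to fid."""
    fmt = endian_character + format_char_sequence
    if isinstance(data, (list, tuple)):
        packed = struct.pack(fmt, *data)  # one format character per element
    else:
        packed = struct.pack(fmt, data)   # single scalar value
    fid.write(packed)


buffer = io.BytesIO()
write_next_bytes_sketch(buffer, 3, "Q")                  # a count as uint64
write_next_bytes_sketch(buffer, [0.5, 1.5, 2.5], "ddd")  # three doubles

# Read the bytes back to check the round trip.
raw = buffer.getvalue()
count = struct.unpack("<Q", raw[:8])[0]
values = struct.unpack("<ddd", raw[8:])
```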

nerfstudio.data.utils.colmap_parsing_utils.write_points3D_binary(points3D, path_to_model_file)[source]#

see: src/base/reconstruction.cc: void Reconstruction::ReadPoints3DBinary(const std::string& path) void Reconstruction::WritePoints3DBinary(const std::string& path)

nerfstudio.data.utils.colmap_parsing_utils.write_points3D_text(points3D, path)[source]#

see: src/base/reconstruction.cc: void Reconstruction::ReadPoints3DText(const std::string& path) void Reconstruction::WritePoints3DText(const std::string& path)

Data#

Utility functions to allow easy re-use of common operations across dataloaders

nerfstudio.data.utils.data_utils.get_depth_image_from_path(filepath: Path, height: int, width: int, scale_factor: float, interpolation: int = 0) → Tensor[source]#

Loads, rescales and resizes depth images. Filepath points to a 16-bit or 32-bit depth image, or a numpy array *.npy.

Parameters:

filepath – Path to depth image.
height – Target depth image height.
width – Target depth image width.
scale_factor – Factor by which to scale depth image.
interpolation – Depth value interpolation for resizing.

Returns:

Depth image torch tensor with shape [height, width, 1].

nerfstudio.data.utils.data_utils.get_image_mask_tensor_from_path(filepath: Union[Path, IO[bytes]], scale_factor: float = 1.0) → Tensor[source]#: Utility function to read a mask image from the given path and return a boolean tensor

nerfstudio.data.utils.data_utils.get_semantics_and_mask_tensors_from_path(filepath: Path, mask_indices: Union[List, Tensor], scale_factor: float = 1.0) → Tuple[Tensor, Tensor][source]#: Utility function to read a segmentation from the given filepath. If no mask is required, use mask_indices = [].

nerfstudio.data.utils.data_utils.identity_collate(x)[source]#: This function does nothing, but gives our dataloaders a pickleable function to use, since lambdas are not pickleable.
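The pickling constraint matters because dataloader worker processes may need to receive the collate function by pickling it. The sketch below contrasts a lambda with a named function; `identity_collate` is re-declared locally for illustration and `can_pickle` is a hypothetical helper.

```python
import pickle


def identity_collate(x):
    """Return the input unchanged, mirroring the documented helper."""
    return x


def can_pickle(obj):
    """Report whether the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# A lambda has no importable name, so pickling by reference fails.
lambda_ok = can_pickle(lambda x: x)

# A named function pickles by reference when its module is importable.
named_ok = can_pickle(identity_collate)
```

Functions are pickled by qualified name rather than by value, which is why a module-level `def` is usable where an anonymous lambda is not.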

nerfstudio.data.utils.data_utils.pil_to_numpy(im: Image) → ndarray[source]#

Converts a PIL Image object to a NumPy array.

Parameters:: im (PIL.Image.Image) – The input PIL Image object.
Returns:: numpy.ndarray representing the image data.

Dataloader#

Code for sampling images from a dataset of images.

class nerfstudio.data.utils.dataloaders.CacheDataloader(dataset: ~torch.utils.data.dataset.Dataset, num_images_to_sample_from: ~typing.Union[int, float] = inf, num_times_to_repeat_images: ~typing.Union[int, float] = inf, device: ~typing.Union[~torch.device, str] = 'cpu', collate_fn: ~typing.Callable[[~typing.Any], ~typing.Any] = <function nerfstudio_collate>, exclude_batch_keys_from_device: ~typing.Optional[~typing.List[str]] = None, **kwargs)[source]#

Collated image dataset that implements caching of default-pytorch-collatable data. Creates batches of the InputDataset return type.

Parameters:

dataset – Dataset to sample from.
num_images_to_sample_from – How many images to sample rays from for each batch. -1 or infinity for all images.
num_times_to_repeat_images – How often to yield an image batch before resampling. -1 or infinity to never pick new images.
device – Device to perform computation.
collate_fn – The function we will use to collate our training data
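The interaction between the image subset size and the repeat count can be sketched without PyTorch. `CachedSubsetSampler` below is a hypothetical stand-in that tracks only indices: it keeps a cached subset, hands it out a fixed number of times, and then draws a new one. It illustrates the two parameters and is not the class's actual implementation.

```python
import random


class CachedSubsetSampler:
    """Hypothetical stand-in: caches a random subset of indices and reuses it."""

    def __init__(self, num_items, num_images_to_sample_from,
                 num_times_to_repeat_images, seed=0):
        self.num_items = num_items
        # -1 or infinity means "use every image".
        use_all = (num_images_to_sample_from == -1
                   or num_images_to_sample_from >= num_items)
        self.subset_size = num_items if use_all else int(num_images_to_sample_from)
        self.num_repeats = num_times_to_repeat_images
        self.rng = random.Random(seed)
        self.cached = None
        self.uses = 0

    def next_indices(self):
        # -1 or infinity means "never pick new images".
        never_resample = self.num_repeats == -1 or self.num_repeats == float("inf")
        stale = self.cached is None or (
            not never_resample and self.uses >= self.num_repeats)
        if stale:
            self.cached = sorted(
                self.rng.sample(range(self.num_items), self.subset_size))
            self.uses = 0
        self.uses += 1
        return self.cached


sampler = CachedSubsetSampler(num_items=100, num_images_to_sample_from=4,
                              num_times_to_repeat_images=3)
batches = [sampler.next_indices() for _ in range(6)]
```

Here the first three calls return the same cached subset and the fourth call triggers a resample.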

class nerfstudio.data.utils.dataloaders.EvalDataloader(input_dataset: InputDataset, device: Union[device, str] = 'cpu', **kwargs)[source]#

Evaluation dataloader base class

Parameters:

input_dataset – InputDataset to load data from
device – Device to load data to

abstract __iter__()[source]#: Iterates over the dataset

abstract __next__() → Tuple[RayBundle, Dict][source]#: Returns the next batch of data

get_camera(image_idx: int = 0) → Tuple[Cameras, Dict][source]#

Get camera for the given image index

Parameters:: image_idx – Camera image index

get_data_from_image_idx(image_idx: int) → Tuple[RayBundle, Dict][source]#

Returns the data for a specific image index.

Parameters:: image_idx – Camera image index

class nerfstudio.data.utils.dataloaders.FixedIndicesEvalDataloader(input_dataset: InputDataset, image_indices: Optional[Tuple[int]] = None, device: Union[device, str] = 'cpu', **kwargs)[source]#

Dataloader that returns a fixed set of indices.

Parameters:

input_dataset – InputDataset to load data from
image_indices – List of image indices to load data from. If None, then use all images.
device – Device to load data to
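The iteration pattern described here, a fixed list of indices with a fallback to all images, can be sketched with a plain Python iterator. `FixedIndicesIterator` is a hypothetical stand-in that yields `(index, item)` pairs from any indexable dataset; the real class returns camera or ray data instead.

```python
class FixedIndicesIterator:
    """Hypothetical stand-in: iterate a dataset over a fixed set of indices."""

    def __init__(self, dataset, image_indices=None):
        self.dataset = dataset
        # None means "use all images".
        if image_indices is None:
            self.image_indices = list(range(len(dataset)))
        else:
            self.image_indices = list(image_indices)
        self.count = 0

    def __iter__(self):
        self.count = 0  # restart from the first index on each new loop
        return self

    def __next__(self):
        if self.count >= len(self.image_indices):
            raise StopIteration
        image_idx = self.image_indices[self.count]
        self.count += 1
        return image_idx, self.dataset[image_idx]


dataset = ["img0", "img1", "img2", "img3"]  # illustrative placeholder data
selected = list(FixedIndicesIterator(dataset, image_indices=(0, 2)))
everything = list(FixedIndicesIterator(dataset))
```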

class nerfstudio.data.utils.dataloaders.ImageBatchStream(input_dataset: InputDataset, sampling_seed: int = 3301, cache_images_type: Literal['uint8', 'float32'] = 'float32', device: Union[device, str] = 'cpu', custom_image_processor: Optional[Callable[[Cameras, Dict], Tuple[Cameras, Dict]]] = None)[source]#: A wrapper of InputDataset that outputs undistorted full images and cameras. This makes the datamanager more lightweight since we don’t have to generate rays. Useful for full-image training, e.g. rasterization pipelines.

class nerfstudio.data.utils.dataloaders.RandIndicesEvalDataloader(input_dataset: InputDataset, device: Union[device, str] = 'cpu', **kwargs)[source]#

Dataloader that returns random images.

Parameters:

input_dataset – InputDataset to load data from
device – Device to load data to

class nerfstudio.data.utils.dataloaders.RayBatchStream(input_dataset: ~nerfstudio.data.datasets.base_dataset.InputDataset, sampling_seed: int = 3301, num_rays_per_batch: int = 1024, num_images_to_sample_from: ~typing.Union[int, float] = inf, num_times_to_repeat_images: ~typing.Union[int, float] = inf, device: ~typing.Union[~torch.device, str] = 'cpu', collate_fn: ~typing.Callable[[~typing.Any], ~typing.Any] = <staticmethod object>, num_image_load_threads: int = 4, exclude_batch_keys_from_device: ~typing.Optional[~typing.List[str]] = None, load_from_disk: bool = False, patch_size: int = 1, custom_ray_processor: ~typing.Optional[~typing.Callable[[~nerfstudio.cameras.rays.RayBundle, ~typing.Dict], ~typing.Tuple[~nerfstudio.cameras.rays.RayBundle, ~typing.Dict]]] = None)[source]#

Wrapper around PyTorch’s IterableDataset to generate the next batch of rays (next RayBundle) and corresponding labels with multiple parallel workers.

Each worker samples a small batch of images, pixel samples those images, and generates rays for one training step. The same batch of images can be pixel sampled multiple times to hasten ray generation, as retrieving images is a process bottlenecked by disk read speed. To avoid Out-Of-Memory (OOM) errors, this batch of images is small and regenerated by resampling the worker’s partition of images to maintain sampling diversity.

__iter__()[source]#: This implementation allows every worker to cache only the indices of the images it will use to generate rays, which conserves RAM.
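The per-worker partitioning described above can be sketched with index arithmetic alone. In the example below, `partition_indices` and `sample_image_batch` are hypothetical helpers. Splitting into contiguous chunks is an assumption made for illustration, since the text does not say how the partition is formed.

```python
import random


def partition_indices(num_images, num_workers):
    """Split image indices into near-equal contiguous chunks, one per worker."""
    base, extra = divmod(num_images, num_workers)
    partitions, start = [], 0
    for worker_id in range(num_workers):
        size = base + (1 if worker_id < extra else 0)
        partitions.append(list(range(start, start + size)))
        start += size
    return partitions


def sample_image_batch(partition, num_images_to_sample_from, rng):
    """Draw a small batch of distinct image indices from one worker's partition."""
    k = min(num_images_to_sample_from, len(partition))
    return rng.sample(partition, k)


partitions = partition_indices(num_images=10, num_workers=4)
rng = random.Random(3301)
batch = sample_image_batch(partitions[0], num_images_to_sample_from=2, rng=rng)
```

Resampling from the same partition at intervals keeps each worker's memory footprint small while still covering all of its images over time.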

collate_fn#: What collate function is used to batch images to be used for pixel sampling and ray generation.

device#: If a CUDA GPU is present, self.device will be set to use that GPU.

exclude_batch_keys_from_device#: Which keys of the batch (such as ‘image’, ‘mask’, ‘depth’) to prevent from moving to the device. For instance, to conserve GPU memory, you can keep the image tensors off the GPU, at the cost of increased total training time. The default value is [‘image’].

load_from_disk#: If True, conserves RAM by loading images from disk. If False, each worker caches all the images in its dataset partition as tensors in RAM and loads from RAM.

num_image_load_threads#: Number of threads created to read images from disk and form collated batches.

num_images_to_sample_from#: How many images to sample to generate a RayBundle. More images means greater sampling diversity at the expense of increased RAM usage.

num_rays_per_batch#: Number of rays per batch to use per training iteration.

num_times_to_repeat_images#: How many RayBundles to generate from this batch of images after sampling num_images_to_sample_from images.

patch_size#: Size of patch to sample from. If > 1, patch-based sampling will be used.
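One way to picture patch-based sampling is to draw a random top-left corner and expand it into a square block of pixel coordinates. The sketch below does this with the standard library. `sample_patch_pixels` is a hypothetical helper, and the corner-based scheme is an assumption rather than the pixel sampler's documented behaviour.

```python
import random


def sample_patch_pixels(height, width, patch_size, num_patches, rng):
    """Return (row, col) coordinates for num_patches square patches."""
    pixels = []
    for _ in range(num_patches):
        # Choose a corner so that the whole patch stays inside the image.
        top = rng.randrange(height - patch_size + 1)
        left = rng.randrange(width - patch_size + 1)
        for dy in range(patch_size):
            for dx in range(patch_size):
                pixels.append((top + dy, left + dx))
    return pixels


rng = random.Random(0)
pixels = sample_patch_pixels(height=8, width=8, patch_size=2,
                             num_patches=3, rng=rng)
```

With `patch_size=1` this reduces to sampling independent pixels.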

pixel_sampler_config: PixelSamplerConfig#: Specifies the pixel sampler config used to sample pixels from images. Each worker will have its own pixel sampler.

ray_generator: Optional[RayGenerator]#: Each worker will have its own ray generator, so this is set to None for now.

nerfstudio.data.utils.dataloaders.undistort_view(idx: int, dataset: InputDataset, image_type: Literal['uint8', 'float32'] = 'float32') → Tuple[Cameras, Dict][source]#

Undistorts an image to one taken by a linear (pinhole) camera model and returns a new Camera with these updated intrinsics. Note: this method does not modify the dataset’s attributes at all.

Returns: The undistorted data (image, depth, mask, etc.) and the new linear Camera object

nerfstudio.data.utils.dataloaders.variable_res_collate(batch: List[Dict]) → Dict[source]#

Default collate function for our dataloader.

Parameters:: batch – Batch of samples from the dataset.

Returns:: Collated batch.

Nerfstudio Collate#

Custom collate function that includes cases for nerfstudio types.

nerfstudio.data.utils.nerfstudio_collate.nerfstudio_collate(batch: Any, extra_mappings: Optional[Dict[type, Callable]] = None) → Any[source]#

This is the default pytorch collate function, but with support for nerfstudio types. All documentation below is copied straight over from pytorch’s default_collate function, python version 3.8.13, pytorch version ‘1.12.1+cu113’. Custom nerfstudio types are accounted for at the end, and extra mappings can be passed in to handle custom types. These mappings are from types: callable (types being like int or float or the return value of type(3.), etc). The only code before we parse for custom types that was changed from default pytorch was the addition of the extra_mappings argument, a find and replace operation from default_collate to nerfstudio_collate, and the addition of the nerfstudio_collate_err_msg_format variable.

Function that takes in a batch of data and puts the elements within the batch into a tensor with an additional outer dimension - batch size. The exact output type can be a torch.Tensor, a Sequence of torch.Tensor, a Collection of torch.Tensor, or left unchanged, depending on the input type. This is used as the default function for collation when batch_size or batch_sampler is defined in DataLoader.

Here is the general input type (based on the type of the element within the batch) to output type mapping:

torch.Tensor -> torch.Tensor (with an added outer dimension batch size)

NumPy Arrays -> torch.Tensor

float -> torch.Tensor

int -> torch.Tensor

str -> str (unchanged)

bytes -> bytes (unchanged)

Mapping[K, V_i] -> Mapping[K, nerfstudio_collate([V_1, V_2, …])]

NamedTuple[V1_i, V2_i, …] -> NamedTuple[nerfstudio_collate([V1_1, V1_2, …]), nerfstudio_collate([V2_1, V2_2, …]), …]

Sequence[V1_i, V2_i, …] -> Sequence[nerfstudio_collate([V1_1, V1_2, …]), nerfstudio_collate([V2_1, V2_2, …]), …]

Parameters:: batch – a single batch to be collated

Examples

>>> # Example with a batch of `int`s:
>>> nerfstudio_collate([0, 1, 2, 3])
tensor([0, 1, 2, 3])
>>> # Example with a batch of `str`s:
>>> nerfstudio_collate(['a', 'b', 'c'])
['a', 'b', 'c']
>>> # Example with `Map` inside the batch:
>>> nerfstudio_collate([{'A': 0, 'B': 1}, {'A': 100, 'B': 100}])
{'A': tensor([  0, 100]), 'B': tensor([  1, 100])}
>>> # Example with `NamedTuple` inside the batch:
>>> Point = namedtuple('Point', ['x', 'y'])
>>> nerfstudio_collate([Point(0, 0), Point(1, 1)])
Point(x=tensor([0, 1]), y=tensor([0, 1]))
>>> # Example with `Tuple` inside the batch:
>>> nerfstudio_collate([(0, 1), (2, 3)])
[tensor([0, 2]), tensor([1, 3])]
>>> # Example with `List` inside the batch:
>>> nerfstudio_collate([[0, 1], [2, 3]])
[tensor([0, 2]), tensor([1, 3])]
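The type dispatch and the role of `extra_mappings` can be illustrated without PyTorch. The sketch below is a simplified analogue that collects numbers into plain lists instead of tensors. `toy_collate` and `Pose` are hypothetical names, and consulting the custom handlers last is an assumption based on the description above that custom types are handled at the end.

```python
from collections.abc import Mapping, Sequence


def toy_collate(batch, extra_mappings=None):
    """Simplified collate: lists stand in for tensors."""
    elem = batch[0]
    if isinstance(elem, (str, bytes)):
        return batch  # unchanged
    if isinstance(elem, (int, float)):
        return list(batch)  # stand-in for stacking into a tensor
    if isinstance(elem, Mapping):
        return {key: toy_collate([d[key] for d in batch], extra_mappings)
                for key in elem}
    if isinstance(elem, tuple) and hasattr(elem, "_fields"):  # namedtuple
        return type(elem)(*(toy_collate(list(samples), extra_mappings)
                            for samples in zip(*batch)))
    if isinstance(elem, Sequence):
        return [toy_collate(list(samples), extra_mappings)
                for samples in zip(*batch)]
    # Custom types are consulted last: type -> callable taking the whole batch.
    for type_key, handler in (extra_mappings or {}).items():
        if isinstance(elem, type_key):
            return handler(batch)
    raise TypeError(f"toy_collate: unsupported type {type(elem)}")


class Pose:
    """Hypothetical custom type that the built-in rules cannot handle."""

    def __init__(self, name):
        self.name = name


batch = [{"idx": 0, "pose": Pose("a")}, {"idx": 1, "pose": Pose("b")}]
out = toy_collate(
    batch, extra_mappings={Pose: lambda poses: [p.name for p in poses]})
```

The handler receives the list of same-typed elements and decides how to merge them, so nested containers holding a custom type still collate recursively.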