
What is a DataParser?

The dataparser returns DataparserOutputs, which put all the various datasets into a common format. DataparserOutputs should be lightweight, containing filenames or other metadata that actual PyTorch Datasets and Dataloaders process later. The common format makes it easy to add another DataParser: all you have to do is implement the private method _generate_dataparser_outputs shown below.

from abc import abstractmethod
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional

# Cameras, SceneBox, Semantics and TensorType come from the codebase and its dependencies.


@dataclass
class DataparserOutputs:
    """Dataparser outputs, which will be used by the DataManager
    for creating RayBundle and RayGT objects."""

    image_filenames: List[Path]
    """Filenames for the images."""
    cameras: Cameras
    """Camera object storing collection of camera information in dataset."""
    alpha_color: Optional[TensorType[3]] = None
    """Color of dataset background."""
    scene_box: SceneBox = SceneBox()
    """Scene box of dataset. Used to bound the scene or provide the scene scale depending on model."""
    semantics: Optional[Semantics] = None
    """Semantics information."""


class DataParser:

    @abstractmethod
    def _generate_dataparser_outputs(self, split: str = "train") -> DataparserOutputs:
        """Abstract method that returns the dataparser outputs for the given split.

        Args:
            split: Which dataset split to generate (train/test).

        Returns:
            DataparserOutputs containing data for the specified dataset and split
        """


Here is an example where we implement a DataParser for our Nerfstudio data format.

@dataclass
class NerfstudioDataParserConfig(DataParserConfig):
    """Nerfstudio dataset config"""

    _target: Type = field(default_factory=lambda: Nerfstudio)
    """target class to instantiate"""
    data: Path = Path("data/nerfstudio/poster")
    """Directory specifying location of data."""
    scale_factor: float = 1.0
    """How much to scale the camera origins by."""
    downscale_factor: Optional[int] = None
    """How much to downscale images. If not set, images are chosen such that the max dimension is <1600px."""
    scene_scale: float = 1.0
    """How much to scale the region of interest by."""
    orientation_method: Literal["pca", "up"] = "up"
    """The method to use for orientation."""
    train_split_percentage: float = 0.9
    """The percent of images to use for training. The remaining images are for eval."""

class Nerfstudio(DataParser):
    """Nerfstudio DatasetParser"""

    config: NerfstudioDataParserConfig

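    def _generate_dataparser_outputs(self, split="train"):
        # Read the camera and frame metadata for the scene.
        meta = load_from_json(self.config.data / "transforms.json")
        image_filenames = []
        poses = []
        # ... fill image_filenames and poses from meta, then build cameras (elided) ...
        dataparser_outputs = DataparserOutputs(
            image_filenames=image_filenames,
            cameras=cameras,
        )
        return dataparser_outputs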
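
The same pattern applies to any on-disk layout. The following is a minimal, self-contained sketch rather than code from the codebase: ToyDataparserOutputs and ToyFolderParser are hypothetical stand-ins, and the split rule (first N images for training, the rest for eval) is a simplifying assumption. It illustrates that the outputs carry only filenames and that the split argument selects which subset is returned.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass
class ToyDataparserOutputs:
    """Stand-in for DataparserOutputs: only lightweight metadata, no pixels."""

    image_filenames: List[Path]


@dataclass
class ToyFolderParser:
    """Hypothetical parser for a flat list of image paths."""

    filenames: List[Path]
    train_split_percentage: float = 0.9

    def _generate_dataparser_outputs(self, split: str = "train") -> ToyDataparserOutputs:
        ordered = sorted(self.filenames)
        # Assumption: the first N images go to training, the remainder to eval.
        num_train = int(len(ordered) * self.train_split_percentage)
        if split == "train":
            chosen = ordered[:num_train]
        elif split in ("val", "test"):
            chosen = ordered[num_train:]
        else:
            raise ValueError(f"Unknown dataparser split {split!r}")
        return ToyDataparserOutputs(image_filenames=chosen)


parser = ToyFolderParser([Path(f"images/frame_{i:03d}.png") for i in range(10)])
train_outputs = parser._generate_dataparser_outputs("train")
eval_outputs = parser._generate_dataparser_outputs("test")
print(len(train_outputs.image_filenames), len(eval_outputs.image_filenames))
```

No image is opened at this stage; loading pixels is left to the dataset that consumes the outputs.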

Train and Eval Logic

The DataParser generates train or eval DataparserOutputs depending on the split argument. For example, here is how you'd initialize some InputDataset classes that live in the DataManager. Because DataparserOutputs keep a common form, our Datasets should be plug-and-play. These datasets load the images needed to supervise the model with RayGT objects.

config = NerfstudioDataParserConfig()
dataparser = config.setup()
# train dataparser
dataparser_outputs = dataparser.get_dataparser_outputs(split="train")
input_dataset = InputDataset(dataparser_outputs)
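
To make the config.setup() step less opaque, here is a rough sketch of how a config with a _target field could instantiate its parser, and how a public get_dataparser_outputs method could delegate to the private one. ToyParserConfig and ToyParser are hypothetical names, and the bodies are assumptions about the pattern, not the actual library code.

```python
from dataclasses import dataclass, field
from typing import List, Type


@dataclass
class ToyParserConfig:
    """Hypothetical config: holds settings and the class they configure."""

    _target: Type = field(default_factory=lambda: ToyParser)
    num_images: int = 4

    def setup(self) -> "ToyParser":
        # Instantiate the target class, handing it this config.
        return self._target(self)


class ToyParser:
    def __init__(self, config: ToyParserConfig) -> None:
        self.config = config

    def get_dataparser_outputs(self, split: str = "train") -> List[str]:
        # Public entry point; the format-specific work lives in the private method.
        return self._generate_dataparser_outputs(split)

    def _generate_dataparser_outputs(self, split: str = "train") -> List[str]:
        # Stand-in output: a list of filenames instead of a full outputs object.
        return [f"{split}_{i:02d}.png" for i in range(self.config.num_images)]


config = ToyParserConfig(num_images=2)
dataparser = config.setup()
print(dataparser.get_dataparser_outputs(split="train"))
```

Keeping the target class inside the config means that swapping data formats only requires swapping the config object.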

You can also pull out information from the DataparserOutputs for other DataManager components, such as the RayGenerator. The RayGenerator generates RayBundle objects from camera and pixel indices.

ray_generator = RayGenerator(dataparser_outputs.cameras)
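
As a rough illustration of what turning a camera index and a pixel index into a ray involves, the sketch below computes a single pinhole ray in plain Python. It is not the RayGenerator implementation: pixel_to_ray is a hypothetical helper, and the conventions used (camera looking down its -z axis with +y up, rays through pixel centers, a 3x4 camera-to-world matrix) are assumptions made for the example.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, ...]


def pixel_to_ray(
    x: int,
    y: int,
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    camera_to_world: Sequence[Sequence[float]],
) -> Tuple[Vec3, Vec3]:
    """Return (origin, unit direction) in world space for pixel (x, y).

    camera_to_world is a 3x4 matrix [R | t]. Assumed convention: the camera
    looks down its -z axis with +y up.
    """
    # Direction through the pixel center in camera coordinates.
    d_cam = ((x + 0.5 - cx) / fx, -(y + 0.5 - cy) / fy, -1.0)
    # Rotate into world space with the 3x3 rotation part.
    d_world = [sum(camera_to_world[r][c] * d_cam[c] for c in range(3)) for r in range(3)]
    norm = math.sqrt(sum(v * v for v in d_world))
    direction = tuple(v / norm for v in d_world)
    # The ray starts at the camera position (the translation column).
    origin = tuple(camera_to_world[r][3] for r in range(3))
    return origin, direction


# One camera at z = 2 with an identity rotation.
cameras = [[[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 2.0]]]
camera_index, pixel = 0, (50, 50)
origin, direction = pixel_to_ray(
    pixel[0], pixel[1], fx=100.0, fy=100.0, cx=50.5, cy=50.5,
    camera_to_world=cameras[camera_index],
)
print(origin, direction)
```

A real generator would do this for batches of indices at once; the single-ray version only shows which camera quantities are needed.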

Our Implementations

Below we enumerate the dataparsers that we have implemented in our codebase. Feel free to use ours or add your own. Contributions are welcome and appreciated!


Nerfstudio

This is our custom dataparser. We have a script that uses COLMAP to convert images or videos to this format.

See the code!


Blender

We support the synthetic Blender dataset from the original NeRF paper.

See the code!

Instant NGP

This supports the Instant NGP dataset.

See the code!


Record3D

This dataparser can use data recorded with the Record3D app on an iPhone 12 Pro or newer. Record a video and export it with the EXR + JPG sequence format. Unzip the export and the rgb folder before training.

For more information on capturing with Record3D, see the Custom Dataset Docs.

See the code!