Datasets

Below is a list of the datasets supported by the framework. They all inherit from the PyTorch Geometric dataset class and can be accessed either as raw datasets or wrapped in a base class that builds the train, validation and test data loaders for you. This base class also provides helper functions for pre-computing neighbours and sampling point clouds at data loading time.

ShapeNet

Raw dataset

class torch_points3d.datasets.segmentation.ShapeNet(root, categories=None, include_normals=True, split='trainval', transform=None, pre_transform=None, pre_filter=None, is_test=False)

The ShapeNet part level segmentation dataset from the “A Scalable Active Framework for Region Annotation in 3D Shape Collections” paper, containing about 17,000 3D shape point clouds from 16 shape categories. Each category is annotated with 2 to 6 parts.

Parameters
  • root (string) – Root directory where the dataset should be saved.

  • categories (string or [string], optional) – The category of the CAD models (one or a combination of "Airplane", "Bag", "Cap", "Car", "Chair", "Earphone", "Guitar", "Knife", "Lamp", "Laptop", "Motorbike", "Mug", "Pistol", "Rocket", "Skateboard", "Table"). Can be explicitly set to None to load all categories. (default: None)

  • include_normals (bool, optional) – If set to False, will not include normal vectors as input features. (default: True)

  • split (string, optional) – If "train", loads the training dataset. If "val", loads the validation dataset. If "trainval", loads the training and validation dataset. If "test", loads the test dataset. (default: "trainval")

  • transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)

  • pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)

  • pre_filter (callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default: None)
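The categories parameter accepts None, a single string, or a list of strings. The sketch below shows one way to normalise and validate such a value against the 16 names listed above before passing it on. The helper resolve_categories is hypothetical and is not part of torch_points3d; the library may validate its input differently.

```python
# The 16 category names listed in the `categories` parameter description.
SHAPENET_CATEGORIES = [
    "Airplane", "Bag", "Cap", "Car", "Chair", "Earphone", "Guitar", "Knife",
    "Lamp", "Laptop", "Motorbike", "Mug", "Pistol", "Rocket", "Skateboard", "Table",
]


def resolve_categories(categories=None):
    """Hypothetical helper: turn None, a string, or a list into a validated list."""
    if categories is None:
        # None means "load all categories", as documented above.
        return list(SHAPENET_CATEGORIES)
    if isinstance(categories, str):
        categories = [categories]
    unknown = [c for c in categories if c not in SHAPENET_CATEGORIES]
    if unknown:
        raise ValueError(f"Unknown ShapeNet categories: {unknown}")
    return list(categories)


selected = resolve_categories("Chair")   # a single category as a string
everything = resolve_categories(None)    # all 16 categories
```

The resulting list could then be passed as the categories argument of ShapeNet, assuming the dataset has been downloaded to root.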

Wrapped dataset

class torch_points3d.datasets.segmentation.ShapeNetDataset(dataset_opt)

Wrapper around ShapeNet that creates train and test datasets.

Parameters

dataset_opt (omegaconf.DictConfig) –

Config dictionary that should contain

  • dataroot

  • category: List of categories or All

  • normal: bool, include normals or not

  • pre_transforms

  • train_transforms

  • test_transforms

  • val_transforms
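As a rough illustration, the keys listed above can be collected in a plain dictionary and checked for completeness before being turned into a config object. All values below are placeholders chosen for the example: the directory name, the selected categories and the empty transform lists are assumptions, and whether the wrapper accepts empty transform lists is not confirmed here.

```python
# Plain-dict sketch of the config keys listed above; every value is a placeholder.
shapenet_opt = {
    "dataroot": "data",                 # assumed local directory
    "category": ["Airplane", "Chair"],  # a list of categories, or "All"
    "normal": True,                     # include normals or not
    "pre_transforms": [],               # left empty in this sketch
    "train_transforms": [],
    "test_transforms": [],
    "val_transforms": [],
}

# Keys the documentation says the config should contain.
REQUIRED_KEYS = {
    "dataroot", "category", "normal", "pre_transforms",
    "train_transforms", "test_transforms", "val_transforms",
}

# Report any key that was forgotten (empty list when the config is complete).
missing = sorted(REQUIRED_KEYS - shapenet_opt.keys())
```

With omegaconf installed, OmegaConf.create(shapenet_opt) converts such a dictionary into a DictConfig, which is the type dataset_opt is documented to have.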

S3DIS

Raw dataset

class torch_points3d.datasets.segmentation.S3DISOriginalFused(root, test_area=6, split='train', transform=None, pre_transform=None, pre_collate_transform=None, pre_filter=None, keep_instance=False, verbose=False, debug=False)

Original S3DIS dataset. Each area is loaded individually and can be processed using a pre_collate transform. This transform can be used for example to fuse the area into a single space and split it into spheres or smaller regions. If no fusion is applied, each element in the dataset is a single room by default.

http://buildingparser.stanford.edu/dataset.html

Parameters
  • root (str) – path to the directory where the data will be saved

  • test_area (int) – number between 1 and 6 that denotes the area used for testing

  • split (str) – can be one of train, trainval, val or test

  • pre_collate_transform – Transforms to be applied before the data is assembled into samples (apply fusing here for example)

  • keep_instance (bool) – set to True if you wish to keep instance data

  • pre_transform

  • transform

  • pre_filter
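The test_area parameter holds one of the six areas out for testing. A minimal sketch of that hold-out logic is shown below. The function split_areas is hypothetical, and how the library further divides the remaining areas between the train and val splits is not described here, so the sketch makes no attempt to reproduce it.

```python
ALL_AREAS = (1, 2, 3, 4, 5, 6)  # S3DIS areas, numbered as in the test_area parameter


def split_areas(test_area):
    """Hypothetical helper: return (remaining areas, held-out test areas)."""
    if test_area not in ALL_AREAS:
        raise ValueError("test_area must be a number between 1 and 6")
    remaining = [area for area in ALL_AREAS if area != test_area]
    return remaining, [test_area]


# With the default test_area=6, areas 1 to 5 remain for training and validation.
remaining_areas, test_areas = split_areas(6)
```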

class torch_points3d.datasets.segmentation.S3DISSphere(root, sample_per_epoch=100, radius=2, *args, **kwargs)

Small variation of S3DISOriginalFused that allows random sampling of spheres within an area during training and validation. The sphere radius is set by radius (2m by default). If sample_per_epoch is set to -1, spheres are instead taken on a fixed 2m grid.

http://buildingparser.stanford.edu/dataset.html

Parameters
  • root (str) – path to the directory where the data will be saved

  • test_area (int) – number between 1 and 6 that denotes the area used for testing

  • train (bool) – Is this a train split or not

  • pre_collate_transform – Transforms to be applied before the data is assembled into samples (apply fusing here for example)

  • keep_instance (bool) – set to True if you wish to keep instance data

  • sample_per_epoch – Number of spheres that are randomly sampled at each epoch (-1 for fixed grid)

  • radius – radius of each sphere

  • pre_transform

  • transform

  • pre_filter
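To make the roles of sample_per_epoch and radius concrete, here is a small pure-Python sketch of random sphere sampling. It is an illustration under stated assumptions, not the library's implementation: centres are drawn uniformly from the points themselves, neighbours are found by brute force, and the names points_in_sphere and sample_spheres are hypothetical.

```python
import math
import random


def points_in_sphere(points, center, radius):
    """Return the points lying within `radius` of `center` (brute-force search)."""
    return [p for p in points if math.dist(p, center) <= radius]


def sample_spheres(points, sample_per_epoch, radius, seed=0):
    """Draw `sample_per_epoch` random centres and return one sphere per centre."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    centers = [rng.choice(points) for _ in range(sample_per_epoch)]
    return [points_in_sphere(points, c, radius) for c in centers]


# Toy "area": a flat 5 x 5 grid of points spaced 1m apart.
points = [(float(x), float(y), 0.0) for x in range(5) for y in range(5)]

# Three spheres of radius 2m, mirroring the default radius=2.
spheres = sample_spheres(points, sample_per_epoch=3, radius=2.0)

# Sphere around the grid corner, for a deterministic check.
corner_sphere = points_in_sphere(points, (0.0, 0.0, 0.0), 2.0)
```

On real scans a spatial index would normally replace the brute-force distance test, since each area contains far more points than this toy grid.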

Wrapped dataset

class torch_points3d.datasets.segmentation.S3DIS1x1Dataset(dataset_opt)
class torch_points3d.datasets.segmentation.S3DISFusedDataset(dataset_opt)

Wrapper around S3DISSphere that creates train and test datasets.

http://buildingparser.stanford.edu/dataset.html

Parameters

dataset_opt (omegaconf.DictConfig) –

Config dictionary that should contain

  • dataroot

  • fold: test_area parameter

  • pre_collate_transform

  • train_transforms

  • test_transforms

Scannet

Raw dataset

class torch_points3d.datasets.segmentation.Scannet(root, split='train', transform=None, pre_transform=None, pre_filter=None, version='v2', use_instance_labels=False, use_instance_bboxes=False, donotcare_class_ids=[], max_num_point=None, process_workers=4, types=['.txt', '_vh_clean_2.ply', '_vh_clean_2.0.010000.segs.json', '.aggregation.json'], normalize_rgb=True, is_test=False)

Scannet dataset. Downloading it requires agreeing to the terms and conditions, which you confirm by pressing Enter.

http://www.scan-net.org/

Parameters
  • root (str) – Path to the data

  • split (str, optional) – Split used (train, val or test)

  • transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access.

  • pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk.

  • pre_filter (callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset.

  • version (str, optional) – version of scannet, by default “v2”

  • use_instance_labels (bool, optional) – Whether to use instance labels, by default False

  • use_instance_bboxes (bool, optional) – Whether to use bounding box labels, by default False

  • donotcare_class_ids (list, optional) – Class ids to be discarded

  • max_num_point (int, optional) – Maximum number of points to keep during the pre-processing step, by default None

  • use_multiprocessing (bool, optional) – Whether to use multiprocessing

  • process_workers (int, optional) – Number of process workers

  • normalize_rgb (bool, optional) – Normalise rgb values, by default True

Wrapped dataset

class torch_points3d.datasets.segmentation.ScannetDataset(dataset_opt)

Wrapper around Scannet that creates train and test datasets.

Parameters

dataset_opt (omegaconf.DictConfig) –

Config dictionary that should contain

  • dataroot

  • version

  • max_num_point (optional)

  • use_instance_labels (optional)

  • use_instance_bboxes (optional)

  • donotcare_class_ids (optional)

  • pre_transforms (optional)

  • train_transforms (optional)

  • val_transforms (optional)
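Since most of the keys above are optional, a config can be as small as dataroot and version. The sketch below fills in the remaining keys from the defaults shown in the Scannet signature. The helper with_defaults is hypothetical, the directory name is a placeholder, and whether ScannetDataset applies the same defaults internally is an assumption rather than something stated here.

```python
import copy

# Minimal config: only the two keys not marked as optional.
scannet_opt = {"dataroot": "data", "version": "v2"}  # "data" is a placeholder path

# Defaults copied from the Scannet signature documented above.
OPTIONAL_DEFAULTS = {
    "max_num_point": None,
    "use_instance_labels": False,
    "use_instance_bboxes": False,
    "donotcare_class_ids": [],
}


def with_defaults(opt):
    """Hypothetical helper: return a new dict with the optional keys filled in."""
    merged = copy.deepcopy(OPTIONAL_DEFAULTS)  # deep copy so the list is not shared
    merged.update(opt)                         # user-supplied values take priority
    return merged


full_opt = with_defaults(scannet_opt)
custom_opt = with_defaults({**scannet_opt, "use_instance_labels": True})
```

The transform keys (pre_transforms, train_transforms, val_transforms) are left out of the sketch because their expected structure is not described in this section.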