Datasets

Below is a list of the datasets supported by the framework. They all inherit from the PyTorch Geometric dataset class and can be accessed either as raw datasets or wrapped in a base class that builds the train, validation and test data loaders for you. This base class also provides helper functions for pre-computing neighbours and sampling point clouds at data loading time.

ShapeNet

Raw dataset

class torch_points3d.datasets.segmentation.ShapeNet(root, categories=None, include_normals=True, split='trainval', transform=None, pre_transform=None, pre_filter=None, is_test=False)

The ShapeNet part level segmentation dataset from the “A Scalable Active Framework for Region Annotation in 3D Shape Collections” paper, containing about 17,000 3D shape point clouds from 16 shape categories. Each category is annotated with 2 to 6 parts.

Parameters

root (string) – Root directory where the dataset should be saved.
categories (string or [string], optional) – The category of the CAD models (one or a combination of "Airplane", "Bag", "Cap", "Car", "Chair", "Earphone", "Guitar", "Knife", "Lamp", "Laptop", "Motorbike", "Mug", "Pistol", "Rocket", "Skateboard", "Table"). Can be explicitly set to None to load all categories. (default: None)
include_normals (bool, optional) – If set to False, will not include normal vectors as input features. (default: True)
split (string, optional) – If "train", loads the training dataset. If "val", loads the validation dataset. If "trainval", loads the training and validation dataset. If "test", loads the test dataset. (default: "trainval")
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
pre_filter (callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default: None)
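The three callable hooks differ mainly in when they run and what they return. The following is a minimal sketch of a pre_filter and a transform, not library code: it uses a SimpleNamespace with a pos attribute as a stand-in for torch_geometric.data.Data, and the names has_enough_points, center_on_origin and MIN_POINTS are hypothetical.

```python
from types import SimpleNamespace

MIN_POINTS = 3  # hypothetical threshold, chosen only for this illustration


def has_enough_points(data):
    """pre_filter-style callable: return True to keep the sample."""
    return len(data.pos) >= MIN_POINTS


def center_on_origin(data):
    """transform-style callable: shift points so their centroid is the origin."""
    n = len(data.pos)
    centroid = [sum(p[i] for p in data.pos) / n for i in range(3)]
    data.pos = [[p[i] - centroid[i] for i in range(3)] for p in data.pos]
    return data


# Stand-ins for Data objects (assumption: real objects hold tensors, not lists).
samples = [
    SimpleNamespace(pos=[[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [1.0, 3.0, 0.0]]),
    SimpleNamespace(pos=[[1.0, 1.0, 1.0]]),
]

kept = [s for s in samples if has_enough_points(s)]  # filtering happens once
centered = [center_on_origin(s) for s in kept]       # transform runs per access
```

Following the constructor signature above, such callables would be passed as pre_filter=has_enough_points and transform=center_on_origin.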

Wrapped dataset

class torch_points3d.datasets.segmentation.ShapeNetDataset(dataset_opt)

Wrapper around ShapeNet that creates train and test datasets.

Parameters

dataset_opt (omegaconf.DictConfig) –

Config dictionary that should contain

dataroot

category: List of categories or All

normal: bool, include normals or not

pre_transforms

train_transforms

test_transforms

val_transforms
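As a rough illustration of the keys listed above, the sketch below builds the configuration as a plain Python dict and checks that every listed key is present. All values are placeholders, the missing_keys helper is hypothetical, and the transform entries are left empty because their schema is not described here.

```python
# Keys listed in the documentation for ShapeNetDataset.
REQUIRED_KEYS = (
    "dataroot",
    "category",
    "normal",
    "pre_transforms",
    "train_transforms",
    "test_transforms",
    "val_transforms",
)

shapenet_opt = {
    "dataroot": "data",                 # placeholder path
    "category": ["Airplane", "Chair"],  # a list of categories, or "All"
    "normal": True,                     # include normals
    "pre_transforms": [],               # placeholder, schema not shown here
    "train_transforms": [],
    "test_transforms": [],
    "val_transforms": [],
}


def missing_keys(opt, required=REQUIRED_KEYS):
    """Return the documented keys that are absent from opt (hypothetical helper)."""
    return [key for key in required if key not in opt]


print(missing_keys(shapenet_opt))
```

Since dataset_opt is documented as an omegaconf.DictConfig, a dict like this would typically be converted with OmegaConf.create before being handed to the wrapper.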

S3DIS

Raw dataset

class torch_points3d.datasets.segmentation.S3DISOriginalFused(root, test_area=6, split='train', transform=None, pre_transform=None, pre_collate_transform=None, pre_filter=None, keep_instance=False, verbose=False, debug=False)

Original S3DIS dataset. Each area is loaded individually and can be processed using a pre_collate transform. This transform can be used for example to fuse the area into a single space and split it into spheres or smaller regions. If no fusion is applied, each element in the dataset is a single room by default.

http://buildingparser.stanford.edu/dataset.html

Parameters

root (str) – path to the directory where the data will be saved
test_area (int) – number between 1 and 6 that denotes the area used for testing
split (str) – can be one of train, trainval, val or test
pre_collate_transform – Transforms to be applied before the data is assembled into samples (apply fusing here for example)
keep_instance (bool) – set to True if you wish to keep instance data
pre_transform –
transform –
pre_filter –

class torch_points3d.datasets.segmentation.S3DISSphere(root, sample_per_epoch=100, radius=2, *args, **kwargs)

Small variation of S3DISOriginalFused that allows random sampling of spheres within an area during training and validation. Spheres have a radius of 2 m by default (see radius). If sample_per_epoch is set to -1, spheres are taken on a fixed grid instead (a 2 m grid with the default radius).

http://buildingparser.stanford.edu/dataset.html

Parameters

root (str) – path to the directory where the data will be saved
test_area (int) – number between 1 and 6 that denotes the area used for testing
train (bool) – Is this a train split or not
pre_collate_transform – Transforms to be applied before the data is assembled into samples (apply fusing here for example)
keep_instance (bool) – set to True if you wish to keep instance data
sample_per_epoch – Number of spheres that are randomly sampled at each epoch (-1 for fixed grid)
radius – radius of each sphere
pre_transform –
transform –
pre_filter –
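The two sampling modes controlled by sample_per_epoch can be pictured with the standard-library sketch below. It is not the library's implementation: the function names are hypothetical, points are plain tuples, random centres are drawn from the points themselves, and grid centres are placed at the middle of each occupied cell with a spacing equal to the radius. All of these are assumptions made for illustration.

```python
import math
import random


def points_in_sphere(points, center, radius):
    """Return the points whose Euclidean distance to center is at most radius."""
    return [p for p in points if math.dist(p, center) <= radius]


def grid_centers(points, step):
    """Return the centre of every grid cell (edge length step) that holds a point."""
    cells = {tuple(math.floor(c / step) for c in p) for p in points}
    return sorted(tuple((i + 0.5) * step for i in cell) for cell in cells)


def sample_spheres(points, radius=2, sample_per_epoch=100, rng=None):
    """Random spheres if sample_per_epoch > 0, otherwise one sphere per grid cell."""
    rng = rng or random.Random()
    if sample_per_epoch > 0:
        centers = [rng.choice(points) for _ in range(sample_per_epoch)]
    else:
        centers = grid_centers(points, step=radius)
    return [points_in_sphere(points, c, radius) for c in centers]


area = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
random_spheres = sample_spheres(area, sample_per_epoch=4, rng=random.Random(0))
grid_spheres = sample_spheres(area, sample_per_epoch=-1)
```

With random sampling the number of samples per epoch is fixed by sample_per_epoch, whereas in grid mode it depends on the extent of the area.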

Wrapped dataset

class torch_points3d.datasets.segmentation.S3DIS1x1Dataset(dataset_opt)

class torch_points3d.datasets.segmentation.S3DISFusedDataset(dataset_opt)

Wrapper around S3DISSphere that creates train and test datasets.

http://buildingparser.stanford.edu/dataset.html

Parameters

dataset_opt (omegaconf.DictConfig) –

Config dictionary that should contain

dataroot

fold: test_area parameter

pre_collate_transform

train_transforms

test_transforms

Scannet

Raw dataset

class torch_points3d.datasets.segmentation.Scannet(root, split='train', transform=None, pre_transform=None, pre_filter=None, version='v2', use_instance_labels=False, use_instance_bboxes=False, donotcare_class_ids=[], max_num_point=None, process_workers=4, types=['.txt', '_vh_clean_2.ply', '_vh_clean_2.0.010000.segs.json', '.aggregation.json'], normalize_rgb=True, is_test=False)

Scannet dataset. Before the dataset is downloaded, you will have to agree to the terms and conditions by pressing Enter.

http://www.scan-net.org/

Parameters

root (str) – Path to the data
split (str, optional) – Split used (train, val or test)
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access.
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk.
pre_filter (callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset.
version (str, optional) – version of scannet, by default “v2”
use_instance_labels (bool, optional) – Whether we use instance labels or not, by default False
use_instance_bboxes (bool, optional) – Whether we use bounding box labels or not, by default False
donotcare_class_ids (list, optional) – Class ids to be discarded
max_num_point (int, optional) – Max number of points to keep during the pre-processing step, by default None
use_multiprocessing (bool, optional) – Whether we use multiprocessing or not
process_workers (int, optional) – Number of process workers
normalize_rgb (bool, optional) – Normalise rgb values, by default True
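To make max_num_point and normalize_rgb more concrete, here is a small standard-library sketch of what such pre-processing steps could look like. It is an assumption-laden illustration, not the library's code: the helper names are hypothetical, the subsampling is assumed to be uniform without replacement, and colour normalisation is assumed to map 8-bit values to the range [0, 1].

```python
import random


def subsample(points, max_num_point=None, rng=None):
    """Keep at most max_num_point points, drawn uniformly without replacement."""
    if max_num_point is None or len(points) <= max_num_point:
        return list(points)
    rng = rng or random.Random()
    # Sorting the kept indices preserves the original point order.
    keep = sorted(rng.sample(range(len(points)), max_num_point))
    return [points[i] for i in keep]


def normalize_rgb(colors):
    """Map 8-bit RGB triplets to floats in [0, 1] (assumed convention)."""
    return [tuple(channel / 255.0 for channel in rgb) for rgb in colors]


scan = list(range(10))  # stand-in for ten point indices
reduced = subsample(scan, max_num_point=4, rng=random.Random(0))
colors = normalize_rgb([(255, 0, 51)])
```

Leaving max_num_point at None keeps every point, which matches the default in the signature above.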

Wrapped dataset

class torch_points3d.datasets.segmentation.ScannetDataset(dataset_opt)

Wrapper around Scannet that creates train and test datasets.

Parameters

dataset_opt (omegaconf.DictConfig) –

Config dictionary that should contain

dataroot

version

max_num_point (optional)

use_instance_labels (optional)

use_instance_bboxes (optional)

donotcare_class_ids (optional)

pre_transforms (optional)

train_transforms (optional)

val_transforms (optional)
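Because most of these keys are optional, one way to think about the configuration is as a set of overrides on top of defaults. The sketch below takes the fallback values from the Scannet constructor signature shown earlier; the resolve_scannet_opt helper is hypothetical, the transform keys are omitted because their schema is not described here, and treating dataroot and version as mandatory is an assumption based on the list above.

```python
import copy

# Fallbacks copied from the Scannet constructor signature.
SCANNET_DEFAULTS = {
    "max_num_point": None,
    "use_instance_labels": False,
    "use_instance_bboxes": False,
    "donotcare_class_ids": [],
}

REQUIRED_KEYS = ("dataroot", "version")  # listed without "(optional)"


def resolve_scannet_opt(opt):
    """Fill optional keys with defaults and reject configs missing required keys."""
    missing = [key for key in REQUIRED_KEYS if key not in opt]
    if missing:
        raise KeyError(f"missing required keys: {missing}")
    resolved = copy.deepcopy(SCANNET_DEFAULTS)  # avoid sharing the default list
    resolved.update(opt)
    return resolved


resolved = resolve_scannet_opt(
    {"dataroot": "data", "version": "v2", "max_num_point": 8192}  # placeholders
)
print(resolved)
```

The deep copy matters here: donotcare_class_ids defaults to a list, and sharing one list object across configurations would let a change in one leak into the others.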