datasetops.loaders

Module defining loaders for several formats which are commonly used to exchange datasets. Additionally, the module provides adapters for the dataset types used by various ML frameworks.

Module Contents

class datasetops.loaders.Loader(getdata: Callable[[Any], Any], identifier: Optional[str] = None, name: str = None)

Bases: datasetops.dataset.Dataset

append(self, identifier: Data)
extend(self, ids: Union[List[Data], np.ndarray])
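
The signature suggests that a Loader wraps a getdata callable and collects identifiers through append and extend. The following is a minimal standalone sketch of that pattern, not the datasetops implementation; `SketchLoader` and its internals are hypothetical names used for illustration only, and lazy lookup on indexing is an assumption.

```python
from typing import Any, Callable, List, Optional


class SketchLoader:
    """Hypothetical stand-in illustrating the getdata/identifier pattern."""

    def __init__(self, getdata: Callable[[Any], Any], name: Optional[str] = None):
        self._getdata = getdata      # maps an identifier to the actual item
        self._ids: List[Any] = []    # identifiers registered so far
        self.name = name

    def append(self, identifier: Any) -> None:
        self._ids.append(identifier)

    def extend(self, ids: List[Any]) -> None:
        self._ids.extend(ids)

    def __len__(self) -> int:
        return len(self._ids)

    def __getitem__(self, index: int) -> Any:
        # Data is resolved lazily, only when an item is requested.
        return self._getdata(self._ids[index])


# Register three identifiers and resolve each one through getdata.
loader = SketchLoader(getdata=lambda i: i * i, name="squares")
loader.append(2)
loader.extend([3, 4])
items = [loader[i] for i in range(len(loader))]
```

In this sketch, only identifiers are stored, and the getdata callable is invoked each time an item is indexed.
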
datasetops.loaders.from_pytorch(pytorch_dataset, identifier: Optional[str] = None)

Create a dataset from a PyTorch dataset.

Arguments:

pytorch_dataset {torch.utils.data.Dataset} – A PyTorch dataset to load from

identifier {Optional[str]} – unique identifier

Returns:

[Dataset] – A datasetops.Dataset

datasetops.loaders.from_tensorflow(tf_dataset, identifier: Optional[str] = None)

Create a dataset from a TensorFlow dataset.

Arguments:

tf_dataset {tf.data.Dataset} – A TensorFlow dataset to load from

identifier {Optional[str]} – unique identifier

Raises:

AssertionError: Raised if TensorFlow is not executing eagerly

Returns:

[Dataset] – A datasetops.Dataset

datasetops.loaders.from_folder_data(path: AnyPath) → Dataset

Load data from a folder with the data structure:

folder
├ sample1.jpg
├ sample2.jpg

Arguments:

path {AnyPath} – path to folder

Returns:
Dataset – A dataset of data paths, e.g. (‘folder/sample1.jpg’)

datasetops.loaders.from_folder_class_data(path: AnyPath) → Dataset

Load data from a folder with the data structure:

data
├── class1
│   ├── sample1.jpg
│   └── sample2.jpg
└── class2
    └── sample3.jpg

Arguments:

path {AnyPath} – path to nested folder

Returns:
Dataset – A labelled dataset of data paths and corresponding class labels, e.g. (‘nested_folder/class1/sample1.jpg’, ‘class1’)
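
To make the mapping from folder layout to labelled items concrete, here is a minimal sketch using only the standard library. It is not the datasetops implementation: `list_class_samples` is a hypothetical helper, and the sorted ordering and root-relative paths are assumptions made for the illustration.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def list_class_samples(root: Path) -> List[Tuple[str, str]]:
    """Return (relative sample path, class label) pairs in sorted order."""
    pairs = []
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for sample in sorted(p for p in class_dir.iterdir() if p.is_file()):
            # The label is the name of the folder that contains the sample.
            pairs.append((sample.relative_to(root).as_posix(), class_dir.name))
    return pairs


# Build the example layout in a temporary directory and list it.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for rel in ["class1/sample1.jpg", "class1/sample2.jpg", "class2/sample3.jpg"]:
        (root / rel).parent.mkdir(parents=True, exist_ok=True)
        (root / rel).touch()
    pairs = list_class_samples(root)
```

Each pair combines a sample path with the name of its parent folder, which mirrors the (path, label) items described above.
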

datasetops.loaders.from_folder_group_data(path: AnyPath) → Dataset

Load data from a folder with the data structure:

data
├── group1
│   ├── sample1.jpg
│   └── sample2.jpg
└── group2
    ├── sample1.jpg
    └── sample2.jpg

Arguments:

path {AnyPath} – path to nested folder

Returns:
Dataset – A dataset of paths to the objects in each group, zipped together by corresponding names, e.g. (‘nested_folder/group1/sample1.jpg’, ‘nested_folder/group2/sample1.txt’)
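
The zipping of groups can be sketched with the standard library as follows. This is an illustration rather than the library's own code: `zip_groups_by_name` is a hypothetical helper, and matching files by their stem (the file name without its extension) is an assumption about what "corresponding names" means.

```python
import tempfile
from pathlib import Path
from typing import Dict, List, Tuple


def zip_groups_by_name(root: Path) -> List[Tuple[str, ...]]:
    """Pair up files that share a stem across the group folders of root."""
    group_dirs = sorted(p for p in root.iterdir() if p.is_dir())
    # One {stem: relative path} mapping per group folder.
    per_group: List[Dict[str, str]] = [
        {f.stem: f.relative_to(root).as_posix() for f in g.iterdir() if f.is_file()}
        for g in group_dirs
    ]
    if not per_group:
        return []
    # Keep only the names that occur in every group.
    common = set(per_group[0]).intersection(*per_group[1:])
    return [tuple(group[name] for group in per_group) for name in sorted(common)]


# Build a two-group layout in a temporary directory and zip it.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for rel in ["group1/sample1.jpg", "group1/sample2.jpg",
                "group2/sample1.txt", "group2/sample2.txt"]:
        (root / rel).parent.mkdir(parents=True, exist_ok=True)
        (root / rel).touch()
    zipped = zip_groups_by_name(root)
```

In this sketch, a name that is missing from any group is dropped; whether the library does the same or raises an error is not stated in this reference.
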

datasetops.loaders.from_folder_dataset_class_data(path: AnyPath) → List[Dataset]

Load data from a folder with the data structure:

data
├── dataset1
│   ├── class1
│   │   ├── sample1.jpg
│   │   └── sample2.jpg
│   └── class2
│       └── sample3.jpg
└── dataset2
    └── sample3.jpg

Arguments:

path {AnyPath} – path to nested folder

Returns:
List[Dataset] – A list of labelled datasets, each with data paths and corresponding class labels, e.g. (‘nested_folder/class1/sample1.jpg’, ‘class1’)

datasetops.loaders.from_folder_dataset_group_data(path: AnyPath) → List[Dataset]

Load data from a folder with the data structure:

nested_folder
|- dataset1
    |- group1
        |- sample1.jpg
        |- sample2.jpg
    |- group2
        |- sample1.txt
        |- sample2.txt
|- dataset2
    |- …

Arguments:

path {AnyPath} – path to nested folder

Returns:
List[Dataset] – A list of datasets, each with data composed from different types, e.g. (‘nested_folder/group1/sample1.jpg’, ‘nested_folder/group2/sample1.txt’)

datasetops.loaders._dataset_from_np_dict(data: Dict[str, np.ndarray], data_keys: List[str], label_key: str = None, name: str = None, identifier: str = None) → Dataset
datasetops.loaders.from_mat_single_mult_data(path: AnyPath) → List[Dataset]

Load data from a single .mat file that contains multiple datasets.

E.g. a .mat file with keys [‘X_src’, ‘Y_src’, ‘X_tgt’, ‘Y_tgt’]

Arguments:

path {AnyPath} – path to .mat file

Returns:
List[Dataset] – A list of datasets, where a dataset was created for each suffix, e.g. a dataset with data from the keys (‘X_src’, ‘Y_src’) and from (‘X_tgt’, ‘Y_tgt’)
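
The suffix-based grouping can be illustrated without reading a real file. The sketch below is not the library's implementation: `group_keys_by_suffix` is a hypothetical helper, and splitting on the last underscore is an assumption. Dictionaries returned by `scipy.io.loadmat` also carry metadata keys such as `__header__`, so the sketch skips keys that start with a double underscore.

```python
from collections import defaultdict
from typing import Dict, List


def group_keys_by_suffix(keys: List[str], sep: str = "_") -> Dict[str, List[str]]:
    """Group keys such as 'X_src' and 'Y_src' under their shared suffix 'src'."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for key in keys:
        if key.startswith("__") or sep not in key:
            # Skip metadata entries such as '__header__' and keys without a suffix.
            continue
        _, suffix = key.rsplit(sep, 1)
        groups[suffix].append(key)
    return dict(groups)


groups = group_keys_by_suffix(["__header__", "X_src", "Y_src", "X_tgt", "Y_tgt"])
```

Each resulting group would then correspond to one dataset in the returned list, with the keys of that group supplying its data and labels.
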