datasetops.loaders
Module defining loaders for several formats which are commonly used to exchange datasets. Additionally, the module provides adapters for the dataset types used by various ML frameworks.
Module Contents
- class datasetops.loaders.Loader(getdata: Callable[[Any], Any], identifier: Optional[str] = None, name: str = None)
  Bases: datasetops.dataset.Dataset
  - append(self, identifier: Data)
  - extend(self, ids: Union[List[Data], np.ndarray])
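The signature suggests a dataset built from a list of identifiers plus a `getdata` callable that turns one identifier into one item. The sketch below illustrates that pattern only. `MiniLoader` is a hypothetical stand-in, not the datasetops class, and it assumes that `append` and `extend` merely register identifiers while `getdata` is called lazily on access.

```python
from typing import Any, Callable, List, Sequence


class MiniLoader:
    """Hypothetical stand-in for the pattern suggested by Loader's signature."""

    def __init__(self, getdata: Callable[[Any], Any]) -> None:
        self._getdata = getdata
        self._ids: List[Any] = []

    def append(self, identifier: Any) -> None:
        # Register a single identifier; no data is loaded here.
        self._ids.append(identifier)

    def extend(self, ids: Sequence[Any]) -> None:
        # Register several identifiers at once.
        self._ids.extend(list(ids))

    def __len__(self) -> int:
        return len(self._ids)

    def __getitem__(self, index: int) -> Any:
        # Assumption: the item is resolved from its identifier on access.
        return self._getdata(self._ids[index])


# Toy getdata that upper-cases the identifier instead of reading a file.
loader = MiniLoader(getdata=lambda path: path.upper())
loader.append("a.jpg")
loader.extend(["b.jpg", "c.jpg"])
```

Keeping the identifiers separate from the loading function means the list of samples can be built cheaply, with the actual reads deferred until an item is requested.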
- datasetops.loaders.from_pytorch(pytorch_dataset, identifier: Optional[str] = None)
  Create a dataset from a PyTorch dataset.
  - Arguments:
    pytorch_dataset {torch.utils.data.Dataset} – A PyTorch dataset to load from
    identifier {Optional[str]} – unique identifier
  - Returns:
    [Dataset] – A datasetops.Dataset
- datasetops.loaders.from_tensorflow(tf_dataset, identifier: Optional[str] = None)
  Create a dataset from a TensorFlow dataset.
  - Arguments:
    tf_dataset {tf.data.Dataset} – A TensorFlow dataset to load from
    identifier {Optional[str]} – unique identifier
  - Raises:
    AssertionError – if TensorFlow is not executing eagerly
  - Returns:
    [Dataset] – A datasetops.Dataset
- datasetops.loaders.from_folder_data(path: AnyPath) → Dataset
  Load data from a folder with the data structure:
    folder
    ├ sample1.jpg
    ├ sample2.jpg
  - Arguments:
    path {AnyPath} – path to folder
  - Returns:
    Dataset – A dataset of data paths, e.g. (‘nested_folder/class1/sample1.jpg’)
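To make the flat layout concrete, here is a standard-library sketch of the mapping from folder to items. It is not the datasetops implementation: `list_folder_samples` is a hypothetical helper, and returning one single-element tuple per file is an assumption based on the example in the description.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def list_folder_samples(path: Path) -> List[Tuple[str]]:
    """Return one (path,) tuple per file directly inside `path`, sorted by name."""
    return [(str(p),) for p in sorted(path.iterdir()) if p.is_file()]


# Build the layout from the description in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("sample1.jpg", "sample2.jpg"):
        (root / name).touch()
    # Keep only the file names so the result does not depend on the temp path.
    flat_samples = [(Path(item[0]).name,) for item in list_folder_samples(root)]
```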
- datasetops.loaders.from_folder_class_data(path: AnyPath) → Dataset
  Load data from a folder with the data structure:
    data
    ├── class1
    │   ├── sample1.jpg
    │   └── sample2.jpg
    └── class2
        └── sample3.jpg
  - Arguments:
    path {AnyPath} – path to nested folder
  - Returns:
    Dataset – A labelled dataset of data paths and corresponding class labels, e.g. (‘nested_folder/class1/sample1.jpg’, ‘class1’)
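The class layout maps each subfolder name to a label. The sketch below reproduces that mapping with the standard library only. `list_class_samples` is a hypothetical helper rather than the library's code, and the sorted ordering and relative paths are assumptions made so the result is easy to inspect.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def list_class_samples(path: Path) -> List[Tuple[str, str]]:
    """Return (relative sample path, class name) pairs, one per file."""
    pairs: List[Tuple[str, str]] = []
    for class_dir in sorted(p for p in path.iterdir() if p.is_dir()):
        for sample in sorted(p for p in class_dir.iterdir() if p.is_file()):
            # The enclosing folder name serves as the class label.
            pairs.append((sample.relative_to(path).as_posix(), class_dir.name))
    return pairs


# Recreate the layout from the description in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    layout = {"class1": ["sample1.jpg", "sample2.jpg"], "class2": ["sample3.jpg"]}
    for class_name, files in layout.items():
        (root / class_name).mkdir()
        for name in files:
            (root / class_name / name).touch()
    class_samples = list_class_samples(root)
```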
- datasetops.loaders.from_folder_group_data(path: AnyPath) → Dataset
  Load data from a folder with the data structure:
    data
    ├── group1
    │   ├── sample1.jpg
    │   └── sample2.jpg
    └── group2
        ├── sample1.jpg
        └── sample2.jpg
  - Arguments:
    path {AnyPath} – path to nested folder
  - Returns:
    Dataset – A dataset of paths to the objects of each group, zipped together with corresponding names, e.g. (‘nested_folder/group1/sample1.jpg’, ‘nested_folder/group2/sample1.txt’)
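The group layout pairs up files that share a name across sibling folders. The following standard-library sketch shows one way such zipping could work. `list_group_samples` is hypothetical, and matching on the file stem (the name without its extension) is an assumption inferred from the `sample1.jpg` / `sample1.txt` example, not confirmed library behaviour.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def list_group_samples(path: Path) -> List[Tuple[str, ...]]:
    """Zip files that share a stem across all group folders under `path`."""
    groups = sorted(p for p in path.iterdir() if p.is_dir())
    # One {stem: file} mapping per group, in sorted group order.
    by_stem = [{f.stem: f for f in g.iterdir() if f.is_file()} for g in groups]
    if not by_stem:
        return []
    # Assumption: only stems present in every group form an item.
    shared = set(by_stem[0]).intersection(*by_stem[1:])
    return [
        tuple(mapping[stem].relative_to(path).as_posix() for mapping in by_stem)
        for stem in sorted(shared)
    ]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    layout = {
        "group1": ["sample1.jpg", "sample2.jpg"],
        "group2": ["sample1.txt", "sample2.txt"],
    }
    for group_name, files in layout.items():
        (root / group_name).mkdir()
        for name in files:
            (root / group_name / name).touch()
    group_samples = list_group_samples(root)
```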
- datasetops.loaders.from_folder_dataset_class_data(path: AnyPath) → List[Dataset]
  Load data from a folder with the data structure:
    data
    ├── dataset1
    │   ├── class1
    │   │   ├── sample1.jpg
    │   │   └── sample2.jpg
    │   └── class2
    │       └── sample3.jpg
    └── dataset2
        └── sample3.jpg
  - Arguments:
    path {AnyPath} – path to nested folder
  - Returns:
    List[Dataset] – A list of labelled datasets, each with data paths and corresponding class labels, e.g. (‘nested_folder/class1/sample1.jpg’, ‘class1’)
- datasetops.loaders.from_folder_dataset_group_data(path: AnyPath) → List[Dataset]
  Load data from a folder with the data structure:
  - Arguments:
    path {AnyPath} – path to nested folder
  - Returns:
    List[Dataset] – A list of datasets, each with data composed from different types, e.g. (‘nested_folder/group1/sample1.jpg’, ‘nested_folder/group2/sample1.txt’)
- datasetops.loaders._dataset_from_np_dict(data: Dict[str, np.ndarray], data_keys: List[str], label_key: str = None, name: str = None, identifier: str = None) → Dataset
- datasetops.loaders.from_mat_single_mult_data(path: AnyPath) → List[Dataset]
  Load data from a single .mat file that contains several datasets, e.g. a .mat file with keys [‘X_src’, ‘Y_src’, ‘X_tgt’, ‘Y_tgt’].
  - Arguments:
    path {AnyPath} – path to .mat file
  - Returns:
    List[Dataset] – A list of datasets, one per key suffix, e.g. one dataset built from the keys (‘X_src’, ‘Y_src’) and another from (‘X_tgt’, ‘Y_tgt’)
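The grouping "by suffix" can be illustrated without reading a real file. The sketch below assumes the .mat contents arrive as a dict of named arrays, as `scipy.io.loadmat` returns, including its metadata keys such as `__header__`. `group_keys_by_suffix` is a hypothetical helper, and splitting on the last underscore is an assumption about how the suffix is determined.

```python
from collections import defaultdict
from typing import Dict, List


def group_keys_by_suffix(keys: List[str], sep: str = "_") -> Dict[str, List[str]]:
    """Group keys such as 'X_src' and 'Y_src' under their shared suffix 'src'."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for key in keys:
        # Skip metadata entries like '__header__' that scipy.io.loadmat adds.
        if key.startswith("__"):
            continue
        _prefix, found, suffix = key.rpartition(sep)
        if found:  # ignore keys that contain no separator at all
            groups[suffix].append(key)
    return dict(groups)


grouped = group_keys_by_suffix(["__header__", "X_src", "Y_src", "X_tgt", "Y_tgt"])
```

Each resulting group would then correspond to one dataset in the returned list, with the `X_` key holding the data and the `Y_` key the labels in this example.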