Dataset Ops documentation¶

Friendly dataset operations for your data science needs. Dataset Ops provides declarative loading, sampling, splitting and transformation operations for datasets, alongside export options for easy integration with TensorFlow and PyTorch.

Illustration of the Dataset Ops pipeline. Several built-in loaders make it possible to load datasets stored in various formats. A set of operators provides common pre-processing steps that can be applied to the data quickly. Finally, the processed data can be used as is or exported in a format suitable for ML frameworks.¶
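To make the load, pre-process and split stages of the pipeline concrete, the following is a minimal plain-Python sketch. It does not use the Dataset Ops API. The names `load_items`, `shuffle`, `split` and `transform` are hypothetical stand-ins chosen for illustration only.

```python
import random

# Hypothetical stand-in for a loader: returns (data, label) pairs.
def load_items():
    return [(float(i), i % 2) for i in range(10)]

# Return a shuffled copy; a fixed seed keeps the order reproducible.
def shuffle(items, seed):
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    return shuffled

# Split into consecutive parts sized by the given fractions.
# The last part takes whatever remains, so no item is lost to rounding.
def split(items, fractions):
    parts, start = [], 0
    for frac in fractions[:-1]:
        end = start + round(frac * len(items))
        parts.append(items[start:end])
        start = end
    parts.append(items[start:])
    return parts

# Apply a function to every item.
def transform(items, fn):
    return [fn(item) for item in items]

items = shuffle(load_items(), seed=42)
train, test = split(items, [0.8, 0.2])
# Scale the data element into [0, 1) and leave the label untouched.
train = transform(train, lambda item: (item[0] / 10.0, item[1]))
```

The sketch works on in-memory lists for brevity. A real pipeline would typically read items lazily from disk.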

First Steps¶

Are you looking for ways to install the framework, or are you looking for inspiration to get started?

Installing: Installing
Getting Started: Getting started

Loaders and Transforms¶

Get an overview of the available loaders and transforms that can be used with your dataset.

Loaders: Standard loaders
Transforms: General | Image | Time-series

It is also possible to implement your own loaders and transforms.

Custom Loaders and Transforms¶

Is your dataset structured in a way that is not compatible with any of the standard loaders? Or does your application require very specific and complex transformations to be applied to the data? The framework makes integration with custom loaders and transforms easy and clean. For how-to guides, see:

User-Defined: Loaders | Transforms
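As a rough idea of what a custom loader and a custom transform involve, here is a plain-Python sketch. It is an assumption about the general shape only and does not reflect the interfaces Dataset Ops actually expects, which are described in the guides linked above. `LinesLoader` and `lowercase_text` are hypothetical names.

```python
# Hypothetical loader for records of the form "<text>, <label>".
# It exposes __len__ and __getitem__ so items can be fetched by index.
class LinesLoader:
    def __init__(self, lines):
        # Skip blank lines so every index maps to a real record.
        self._lines = [line for line in lines if line.strip()]

    def __len__(self):
        return len(self._lines)

    def __getitem__(self, index):
        # Split on the last comma so commas inside the text are preserved.
        text, label = self._lines[index].rsplit(",", 1)
        return text.strip(), int(label)

# Hypothetical transform: takes one item and returns a new item.
def lowercase_text(item):
    text, label = item
    return text.lower(), label

loader = LinesLoader(["Hello World, 1", "", "GOODBYE, 0"])
samples = [lowercase_text(loader[i]) for i in range(len(loader))]
```

Keeping the loader responsible only for fetching items, and the transform a pure function of a single item, lets the two be developed and tested independently.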

Performance And Optimizations¶

Are you looking for ways to reduce the time required to load and process big datasets? The library provides several mechanisms that can drastically reduce the time required.

Increasing performance: Caching | Multiprocessing
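The idea behind caching can be sketched with the standard library alone. The example below is not the Dataset Ops caching mechanism. It only shows why storing processed items avoids repeated work, using a hypothetical `load_and_process` function and a counter to make the effect visible.

```python
from functools import lru_cache

load_count = 0  # counts how often the expensive function body actually runs

@lru_cache(maxsize=None)
def load_and_process(index):
    global load_count
    load_count += 1
    return index * index  # stand-in for expensive loading and processing

# The first pass computes every item, the second is served from the cache.
first_pass = [load_and_process(i) for i in range(5)]
second_pass = [load_and_process(i) for i in range(5)]
```

After both passes `load_count` is 5 rather than 10, because the second pass never re-runs the function body. Parallel processing is a separate technique and is not shown here.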

API Reference¶

Examples¶

Looking for more concrete examples of how datasets may be loaded and transformed? See the example section:

Examples: KITTY | domain adaptation

Developer And Contributor Guide¶

Are you looking to contribute to the project or are you already a developer? Contributions of any size and form are always welcome. Information on how the codebase is tested, how it is published, and how to add documentation can be found below:

Quality Assurance And CI: Testing | Git Workflow | CI
How To Contribute: Communication channels | Writing documentation