load_dataset package
Submodules
load_dataset.load_dataset module
- load_dataset.load_dataset.load_dataset(context: mlrun.execution.MLClientCtx, dataset: str, name: str = '', file_ext: str = 'parquet', params: dict = {}) → None
Loads a scikit-learn toy dataset for classification or regression
The following datasets are available (‘name’ : description):
- ‘boston’ : Boston house-prices dataset (regression)
- ‘iris’ : iris dataset (classification)
- ‘diabetes’ : diabetes dataset (regression)
- ‘digits’ : digits dataset (classification)
- ‘linnerud’ : Linnerud dataset (multivariate regression)
- ‘wine’ : wine dataset (classification)
- ‘breast_cancer’ : breast cancer Wisconsin dataset (classification)
The scikit-learn functions return a data bunch including the following items:
- data : the features matrix
- target : the ground truth labels
- DESCR : a description of the dataset
- feature_names : header for data
The features (and their names) are stored with the target labels in a DataFrame.
For further details see https://scikit-learn.org/stable/datasets/index.html#toy-datasets
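As a minimal sketch of how a bunch can be combined into a single DataFrame, the snippet below uses scikit-learn's `load_iris` directly. The column name `target` is an assumption made for illustration; the column name used by this module may differ.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Load the bunch: data, target, DESCR and feature_names are among its items.
bunch = load_iris()

# Features become the columns, named after feature_names.
df = pd.DataFrame(bunch.data, columns=bunch.feature_names)

# Append the ground truth labels as one more column.
# NOTE: "target" is a hypothetical column name chosen for this sketch.
df["target"] = bunch.target

print(df.head())
```

The resulting frame holds one row per sample, with the feature columns first and the label column last.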
- Parameters
context – function execution context
dataset – name of the dataset to load
name – artifact name (defaults to dataset)
file_ext – output file format: parquet or csv
params – keyword arguments passed to the scikit-learn loader function for the chosen dataset (for example, load_iris)
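The sketch below illustrates how `dataset`, `params` and `file_ext` could plausibly interact, outside of MLRun. The helpers `load_toy_frame` and `save_frame` are hypothetical names introduced here, not part of this package, and the lookup of `load_<dataset>` in `sklearn.datasets` is an assumption about how the name is resolved. It handles single-target datasets only.

```python
import os
import tempfile

import pandas as pd
import sklearn.datasets


def load_toy_frame(dataset, params=None):
    """Hypothetical helper: resolve load_<dataset> and return a DataFrame."""
    loader = getattr(sklearn.datasets, f"load_{dataset}")
    bunch = loader(**(params or {}))  # params are forwarded as keyword arguments
    # feature_names may be absent for some datasets or library versions.
    df = pd.DataFrame(bunch.data, columns=bunch.get("feature_names"))
    df["target"] = bunch.target  # assumed column name, single-target datasets only
    return df


def save_frame(df, stem, file_ext="parquet"):
    """Hypothetical helper: write the frame as parquet or csv and return the path."""
    path = f"{stem}.{file_ext}"
    if file_ext == "parquet":
        df.to_parquet(path, index=False)  # requires pyarrow or fastparquet
    elif file_ext == "csv":
        df.to_csv(path, index=False)
    else:
        raise ValueError(f"unsupported file_ext: {file_ext!r}")
    return path


# n_class is a real keyword argument of sklearn.datasets.load_digits.
digits_df = load_toy_frame("digits", params={"n_class": 2})

with tempfile.TemporaryDirectory() as tmp:
    csv_path = save_frame(digits_df, os.path.join(tmp, "digits"), file_ext="csv")
    restored = pd.read_csv(csv_path)
```

Passing `params={"n_class": 2}` restricts the digits dataset to the classes 0 and 1, which shows how loader-specific options travel through `params`.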