auto_trainer package

Submodules

auto_trainer.auto_trainer module

class auto_trainer.auto_trainer.KWArgsPrefixes

Bases: object

FIT = 'FIT_'
MODEL_CLASS = 'CLASS_'
PREDICT = 'PREDICT_'
TRAIN = 'TRAIN_'
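
The constants above read like prefixes for grouping keyword arguments by where they are meant to go. The following is a minimal sketch under that assumption, which this page does not confirm: a parameter such as `CLASS_max_iter` would be routed to the model class constructor and `FIT_sample_weight` to the fit call. The helper `split_kwargs_by_prefix`, the group names, and the local copy of the class are hypothetical and exist only so the example runs on its own.

```python
class KWArgsPrefixes:
    # Local copy of the constants listed above, so the sketch runs without mlrun.
    MODEL_CLASS = "CLASS_"
    FIT = "FIT_"
    TRAIN = "TRAIN_"
    PREDICT = "PREDICT_"


def split_kwargs_by_prefix(params: dict) -> dict:
    """Group parameters by prefix and strip the prefix (hypothetical helper)."""
    prefixes = {
        "model_class": KWArgsPrefixes.MODEL_CLASS,
        "fit": KWArgsPrefixes.FIT,
        "train": KWArgsPrefixes.TRAIN,
        "predict": KWArgsPrefixes.PREDICT,
    }
    groups = {name: {} for name in prefixes}
    groups["other"] = {}
    for key, value in params.items():
        for name, prefix in prefixes.items():
            if key.startswith(prefix):
                groups[name][key[len(prefix):]] = value
                break
        else:
            # No known prefix: keep the parameter under its original name.
            groups["other"][key] = value
    return groups


example_params = {
    "CLASS_max_iter": 200,
    "FIT_sample_weight": None,
    "PREDICT_check_input": True,
    "label_columns": "label",
}
grouped = split_kwargs_by_prefix(example_params)
```

Keeping unprefixed parameters in a separate group means ordinary arguments such as `label_columns` pass through unchanged.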
auto_trainer.auto_trainer.evaluate(context: mlrun.execution.MLClientCtx, model: str, dataset: mlrun.datastore.base.DataItem, drop_columns: Optional[List[str]] = None, label_columns: Optional[Union[str, List[str]]] = None)

Evaluate a model. Artifacts are generated by the MLHandler.

Parameters
  • context – MLRun context.

  • model – The model Store path.

  • dataset – The dataset to evaluate the model on. Can be either a URI or a FeatureVector.

  • drop_columns – A string or a list of strings naming the columns to drop.

  • label_columns – The target label column(s) in the dataset, for regression or classification tasks.
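
To make the roles of `drop_columns` and `label_columns` concrete, here is a small sketch in plain Python. It is not the library's implementation: the helper `split_features_and_labels` and the sample rows are hypothetical, and the dataset is modelled as a list of dicts rather than a `DataItem`.

```python
from typing import Dict, List, Optional, Tuple, Union

Row = Dict[str, float]


def split_features_and_labels(
    rows: List[Row],
    drop_columns: Optional[List[str]] = None,
    label_columns: Optional[Union[str, List[str]]] = None,
) -> Tuple[List[Row], List[Row]]:
    """Drop unwanted columns, then separate label columns from feature columns."""
    drop = set(drop_columns or [])
    # Accept a single label name or a list of names, as in the signature above.
    if label_columns is None:
        labels = []
    elif isinstance(label_columns, str):
        labels = [label_columns]
    else:
        labels = list(label_columns)

    features, targets = [], []
    for row in rows:
        kept = {k: v for k, v in row.items() if k not in drop}
        targets.append({k: kept[k] for k in labels})
        features.append({k: v for k, v in kept.items() if k not in labels})
    return features, targets


rows = [
    {"id": 1, "age": 34.0, "income": 52.0, "label": 1},
    {"id": 2, "age": 51.0, "income": 38.5, "label": 0},
]
x, y = split_features_and_labels(rows, drop_columns=["id"], label_columns="label")
```

Here `x` holds only the `age` and `income` columns and `y` holds only `label`; the `id` column is discarded before the split.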

auto_trainer.auto_trainer.predict(context: mlrun.execution.MLClientCtx, model: str, dataset: mlrun.datastore.base.DataItem, drop_columns: Optional[Union[str, List[str], int, List[int]]] = None, label_columns: Optional[Union[str, List[str]]] = None, result_set: Optional[str] = None)

Predict on a dataset using a model.

Parameters
  • context – MLRun context.

  • model – The model Store path.

  • dataset – The dataset to run predictions on. Can be a URI, a FeatureVector, or a sample in the form of a list/dict. When passing a sample, pass the dataset as a field in params instead of inputs.

  • drop_columns – A str/int or a list of strings/ints giving the column names/indices to drop. When the dataset is a list/dict, this parameter should be given as integers.

  • label_columns – The target label column(s) in the dataset, for regression or classification tasks.

  • result_set – The DB key used to name the prediction result; also used as the filename. Defaults to ‘prediction’.
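
When the dataset is passed as a list-shaped sample, `drop_columns` refers to column positions rather than names. The sketch below illustrates that idea only; `drop_by_index` and the sample values are hypothetical and do not reflect how the library handles the sample internally.

```python
from typing import List, Union


def drop_by_index(
    sample: List[List[float]],
    drop_columns: Union[int, List[int]],
) -> List[List[float]]:
    """Remove the given column positions from every row of a list-shaped sample."""
    indices = {drop_columns} if isinstance(drop_columns, int) else set(drop_columns)
    return [
        [value for position, value in enumerate(row) if position not in indices]
        for row in sample
    ]


sample = [
    [101, 5.1, 3.5, 1.4],
    [102, 4.9, 3.0, 1.4],
]
# Drop the first column (an identifier in this made-up sample).
features = drop_by_index(sample, drop_columns=0)
```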

auto_trainer.auto_trainer.train(context: mlrun.execution.MLClientCtx, dataset: mlrun.datastore.base.DataItem, drop_columns: Optional[List[str]] = None, model_class: Optional[str] = None, model_name: str = 'model', tag: str = '', label_columns: Optional[Union[str, List[str]]] = None, sample_set: Optional[mlrun.datastore.base.DataItem] = None, test_set: Optional[mlrun.datastore.base.DataItem] = None, train_test_split_size: Optional[float] = None, random_state: Optional[int] = None)

Train the given model on the given dataset.

Parameters
  • context – MLRun context

  • dataset – The dataset to train the model on. Can be either a URI or a FeatureVector

  • drop_columns – A string or a list of strings naming the columns to drop

  • model_class – The class of the model, e.g. sklearn.linear_model.LogisticRegression

  • model_name – The name to use when storing the model artifact. Defaults to ‘model’

  • tag – The model’s tag to log with

  • label_columns – The target label column(s) in the dataset, for regression or classification tasks

  • sample_set – A sample set of inputs for the model, used to log its statistics along with the model for model monitoring. Can be either a URI or a FeatureVector

  • test_set – The test set to evaluate the trained model with

  • train_test_split_size – A value between 0.0 and 1.0 giving the proportion of the dataset to include in the test split. The training set gets the complementary proportion. Default = 0.2

  • random_state – Random state for train_test_split
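
The relationship between `train_test_split_size` and `random_state` can be sketched with the standard library alone. This is an illustration of the arithmetic, not the function's actual splitting code: `split_rows` is a hypothetical helper, and its shuffling and rounding may differ from whatever `train_test_split` implementation the library uses.

```python
import random
from typing import List, Optional, Sequence, Tuple


def split_rows(
    rows: Sequence,
    train_test_split_size: float = 0.2,
    random_state: Optional[int] = None,
) -> Tuple[List, List]:
    """Shuffle rows reproducibly and hold out a test fraction (illustrative only)."""
    if not 0.0 < train_test_split_size < 1.0:
        raise ValueError("train_test_split_size must be between 0.0 and 1.0")
    shuffled = list(rows)
    # A fixed random_state makes the shuffle, and so the split, repeatable.
    random.Random(random_state).shuffle(shuffled)
    n_test = round(len(shuffled) * train_test_split_size)
    # Training rows are the complement of the held-out test rows.
    return shuffled[n_test:], shuffled[:n_test]


train_rows, test_rows = split_rows(
    range(100), train_test_split_size=0.2, random_state=42
)
```

With 100 rows and a split size of 0.2, 20 rows are held out for testing and the remaining 80 are used for training; calling the helper again with the same `random_state` yields the same split.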

Module contents