azureml_utils package

Submodules

azureml_utils.azureml_utils module

azureml_utils.azureml_utils.download_model(context: mlrun.execution.MLClientCtx, model_name: str, model_version: int, target_dir: str = '.') -> None

Download a trained model from Azure ML to the local filesystem.

Parameters
  • context – MLRun context.

  • model_name – Name of the trained and registered model.

  • model_version – Version of the model to download.

  • target_dir – Directory to download the model into.

azureml_utils.azureml_utils.init_compute(context: mlrun.execution.MLClientCtx, cpu_cluster_name: str, vm_size: str = 'STANDARD_D2_V2', max_nodes: int = 1) -> azureml.core.compute.ComputeTarget

Initialize the Azure ML compute target used to run the experiment. Checks for an existing compute target with the given name and creates a new one if none exists.

Parameters
  • context – MLRun context.

  • cpu_cluster_name – Name of the Azure ML compute target. Created if it does not exist.

  • vm_size – Azure machine type for the compute target.

  • max_nodes – Maximum number of nodes in the compute target.

Returns

Azure ML Compute Target.
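The check-then-create behavior described for init_compute can be illustrated with a minimal sketch. It does not call the Azure ML SDK: FakeComputeTarget, the registry dictionary, and get_or_create_compute are hypothetical stand-ins introduced only for illustration, and the actual implementation may differ (for example in how it waits for provisioning).

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class FakeComputeTarget:
    """Hypothetical stand-in for azureml.core.compute.ComputeTarget."""

    name: str
    vm_size: str
    max_nodes: int


def get_or_create_compute(
    registry: Dict[str, FakeComputeTarget],
    cpu_cluster_name: str,
    vm_size: str = "STANDARD_D2_V2",
    max_nodes: int = 1,
) -> FakeComputeTarget:
    """Return the target registered under this name, or create and register one."""
    existing = registry.get(cpu_cluster_name)
    if existing is not None:
        # Reuse path: in this sketch the requested vm_size and max_nodes are ignored.
        return existing
    target = FakeComputeTarget(cpu_cluster_name, vm_size, max_nodes)
    registry[cpu_cluster_name] = target
    return target
```

In this sketch, a second call with the same cpu_cluster_name returns the already registered object unchanged, which is the assumed meaning of "Created if does not exist."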

azureml_utils.azureml_utils.register_dataset(context: mlrun.execution.MLClientCtx, dataset_name: str, dataset_description: str, data: mlrun.datastore.base.DataItem, create_new_version: bool = False)

Register a dataset object (which can also be an Iguazio FeatureVector) in Azure ML. Uploads a Parquet file to Azure Blob Storage and registers that file as a dataset in Azure ML.

Parameters
  • context – MLRun context.

  • dataset_name – Name of Azure dataset to register.

  • dataset_description – Description of Azure dataset to register.

  • data – MLRun FeatureVector or dataset object to upload.

  • create_new_version – Register Azure dataset as new version. Must be used when modifying dataset schema.
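Because create_new_version must be set when the dataset schema changes, a caller may want to derive the flag rather than hard-code it. The sketch below is one possible way to do that, assuming each schema is available as a mapping from column name to type name. The helper needs_new_version is hypothetical and is not part of this module.

```python
from typing import Mapping


def needs_new_version(
    registered_schema: Mapping[str, str],
    new_schema: Mapping[str, str],
) -> bool:
    """Return True when column names or column types differ.

    Column order is ignored, because plain mapping equality does not consider it.
    """
    return dict(registered_schema) != dict(new_schema)


# Example schemas (illustrative column names and type names).
old = {"age": "int64", "income": "float64"}
added_column = {"age": "int64", "income": "float64", "label": "int64"}
changed_type = {"age": "float64", "income": "float64"}
```

The result could then be passed as create_new_version when calling register_dataset.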

azureml_utils.azureml_utils.submit_training_job(context: mlrun.execution.MLClientCtx, experiment: azureml.core.experiment.Experiment, compute_target: azureml.core.compute.ComputeTarget, register_model_name: str, registered_dataset_name: str, automl_settings: dict, training_set: mlrun.datastore.base.DataItem, label_column_name: str = '', save_n_models: int = 3, show_output: bool = True) -> None

Submit a training job to Azure AutoML and download the trained model when the job completes. Uses a previously registered dataset for training.

Parameters
  • context – MLRun context.

  • experiment – Azure experiment.

  • compute_target – Azure compute target.

  • register_model_name – Name of model to register in Azure.

  • registered_dataset_name – Name of dataset registered in Azure ML.

  • label_column_name – Name of the target column in the dataset.

  • automl_settings – Dictionary of Azure AutoML settings (the signature types this parameter as dict).

  • training_set – Training set to log with the model, for model monitoring integration.

  • show_output – Whether to display Azure logs.

  • save_n_models – Number of top-performing models to log.
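The signatures on this page type automl_settings as dict for submit_training_job and as Optional[str] for train, so a conversion from a JSON string to a dictionary is presumably needed somewhere between the two. The following is a minimal sketch of such a conversion. The helper name parse_automl_settings and its handling of None are assumptions, not the module's actual code.

```python
import json
from typing import Optional, Union


def parse_automl_settings(settings: Optional[Union[str, dict]]) -> dict:
    """Normalize AutoML settings to a dict (hypothetical helper)."""
    if settings is None:
        # Assumption: missing settings are treated as "no overrides".
        return {}
    if isinstance(settings, dict):
        # Copy so the caller's dictionary is never mutated.
        return dict(settings)
    parsed = json.loads(settings)
    if not isinstance(parsed, dict):
        raise ValueError("automl_settings must decode to a JSON object")
    return parsed
```

Rejecting non-object JSON early gives a clearer error than letting a list or a number reach the AutoML configuration step.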

azureml_utils.azureml_utils.train(context: mlrun.execution.MLClientCtx, dataset: mlrun.datastore.base.DataItem, experiment_name: str = '', cpu_cluster_name: str = '', vm_size: str = 'STANDARD_D2_V2', max_nodes: int = 1, dataset_name: str = '', dataset_description: str = '', create_new_version: bool = False, label_column_name: str = '', register_model_name: str = '', save_n_models: int = 1, log_azure: bool = True, automl_settings: Optional[str] = None) -> None

Complete training flow for Azure AutoML. Registers the dataset or feature vector, submits a training job to Azure AutoML, and downloads the trained model when the job completes.

Parameters
  • context – MLRun context.

  • dataset – MLRun FeatureVector or dataset URI to upload. When it is a FeatureVector, the index is dropped before uploading.

  • experiment_name – Name of the experiment to create in Azure ML.

  • cpu_cluster_name – Name of the Azure ML compute target. Created if it does not exist.

  • vm_size – Azure machine type for the compute target.

  • max_nodes – Maximum number of nodes in the compute target.

  • dataset_name – Name of Azure dataset to register.

  • dataset_description – Description of Azure dataset to register.

  • create_new_version – Register Azure dataset as new version. Must be used when modifying dataset schema.

  • label_column_name – Name of the target column in the dataset.

  • register_model_name – Name of the model to register in Azure.

  • save_n_models – Number of top-performing models to log.

  • log_azure – Whether to display Azure logs.

  • automl_settings – JSON string of all Azure AutoML settings.
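As an illustration of how the parameters above fit together, the sketch below assembles a parameter dictionary for train, with automl_settings serialized to a JSON string as the signature requires. All values (experiment, cluster, dataset, and model names, and the chosen AutoML keys) are made-up examples. Passing the dictionary as the params of an MLRun run with handler "train" is an assumption about how the function is invoked, and the dataset itself would be supplied separately as an input.

```python
import json

# Illustrative AutoML settings; which keys train actually honors is not stated here.
automl_settings = {
    "task": "classification",
    "primary_metric": "accuracy",
    "iterations": 5,
    "n_cross_validations": 3,
}

# Parameter names follow the train signature; the values are hypothetical.
train_params = {
    "experiment_name": "demo-experiment",
    "cpu_cluster_name": "demo-cluster",
    "vm_size": "STANDARD_D2_V2",
    "max_nodes": 1,
    "dataset_name": "demo-dataset",
    "dataset_description": "Example dataset for an AutoML run",
    "create_new_version": False,
    "label_column_name": "label",
    "register_model_name": "demo-model",
    "save_n_models": 1,
    "log_azure": True,
    # train types this parameter as Optional[str], so serialize the dict.
    "automl_settings": json.dumps(automl_settings),
}
```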

azureml_utils.azureml_utils.upload_model(context: mlrun.execution.MLClientCtx, model_name: str, model_path: str, model_description: Optional[str] = None, model_tags: Optional[dict] = None) -> None

Upload a pre-trained model from the local filesystem to Azure ML.

Parameters
  • context – MLRun context.

  • model_name – Name of the trained and registered model.

  • model_path – Path to the model file on the local filesystem.

  • model_description – Description of the model.

  • model_tags – Key-value pairs of model tags.

Module contents