azureml_utils package
Submodules
azureml_utils.azureml_utils module
- azureml_utils.azureml_utils.download_model(context: mlrun.execution.MLClientCtx, model_name: str, model_version: int, target_dir: str = '.') → None
Download trained model from Azure ML to local filesystem.
- Parameters
context – MLRun context.
model_name – Name of trained and registered model.
model_version – Version of model to download.
target_dir – Local directory to download the model into.
- azureml_utils.azureml_utils.init_compute(context: mlrun.execution.MLClientCtx, cpu_cluster_name: str, vm_size: str = 'STANDARD_D2_V2', max_nodes: int = 1) → azureml.core.compute.ComputeTarget
Initialize an Azure ML compute target to run the experiment. Checks for an existing compute target and creates a new one if none exists.
- Parameters
context – MLRun context.
cpu_cluster_name – Name of Azure ML compute target. Created if it does not exist.
vm_size – Azure machine type for compute target.
max_nodes – Maximum number of nodes in the compute target.
- Returns
Azure ML Compute Target.
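The check-then-create behavior described above can be sketched without the Azure SDK. This is only an illustration: `ComputeSpec`, `get_or_create_compute`, and the in-memory `registry` dictionary are hypothetical stand-ins for a workspace's compute targets and are not part of azureml_utils or azureml.core.

```python
from dataclasses import dataclass


@dataclass
class ComputeSpec:
    """Hypothetical stand-in for an Azure ML compute target."""
    name: str
    vm_size: str
    max_nodes: int


def get_or_create_compute(registry, name, vm_size="STANDARD_D2_V2", max_nodes=1):
    """Return the target stored under `name`, creating it if it is missing."""
    existing = registry.get(name)
    if existing is not None:
        # Reuse the existing target; vm_size and max_nodes are ignored here.
        return existing
    created = ComputeSpec(name=name, vm_size=vm_size, max_nodes=max_nodes)
    registry[name] = created
    return created


registry = {}
first = get_or_create_compute(registry, "cpu-cluster", max_nodes=2)
second = get_or_create_compute(registry, "cpu-cluster", max_nodes=8)
```

In this sketch the second call returns the original target unchanged, so the differing `max_nodes` has no effect. Whether `init_compute` treats an existing target the same way is not stated in this reference.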
- azureml_utils.azureml_utils.register_dataset(context: mlrun.execution.MLClientCtx, dataset_name: str, dataset_description: str, data: mlrun.datastore.base.DataItem, create_new_version: bool = False)
Register a dataset object (can also be an Iguazio FeatureVector) in Azure ML. Uploads a parquet file to Azure blob storage and registers that file as a dataset in Azure ML.
- Parameters
context – MLRun context.
dataset_name – Name of Azure dataset to register.
dataset_description – Description of Azure dataset to register.
data – MLRun FeatureVector or dataset object to upload.
create_new_version – Register Azure dataset as new version. Must be used when modifying dataset schema.
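Because `create_new_version` must be set when the dataset schema changes, a caller may want to derive the flag from the data instead of hard-coding it. A minimal sketch, assuming a schema is represented as a mapping of column name to type name; `schema_changed` is a hypothetical helper and not part of azureml_utils.

```python
def schema_changed(old_schema, new_schema):
    """Return True if column names or their types differ (column order is ignored)."""
    return dict(old_schema) != dict(new_schema)


old = {"age": "int64", "income": "float64"}
new = {"age": "int64", "income": "float64", "label": "int64"}

# Value that could be passed as create_new_version (assumed usage).
create_new_version = schema_changed(old, new)
```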
- azureml_utils.azureml_utils.submit_training_job(context: mlrun.execution.MLClientCtx, experiment: azureml.core.experiment.Experiment, compute_target: azureml.core.compute.ComputeTarget, register_model_name: str, registered_dataset_name: str, automl_settings: dict, training_set: mlrun.datastore.base.DataItem, label_column_name: str = '', save_n_models: int = 3, show_output: bool = True) → None
Submit training job to Azure AutoML and download trained model when completed. Uses previously registered dataset for training.
- Parameters
context – MLRun context.
experiment – Azure experiment.
compute_target – Azure compute target.
register_model_name – Name of model to register in Azure.
registered_dataset_name – Name of dataset registered in Azure ML.
label_column_name – Name of target column in dataset.
automl_settings – Dictionary of all Azure AutoML settings.
training_set – Training set to log with model. For model monitoring integration.
show_output – Whether to display Azure logs.
save_n_models – How many of the top performing models to log.
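`save_n_models` limits logging to the best-performing models. A minimal sketch of such a selection, assuming each candidate is a (run name, score) pair and that a higher score is better. The run names and scores are made up, and this is not necessarily how `submit_training_job` ranks runs.

```python
import heapq


def top_n_models(scored_runs, n):
    """Return the n highest-scoring (name, score) pairs, best first."""
    return heapq.nlargest(n, scored_runs, key=lambda run: run[1])


# Made-up candidates for illustration only.
runs = [("run_a", 0.81), ("run_b", 0.93), ("run_c", 0.77), ("run_d", 0.90)]
best = top_n_models(runs, 3)
```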
- azureml_utils.azureml_utils.train(context: mlrun.execution.MLClientCtx, dataset: mlrun.datastore.base.DataItem, experiment_name: str = '', cpu_cluster_name: str = '', vm_size: str = 'STANDARD_D2_V2', max_nodes: int = 1, dataset_name: str = '', dataset_description: str = '', create_new_version: bool = False, label_column_name: str = '', register_model_name: str = '', save_n_models: int = 1, log_azure: bool = True, automl_settings: Optional[str] = None) → None
Whole training flow for Azure AutoML. Registers dataset/feature vector, submits training job to Azure AutoML, and downloads trained model when completed.
- Parameters
context – MLRun context.
dataset – MLRun FeatureVector or dataset URI to upload. Will drop index before uploading when it is a FeatureVector.
experiment_name – Name of experiment to create in Azure ML.
cpu_cluster_name – Name of Azure ML compute target. Created if it does not exist.
vm_size – Azure machine type for compute target.
max_nodes – Maximum number of nodes in the compute target.
dataset_name – Name of Azure dataset to register.
dataset_description – Description of Azure dataset to register.
create_new_version – Register Azure dataset as new version. Must be used when modifying dataset schema.
label_column_name – Target column in dataset.
register_model_name – Name of model to register in Azure.
save_n_models – How many of the top performing models to log.
log_azure – Whether to display Azure logs.
automl_settings – JSON string of all Azure AutoML settings.
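Since `train` takes `automl_settings` as a JSON string, the settings can be built as a dictionary and serialized before the call. The sketch below uses illustrative values only: the setting keys are assumed to be accepted Azure AutoML options, and the experiment, cluster, dataset, and model names are placeholders.

```python
import json

# Illustrative AutoML settings; the keys and values are assumptions.
automl_settings = {
    "task": "classification",
    "primary_metric": "accuracy",
    "iterations": 5,
    "n_cross_validations": 3,
}

# Parameters mirroring the train() signature; every value is a placeholder.
train_params = {
    "experiment_name": "demo-experiment",
    "cpu_cluster_name": "demo-cluster",
    "vm_size": "STANDARD_D2_V2",
    "max_nodes": 1,
    "dataset_name": "demo-dataset",
    "dataset_description": "Demo dataset for AutoML training",
    "create_new_version": False,
    "label_column_name": "label",
    "register_model_name": "demo-model",
    "save_n_models": 1,
    "log_azure": True,
    # train() expects a JSON string here, not a dict.
    "automl_settings": json.dumps(automl_settings),
}
```

A dictionary like `train_params` could then be supplied as the run parameters when the function is executed through MLRun, together with the `dataset` input.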
- azureml_utils.azureml_utils.upload_model(context: mlrun.execution.MLClientCtx, model_name: str, model_path: str, model_description: Optional[str] = None, model_tags: Optional[dict] = None) → None
Upload pre-trained model from local filesystem to Azure ML.
- Parameters
context – MLRun context.
model_name – Name of trained and registered model.
model_path – Path to file on local filesystem.
model_description – Description of the model.
model_tags – Key-value pairs of model tags.