ONNX Utils#
A collection of ONNX utils in one MLRun function. The function includes the following handlers:
to_onnx - Convert your model into onnx format.
optimize - Perform ONNX optimizations using onnxoptimizer on a given ONNX model.
1. to_onnx#
1.1. Docs#
Convert the given model to an ONNX model.
Parameters:#
context: mlrun.MLClientCtx - The MLRun function execution context.
model_path: str - The model path store object.
onnx_model_name: str = None - The name to use to log the converted ONNX model. If not given, the given model name will be used with an additional '_onnx' suffix. Defaulted to None.
optimize_model: bool = True - Whether to optimize the ONNX model using 'onnxoptimizer' before saving the model. Defaulted to True.
framework: str = None - The model's framework. If None, it will be read from the 'framework' label of the provided model artifact. Defaulted to None.
framework_kwargs: Dict[str, Any] = None - Additional arguments each framework may require in order to convert to ONNX. To get the docstring of the desired framework's ONNX conversion function, pass "help".
Supported keyword arguments (framework_kwargs) per framework:#
tensorflow.keras:
input_signature: List[Tuple[Tuple[int], str]] = None - A list of the input layers' shape and data type properties. Expected to receive a list where each element is an input layer tuple. An input layer tuple is a tuple of:
[0] = The layer's shape, a tuple of integers.
[1] = The layer's data type, a mlrun.data_types.ValueType string.
If None, an attempt will be made to read the input signature automatically before converting to ONNX, or it will be read from the model artifact if available. Defaulted to None.
torch:
input_signature: List[Tuple[Tuple[int], str]] = None - A list of the input layers' shape and data type properties. Expected to receive a list where each element is an input layer tuple. An input layer tuple is a tuple of:
[0] = The layer's shape, a tuple of integers.
[1] = The layer's data type, a mlrun.data_types.ValueType string.
If None, the input signature will be read from the model artifact if available. Defaulted to None.
input_layers_names: List[str] = None - List of names to assign to the input nodes of the graph, in order. All of the other parameters (inner layers) can be set as well by passing additional names in the list; the order follows the order of the parameters in the model. If None, the inputs will be read from the handler's inputs. If those are also None, it is defaulted to: "input_0", "input_1", ...
output_layers_names: List[str] = None - List of names to assign to the output nodes of the graph, in order. If None, the outputs will be read from the handler's outputs. If those are also None, it is defaulted to: "output_0" (for multiple outputs, this parameter must be provided).
dynamic_axes: Dict[str, Dict[int, str]] = None - If part of the input / output shape is dynamic, like (batch_size, 3, 32, 32), you can specify it by giving a dynamic axis to the input / output layer by its name, as follows:
{
    "input layer name": {0: "batch_size"},
    "output layer name": {0: "batch_size"},
}
If provided, the 'is_batched' flag will be ignored. Defaulted to None.
is_batched: bool = True - Whether to include a batch size as the first axis in every input and output layer. Defaulted to True. Will be ignored if 'dynamic_axes' is provided. A hypothetical usage sketch of these keyword arguments is shown below.
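To make the torch keyword arguments above concrete, here is a minimal sketch of a params dict you might pass to the handler. The model store path placeholder, the layer shape, the layer names, and the "float" value type string are all illustrative assumptions, not values from the original docs:
# A sketch of 'framework_kwargs' for a torch model. All shapes, layer names
# and the value type string below are hypothetical examples - adjust them
# to your own model:
params = {
    "model_path": "<your model store path>",  # <- hypothetical placeholder
    "framework": "torch",
    "framework_kwargs": {
        # One input layer of shape (3, 224, 224) holding float values:
        "input_signature": [((3, 224, 224), "float")],
        "input_layers_names": ["input_0"],
        "output_layers_names": ["output_0"],
        # Mark the first axis of each layer as a dynamic batch size
        # (when 'dynamic_axes' is given, the 'is_batched' flag is ignored):
        "dynamic_axes": {
            "input_0": {0: "batch_size"},
            "output_0": {0: "batch_size"},
        },
    },
}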
1.2. Demo#
We will use the tf.keras framework with a MobileNetV2 model, and convert it to ONNX using the to_onnx handler.
1.2.1. First, we will set a temporary artifact path for our model to be saved in and choose the model names:
import os
# Use the legacy Keras implementation (tf.keras) so the model can be handled
# by MLRun's TF-Keras framework support:
os.environ["TF_USE_LEGACY_KERAS"] = "true"
from tempfile import mkdtemp
# Create a temporary directory for the model artifacts:
ARTIFACT_PATH = mkdtemp()
# Choose our model's name:
MODEL_NAME = "mobilenetv2"
# Choose our ONNX version model's name:
ONNX_MODEL_NAME = "onnx_mobilenetv2"
# Choose our optimized ONNX version model's name:
OPTIMIZED_ONNX_MODEL_NAME = "optimized_onnx_mobilenetv2"
1.2.2. Download the model from keras.applications and log it with MLRun's TFKerasModelHandler:
# mlrun: start-code

from tensorflow import keras

import mlrun
import mlrun.frameworks.tf_keras as mlrun_tf_keras


def get_model(context: mlrun.MLClientCtx, model_name: str):
    # Download the MobileNetV2 model:
    model = keras.applications.mobilenet_v2.MobileNetV2()

    # Initialize a model handler for logging the model:
    model_handler = mlrun_tf_keras.TFKerasModelHandler(
        model_name=model_name,
        model=model,
        context=context,
    )

    # Log the model:
    model_handler.log()

# mlrun: end-code
1.2.3. Create the function using MLRun's code_to_function and run it:
import mlrun

# Create the function parsing this notebook's code using 'code_to_function':
get_model_function = mlrun.code_to_function(
    name="get_mobilenetv2",
    kind="job",
    image="mlrun/ml-models",
)

# Run the function to log the model:
get_model_run = get_model_function.run(
    handler="get_model",
    artifact_path=ARTIFACT_PATH,
    params={"model_name": MODEL_NAME},
    local=True,
)
1.2.4. Import the onnx_utils MLRun function and run it:
# Import the ONNX function from the marketplace:
onnx_utils_function = mlrun.import_function("hub://onnx_utils")

# Run the function to convert our model to ONNX:
to_onnx_run = onnx_utils_function.run(
    handler="to_onnx",
    artifact_path=ARTIFACT_PATH,
    params={
        "model_name": MODEL_NAME,
        "model_path": get_model_run.outputs[MODEL_NAME],  # <- Take the logged model from the previous function.
        "onnx_model_name": ONNX_MODEL_NAME,
        "optimize_model": False,  # <- We mark the flag as False in order to optimize the model later in the demo.
    },
    local=True,
)
1.2.5. Listing the artifact directory, we will now see both our tf.keras model and the onnx model:
import os
print(os.listdir(ARTIFACT_PATH))
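As an optional sanity check that is not part of the original demo, you could try loading the converted model with onnxruntime. This assumes onnxruntime is installed and that the .onnx file is stored somewhere under ARTIFACT_PATH; the exact file layout may differ between MLRun versions:
import glob

import onnxruntime as ort

# Find the converted ONNX file under the artifact path and open a session:
onnx_files = glob.glob(os.path.join(ARTIFACT_PATH, "**", "*.onnx"), recursive=True)
session = ort.InferenceSession(onnx_files[0])
print([model_input.name for model_input in session.get_inputs()])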
2. optimize#
2.1. Docs#
Optimize the given ONNX model.
Parameters:#
context: mlrun.MLClientCtx - The MLRun function execution context.
model_path: str - The model path store object.
optimizations: List[str] = None - List of possible optimizations. To see what optimizations are available, pass "help". If None, all of the optimizations will be used. Defaulted to None.
fixed_point: bool = False - Optimize the weights using fixed point. Defaulted to False.
optimized_model_name: str = None - The name of the optimized model. If None, the original model will be overridden. Defaulted to None.
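Following the note above about passing "help", a minimal sketch of how you might list the available optimizations could look like this; it reuses the onnx_utils_function and to_onnx_run objects created in the first demo, and is an assumption rather than part of the original docs:
# A sketch of listing the available optimizations by passing "help" to the
# 'optimizations' parameter, as described above. 'onnx_utils_function' and
# 'to_onnx_run' are the objects created in the first demo:
onnx_utils_function.run(
    handler="optimize",
    params={
        "model_path": to_onnx_run.output(ONNX_MODEL_NAME),
        "optimizations": "help",
    },
    local=True,
)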
2.2. Demo#
We will use our converted model from the last example and optimize it.
2.2.1. We will now call the optimize handler:
onnx_utils_function.run(
    handler="optimize",
    artifact_path=ARTIFACT_PATH,
    params={
        "model_name": ONNX_MODEL_NAME,
        "model_path": to_onnx_run.output(ONNX_MODEL_NAME),  # <- Take the logged model from the previous function.
        "optimized_model_name": OPTIMIZED_ONNX_MODEL_NAME,
    },
    local=True,
)
2.2.2. Our model is now optimized and can be seen under the ARTIFACT_PATH:
print(os.listdir(ARTIFACT_PATH))
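As a final optional check (again, an assumption rather than part of the original demo), you could validate the resulting files with the onnx package's checker, assuming onnx is installed:
import glob

import onnx

# Validate every ONNX file found under the artifact path:
for file_path in glob.glob(os.path.join(ARTIFACT_PATH, "**", "*.onnx"), recursive=True):
    onnx.checker.check_model(onnx.load(file_path))
    print(f"{file_path} passed the ONNX checker")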
Lastly, run this code to clean up the models:
import shutil
shutil.rmtree(ARTIFACT_PATH)