ONNX Utils#
A collection of ONNX utils in one MLRun function. The function includes the following handlers:
to_onnx - Convert your model into onnx format.
optimize - Perform ONNX optimizations using onnxoptimizer on a given ONNX model.
1. to_onnx#
1.1. Docs#
Convert the given model to an ONNX model.
Parameters:#
context: mlrun.MLClientCtx - The MLRun function execution context.
model_path: str - The model path store object.
onnx_model_name: str = None - The name to use to log the converted ONNX model. If not given, the given model name will be used with an additional '_onnx' suffix. Defaulted to None.
optimize_model: bool = True - Whether to optimize the ONNX model using 'onnxoptimizer' before saving the model. Defaulted to True.
framework: str = None - The model's framework. If None, it will be read from the 'framework' label of the provided model artifact. Defaulted to None.
framework_kwargs: Dict[str, Any] = None - Additional arguments each framework may require in order to convert to ONNX. To get the docstring of the desired framework's ONNX conversion function, pass "help".
Supported keyword arguments (framework_kwargs) per framework:#
tensorflow.keras:
input_signature: List[Tuple[Tuple[int], str]] = None - A list of the input layers' shape and data type properties. Expected to receive a list where each element is an input layer tuple. An input layer tuple is a tuple of:
[0] = The layer's shape, a tuple of integers.
[1] = The layer's data type, a mlrun.data_types.ValueType string.
If None, an attempt will be made to read the input signature automatically before converting to ONNX, or it will be read from the model artifact if available. Defaulted to None.
torch:
input_signature: List[Tuple[Tuple[int], str]] = None - A list of the input layers' shape and data type properties. Expected to receive a list where each element is an input layer tuple. An input layer tuple is a tuple of:
[0] = The layer's shape, a tuple of integers.
[1] = The layer's data type, a mlrun.data_types.ValueType string.
If None, the input signature will be read from the model artifact if available. Defaulted to None.
input_layers_names: List[str] = None - List of names to assign to the input nodes of the graph, in order. All of the other parameters (inner layers) can be set as well by passing additional names in the list; the order follows the order of the parameters in the model. If None, the inputs will be read from the handler's inputs. If those are also None, it is defaulted to: "input_0", "input_1", ...
output_layers_names: List[str] = None - List of names to assign to the output nodes of the graph, in order. If None, the outputs will be read from the handler's outputs. If those are also None, it is defaulted to: "output_0" (for multiple outputs, this parameter must be provided).
dynamic_axes: Dict[str, Dict[int, str]] = None - If part of the input / output shape is dynamic, like (batch_size, 3, 32, 32), you can specify it by giving a dynamic axis to the input / output layer by its name, as follows:
{
    "input layer name": {0: "batch_size"},
    "output layer name": {0: "batch_size"},
}
If provided, the 'is_batched' flag will be ignored. Defaulted to None.
is_batched: bool = True - Whether to include a batch size as the first axis in every input and output layer. Defaulted to True. Will be ignored if 'dynamic_axes' is provided. A hypothetical usage sketch of these keyword arguments is shown below.
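To make the torch keyword arguments above concrete, here is a minimal sketch of a params dict you might pass to the handler. The model store path placeholder, the layer shape, the layer names, and the "float" value type string are all illustrative assumptions, not values from the original docs:
# A sketch of 'framework_kwargs' for a torch model. All shapes, layer names
# and the value type string below are hypothetical examples - adjust them
# to your own model:
params = {
    "model_path": "<your model store path>",  # <- hypothetical placeholder
    "framework": "torch",
    "framework_kwargs": {
        # One input layer of shape (3, 224, 224) holding float values:
        "input_signature": [((3, 224, 224), "float")],
        "input_layers_names": ["input_0"],
        "output_layers_names": ["output_0"],
        # Mark the first axis of each layer as a dynamic batch size
        # (when 'dynamic_axes' is given, the 'is_batched' flag is ignored):
        "dynamic_axes": {
            "input_0": {0: "batch_size"},
            "output_0": {0: "batch_size"},
        },
    },
}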
1.2. Demo#
We will use the tf.keras framework with a MobileNetV2 model, and convert it to ONNX using the to_onnx handler.
1.2.1. First, we will set a temporary artifact path for our model to be saved in and choose the model names:
import os
# Use the legacy Keras implementation (tf.keras) so the model can be handled
# by MLRun's TF-Keras framework support:
os.environ["TF_USE_LEGACY_KERAS"] = "true"
from tempfile import mkdtemp
# Create a temporary directory for the model artifacts:
ARTIFACT_PATH = mkdtemp()
# Choose our model's name:
MODEL_NAME = "mobilenetv2"
# Choose our ONNX version model's name:
ONNX_MODEL_NAME = "onnx_mobilenetv2"
# Choose our optimized ONNX version model's name:
OPTIMIZED_ONNX_MODEL_NAME = "optimized_onnx_mobilenetv2"
1.2.2. Download the model from keras.applications and log it with MLRun's TFKerasModelHandler:
# mlrun: start-code

from tensorflow import keras

import mlrun
import mlrun.frameworks.tf_keras as mlrun_tf_keras


def get_model(context: mlrun.MLClientCtx, model_name: str):
    # Download the MobileNetV2 model:
    model = keras.applications.mobilenet_v2.MobileNetV2()

    # Initialize a model handler for logging the model:
    model_handler = mlrun_tf_keras.TFKerasModelHandler(
        model_name=model_name,
        model=model,
        context=context,
    )

    # Log the model:
    model_handler.log()

# mlrun: end-code
1.2.3. Create the function using MLRun's code_to_function and run it:
import mlrun

# Create the function parsing this notebook's code using 'code_to_function':
get_model_function = mlrun.code_to_function(
    name="get_mobilenetv2",
    kind="job",
    image="mlrun/ml-models",
)

# Run the function to log the model:
get_model_run = get_model_function.run(
    handler="get_model",
    artifact_path=ARTIFACT_PATH,
    params={"model_name": MODEL_NAME},
    local=True,
)
1.2.4. Import the onnx_utils MLRun function and run it:
# Import the ONNX function from the marketplace:
onnx_utils_function = mlrun.import_function("hub://onnx_utils")

# Run the function to convert our model to ONNX:
to_onnx_run = onnx_utils_function.run(
    handler="to_onnx",
    artifact_path=ARTIFACT_PATH,
    params={
        "model_name": MODEL_NAME,
        "model_path": get_model_run.outputs[MODEL_NAME],  # <- Take the logged model from the previous function.
        "onnx_model_name": ONNX_MODEL_NAME,
        "optimize_model": False,  # <- We mark the flag as False in order to optimize the model later in the demo.
    },
    local=True,
)
1.2.5. Listing the artifact directory, we will now see both our tf.keras model and the onnx model:
import os
print(os.listdir(ARTIFACT_PATH))
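As an optional sanity check that is not part of the original demo, you could try loading the converted model with onnxruntime. This assumes onnxruntime is installed and that the .onnx file is stored somewhere under ARTIFACT_PATH; the exact file layout may differ between MLRun versions:
import glob

import onnxruntime as ort

# Find the converted ONNX file under the artifact path and open a session:
onnx_files = glob.glob(os.path.join(ARTIFACT_PATH, "**", "*.onnx"), recursive=True)
session = ort.InferenceSession(onnx_files[0])
print([model_input.name for model_input in session.get_inputs()])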
2. optimize#
2.1. Docs#
Optimize the given ONNX model.
Parameters:#
context: mlrun.MLClientCtx - The MLRun function execution context.
model_path: str - The model path store object.
optimizations: List[str] = None - List of possible optimizations. To see what optimizations are available, pass "help". If None, all of the optimizations will be used. Defaulted to None.
fixed_point: bool = False - Optimize the weights using fixed point. Defaulted to False.
optimized_model_name: str = None - The name of the optimized model. If None, the original model will be overridden. Defaulted to None.
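Following the note above about passing "help", a minimal sketch of how you might list the available optimizations could look like this; it reuses the onnx_utils_function and to_onnx_run objects created in the first demo, and is an assumption rather than part of the original docs:
# A sketch of listing the available optimizations by passing "help" to the
# 'optimizations' parameter, as described above. 'onnx_utils_function' and
# 'to_onnx_run' are the objects created in the first demo:
onnx_utils_function.run(
    handler="optimize",
    params={
        "model_path": to_onnx_run.output(ONNX_MODEL_NAME),
        "optimizations": "help",
    },
    local=True,
)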
2.2. Demo#
We will use our converted model from the last example and optimize it.
2.2.1. We will now call the optimize handler:
onnx_utils_function.run(
    handler="optimize",
    artifact_path=ARTIFACT_PATH,
    params={
        "model_name": ONNX_MODEL_NAME,
        "model_path": to_onnx_run.output(ONNX_MODEL_NAME),  # <- Take the logged model from the previous function.
        "optimized_model_name": OPTIMIZED_ONNX_MODEL_NAME,
    },
    local=True,
)
2.2.2. Our model is now optimized and can be seen under the ARTIFACT_PATH:
print(os.listdir(ARTIFACT_PATH))
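As a final optional check (again, an assumption rather than part of the original demo), you could validate the resulting files with the onnx package's checker, assuming onnx is installed:
import glob

import onnx

# Validate every ONNX file found under the artifact path:
for file_path in glob.glob(os.path.join(ARTIFACT_PATH, "**", "*.onnx"), recursive=True):
    onnx.checker.check_model(onnx.load(file_path))
    print(f"{file_path} passed the ONNX checker")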
Lastly, run this code to clean up the models:
import shutil
shutil.rmtree(ARTIFACT_PATH)