V2 Model Server (SKLearn)#
This function tests one or more classifier models against a held-out dataset, using held-out test features to evaluate the performance of the estimated model.
It can be part of a Kubeflow pipeline as a test step, run after EDA and training/validation cycles (see the sketch below).
This function is part of the scikit-learn-pipeline demo.
To see how the model is trained or how the dataset is generated, check out the sklearn_classifier
function in the function marketplace repository.
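As a rough illustration of that pipeline usage, the sketch below wires the serving function into a Kubeflow pipeline. It is only a sketch: the pipeline name and parameter are hypothetical, and it assumes MLRun's KFP integration and the deploy_step API for serving functions.
# sketch: deploying the serving function as a Kubeflow pipeline step
# (pipeline/parameter names are hypothetical; assumes MLRun's deploy_step API)
import kfp.dsl as dsl
import mlrun

@dsl.pipeline(name='sklearn-serving-pipeline')
def serving_pipeline(models_path: str):
    serving_fn = mlrun.import_function("hub://v2_model_server")
    serving_fn.add_model(key='RandomForestClassifier', model_path=models_path, class_name='ClassifierModel')
    # add a deployment step for the serving function to the pipeline
    serving_fn.deploy_step()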
Steps#
1. Setup function parameters
2. Importing the function
3. Testing the function locally
4. Testing the function remotely
import warnings
warnings.filterwarnings("ignore")
Setup function parameters#
data_path = 'https://s3.wasabisys.com/iguazio/data/function-marketplace-data/sklearn_classifier/iris_dataset.csv'
models_path = 'https://s3.wasabisys.com/iguazio/models/function-marketplace-models/test_classifier/RandomForestClassifier.pkl'
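The parameters point at a CSV copy of the Iris dataset and a pickled RandomForestClassifier. As an optional sanity check, the sketch below downloads and unpickles the model locally; it assumes the URL is reachable and a compatible scikit-learn version is installed.
# optional: fetch and unpickle the referenced model to inspect it locally
# (assumes the URL is reachable and a compatible scikit-learn version is installed)
import pickle
import requests

model = pickle.loads(requests.get(models_path).content)
print(type(model))           # expected: a fitted RandomForestClassifier
print(model.n_features_in_)  # expected: 4 (the iris features)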
Importing the function#
import mlrun
mlrun.set_environment(project='function-marketplace')
# Importing the function from the hub
fn = mlrun.import_function("hub://v2_model_server")
fn.apply(mlrun.auto_mount())
# Adding the model
fn.add_model(key='RandomForestClassifier', model_path=models_path, class_name='ClassifierModel')
> 2021-10-17 14:04:23,167 [info] loaded project function-marketplace from MLRun DB
<mlrun.serving.states.TaskStep at 0x7f95f58e5f50>
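The server can host more than one model behind the same endpoint; each add_model call registers another key. A minimal sketch (the second model path below is a hypothetical placeholder):
# serving a second model from the same endpoint under its own key
# (hypothetical placeholder path; ClassifierModel is reused as the serving class)
another_model_path = 'https://example.com/models/AnotherClassifier.pkl'
fn.add_model(key='AnotherClassifier', model_path=another_model_path, class_name='ClassifierModel')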
Testing the function locally#
Test against the iris dataset
# When mocking, the serving class has to be available in the local namespace
from v2_model_server import *
# Create a mock (local) server for testing
server = fn.to_mock_server()
> 2021-10-17 14:04:26,871 [info] model RandomForestClassifier was loaded
> 2021-10-17 14:04:26,872 [info] Initializing endpoint records
> 2021-10-17 14:04:26,899 [info] Loaded ['RandomForestClassifier']
# Getting the data
import pandas as pd
iris_dataset = pd.read_csv(data_path)
iris_dataset.head()
| | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) | label |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | 0 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | 0 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | 0 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | 0 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | 0 |
# KFServing (V2) protocol inference request
event_data = {"inputs": iris_dataset.drop(['label'], axis=1).values.tolist()}
response = server.test(path='/v2/models/RandomForestClassifier/predict', body=event_data)
print(f'When mocking the server, the returned dict has the following fields: {", ".join(response.keys())}')
print(f"model accuracy: {sum(1 for x, y in zip(iris_dataset['label'], response['outputs']) if x == y) / len(response['outputs'])}")
When mocking the server, the returned dict has the following fields: id, model_name, outputs
model accuracy: 0.9733333333333334
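The accuracy above is computed with a manual comparison; an equivalent check can be done with scikit-learn's metrics (a sketch, assuming scikit-learn is available in the environment):
# equivalent accuracy check using scikit-learn's metrics
from sklearn.metrics import accuracy_score

print(accuracy_score(iris_dataset['label'], response['outputs']))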
Testing the function remotely#
address = fn.deploy()
> 2021-10-17 14:04:27,617 [info] Starting remote function deploy
2021-10-17 14:04:27 (info) Deploying function
2021-10-17 14:04:27 (info) Building
2021-10-17 14:04:27 (info) Staging files and preparing base images
2021-10-17 14:04:27 (info) Building processor image
2021-10-17 14:04:29 (info) Build complete
> 2021-10-17 14:04:39,180 [info] successfully deployed function: {'internal_invocation_urls': ['nuclio-function-marketplace-v2-model-server.default-tenant.svc.cluster.local:8080'], 'external_invocation_urls': ['default-tenant.app.dev39.lab.iguazeng.com:31003']}
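Once deployed, the endpoint speaks the V2 inference protocol, so it is possible to first list the loaded models. A minimal sketch, assuming the deployed function object exposes MLRun's invoke() helper:
# list the models served by the deployed endpoint (V2 protocol)
fn.invoke('/v2/models/')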
import requests
# made-up data; the event body is already a JSON string, so it is sent as-is
my_data = '''{"inputs":[[5.1, 3.5, 1.4, 0.2],[7.7, 3.8, 6.7, 2.2]]}'''
# using requests to get a prediction from the deployed endpoint
response = requests.put(address + "/v2/models/RandomForestClassifier/predict", data=my_data)
response.text
'{"id": "ac6be063-b05f-4276-972b-5e0acb96dfd9", "model_name": "RandomForestClassifier", "outputs": [0, 2]}'