# MLRun CI Example

Users may want to run their ML pipelines using CI frameworks such as GitHub Actions or GitLab CI/CD. MLRun supports simple, native integration with these CI systems. The following example combines local code (from the repository) with MLRun marketplace functions to build an automated ML pipeline that:

- Runs data preparation
- Trains a model
- Tests the trained model
- Deploys the model into a cluster
- Tests the deployed model

In [1]:
%pip install python-dotenv

Collecting python-dotenv
  Downloading python_dotenv-0.17.1-py2.py3-none-any.whl (18 kB)
Installing collected packages: python-dotenv
Successfully installed python-dotenv-0.17.1
You should consider upgrading via the '/opt/conda/bin/python -m pip install --upgrade pip' command.
Note: you may need to restart the kernel to use updated packages.


This example shows how to run an entire CI pipeline with notifications.
To run it with Slack notifications, follow the instructions at <https://api.slack.com/messaging/webhooks>: create an app, select the Incoming Webhooks feature, and switch on the Activate Incoming Webhooks toggle.
Once you have a webhook URL, set the `SLACK_WEBHOOK` environment variable in the `env.txt` file.

In [2]:
from dotenv import load_dotenv

load_dotenv('env.txt')

True
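
Before the pipeline starts, it can help to confirm that the webhook was actually loaded. The following is a minimal sketch using only the standard library. `slack_webhook_configured` is a hypothetical helper (not part of MLRun or python-dotenv), and it only checks that `SLACK_WEBHOOK` holds something that looks like an HTTPS URL, not that Slack accepts it.

```python
import os
from urllib.parse import urlparse


def slack_webhook_configured(env=None):
    """Return True if SLACK_WEBHOOK is set to something that looks like an HTTPS URL."""
    # Fall back to the process environment when no mapping is passed in.
    env = os.environ if env is None else env
    parsed = urlparse(env.get("SLACK_WEBHOOK", ""))
    return parsed.scheme == "https" and bool(parsed.netloc)


if not slack_webhook_configured():
    print("SLACK_WEBHOOK is missing or malformed; check env.txt")
```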

The code below performs the following steps:

- Ingest the iris data
- Train and test the model
- Deploy the model as a real-time serverless function

In [4]:
import json
from mlrun.utils import RunNotifications
import mlrun
from mlrun.platforms import auto_mount

project = "ci"
mlrun.set_environment(project=project)

# create notification object (console, Git, Slack as outputs) and push start message
notifier = RunNotifications(with_slack=True).print()


# Use the following line only when running inside GitHub Actions or GitLab CI.
# The `GITHUB_TOKEN` environment variable is set automatically in GitHub Actions.
# When running from GitLab, set the `GIT_TOKEN` environment variable.
#notifier.git_comment()

notifier.push_start_message(project)

# define and run a local data prep function
data_prep_func = mlrun.code_to_function("prep-data", filename="./functions/prep_data.py", kind="job",
                                        image="mlrun/mlrun", handler="prep_data").apply(auto_mount())

# Set the source-data URL
source_url = 'https://s3.wasabisys.com/iguazio/data/iris/iris.data.raw.csv'
prep_data_run = data_prep_func.run(name='prep_data', inputs={'source_url': source_url})

# train the model using a library (hub://) function and the generated data
train = mlrun.import_function('hub://sklearn_classifier').apply(auto_mount())
train_run = train.run(name='train',
                      inputs={'dataset': prep_data_run.outputs['cleaned_data']},
                      params={'model_pkg_class': 'sklearn.linear_model.LogisticRegression',
                              'label_column': 'label'})

# test the model using a library (hub://) function and the generated model
test = mlrun.import_function('hub://test_classifier').apply(auto_mount())
test_run = test.run(name="test",
                    params={"label_column": "label"},
                    inputs={"models_path": train_run.outputs['model'],
                            "test_set": train_run.outputs['test_set']})

# push the run results via notifications to Git, Slack, etc.
notifier.push_run_results([prep_data_run, train_run, test_run])

# Create model serving function using the new model
serve = mlrun.import_function('hub://v2_model_server').apply(auto_mount())
model_name = 'iris'
serve.add_model(model_name, model_path=train_run.outputs['model'])
addr = serve.deploy()

notifier.push(f"model {model_name} is deployed at {addr}")

# test the model serving function
inputs = [[5.1, 3.5, 1.4, 0.2],
          [7.7, 3.8, 6.7, 2.2]]
my_data = json.dumps({'inputs': inputs})
serve.invoke(f'v2/models/{model_name}/infer', my_data)

notifier.push(f"model {model_name} test passed Ok")

Pipeline started in project ci, check progress in http://localhost:30060/projects/ci/jobs


> 2021-05-24 01:05:22,539 [info] starting run prep_data uid=efc393357d8a4ae5bf8d7456b6b9cce0 DB=http://mlrun-api:8080
> 2021-05-24 01:05:22,694 [info] Job is running in the background, pod: prep-data-2rn2s
> 2021-05-24 01:05:31,730 [info] run executed, status=completed
final state: completed


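| project | uid | iter | start | state | name | labels | inputs | parameters | results | artifacts |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ci | ...b6b9cce0 | 0 | May 24 01:05:30 | completed | prep_data | kind=job, owner=jovyan, host=prep-data-2rn2s | source_url | | num_rows=150 | cleaned_data |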


to track results use .show() or .logs() or in CLI: 
!mlrun get run efc393357d8a4ae5bf8d7456b6b9cce0 --project ci , !mlrun logs efc393357d8a4ae5bf8d7456b6b9cce0 --project ci
> 2021-05-24 01:05:32,976 [info] run executed, status=completed
> 2021-05-24 01:05:33,595 [info] starting run train uid=01b7ae42329840cc9433f5a6eae47da8 DB=http://mlrun-api:8080
> 2021-05-24 01:05:33,719 [info] Job is running in the background, pod: train-7trd4
> 2021-05-24 01:05:43,457 [info] run executed, status=completed
final state: completed


| project | uid | iter | start | state | name | labels | inputs | parameters | results | artifacts |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ci | ...eae47da8 | 0 | May 24 01:05:41 | completed | train | kind=job, owner=jovyan, host=train-7trd4, class=sklearn.linear_model.LogisticRegression | dataset | model_pkg_class=sklearn.linear_model.LogisticRegression, label_column=label | accuracy=0.9375, test-error=0.0625, auc-micro=0.9921875, auc-weighted=1.0, f1-score=0.9206349206349206, precision_score=0.9047619047619048, recall_score=0.9555555555555556 | test_set, confusion-matrix, precision-recall-multiclass, roc-multiclass, model |


to track results use .show() or .logs() or in CLI: 
!mlrun get run 01b7ae42329840cc9433f5a6eae47da8 --project ci , !mlrun logs 01b7ae42329840cc9433f5a6eae47da8 --project ci
> 2021-05-24 01:05:45,420 [info] run executed, status=completed
> 2021-05-24 01:05:45,856 [info] starting run test uid=6118eedd515048c9a10adffe1f9d19eb DB=http://mlrun-api:8080
> 2021-05-24 01:05:45,968 [info] Job is running in the background, pod: test-hn4xg
> 2021-05-24 01:05:51,291 [info] run executed, status=completed
final state: completed


| project | uid | iter | start | state | name | labels | inputs | parameters | results | artifacts |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ci | ...1f9d19eb | 0 | May 24 01:05:50 | completed | test | kind=job, owner=jovyan, host=test-hn4xg | models_path, test_set | label_column=label | accuracy=0.9777777777777777, test-error=0.022222222222222223, auc-micro=0.9985185185185185, auc-weighted=0.9985392720306513, f1-score=0.9769016328156113, precision_score=0.9761904761904763, recall_score=0.9791666666666666 | confusion-matrix, precision-recall-multiclass, roc-multiclass, test_set_preds |


to track results use .show() or .logs() or in CLI: 
!mlrun get run 6118eedd515048c9a10adffe1f9d19eb --project ci , !mlrun logs 6118eedd515048c9a10adffe1f9d19eb --project ci
> 2021-05-24 01:05:53,282 [info] run executed, status=completed
pipeline run finished
status     name       uid       results
---------  ---------  --------  -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
completed  prep_data  ..b9cce0  num_rows=150
completed  train      ..e47da8  accuracy=0.9375,test-error=0.0625,auc-micro=0.9921875,auc-weighted=1.0,f1-score=0.9206349206349206,precision_score=0.9047619047619048,recall_score=0.9555555555555556
completed  test       ..9d19eb  accuracy=0.9777777777777777,test-error=0.022222222222222223,auc-micro=0.9985185185185185,auc-weighted=0.9985392720306513,f1-score=0.9769016328156113,precision_score=0.97619047619

| uid | start | state | name | results | artifacts |
| --- | --- | --- | --- | --- | --- |
| ...b6b9cce0 | May 24 01:05:30 | completed | prep_data | num_rows=150 | cleaned_data |
| ...eae47da8 | May 24 01:05:41 | completed | train | accuracy=0.9375, test-error=0.0625, auc-micro=0.9921875, auc-weighted=1.0, f1-score=0.9206349206349206, precision_score=0.9047619047619048, recall_score=0.9555555555555556 | test_set, confusion-matrix, precision-recall-multiclass, roc-multiclass, model |
| ...1f9d19eb | May 24 01:05:50 | completed | test | accuracy=0.9777777777777777, test-error=0.022222222222222223, auc-micro=0.9985185185185185, auc-weighted=0.9985392720306513, f1-score=0.9769016328156113, precision_score=0.9761904761904763, recall_score=0.9791666666666666 | confusion-matrix, precision-recall-multiclass, roc-multiclass, test_set_preds |
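
In a CI setting, results like those in the summary above are often used to decide whether the pipeline should continue to deployment. The notebook itself does not do this. The following is a minimal sketch, assuming a results dictionary shaped like the `test` row above and a hypothetical threshold of 0.9; `check_quality_gate` and `min_accuracy` are illustrative names, not MLRun APIs.

```python
def check_quality_gate(results, min_accuracy=0.9):
    """Raise ValueError if the reported accuracy is missing or below the threshold."""
    accuracy = results.get("accuracy")
    if accuracy is None:
        raise ValueError("no 'accuracy' result was reported")
    if accuracy < min_accuracy:
        raise ValueError(f"accuracy {accuracy:.4f} is below the required {min_accuracy}")
    return accuracy


# Values copied from the `test` row of the summary above.
test_results = {"accuracy": 0.9777777777777777, "test-error": 0.022222222222222223}
print("quality gate passed, accuracy =", check_quality_gate(test_results))
```

An uncaught exception ends the script with a non-zero exit code, which CI systems typically treat as a failed job, so the deployment step would not run.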


> 2021-05-24 01:05:54,489 [info] Starting remote function deploy
2021-05-24 01:05:54  (info) Deploying function
2021-05-24 01:05:54  (info) Building
2021-05-24 01:05:54  (info) Staging files and preparing base images
2021-05-24 01:05:54  (info) Building processor image
2021-05-24 01:06:34  (info) Build complete
2021-05-24 01:06:44  (info) Function deploy complete
> 2021-05-24 01:06:45,777 [info] function deployed, address=192.168.65.4:30843
model iris is deployed at http://192.168.65.4:30843


model iris test passed Ok
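
The final step reports success as soon as `serve.invoke` returns without raising. A stricter check could inspect the response body. The sketch below assumes the response is a dictionary with an `outputs` list holding one prediction per input row, which is an assumption about the serving function's response format. `check_inference_response` is a hypothetical helper, and the sample response is made up for illustration, not captured from the deployed function.

```python
def check_inference_response(response, n_inputs):
    """Return the predictions if there is exactly one per input row, else raise."""
    outputs = response.get("outputs")
    if not isinstance(outputs, list):
        raise ValueError("response has no 'outputs' list")
    if len(outputs) != n_inputs:
        raise ValueError(f"expected {n_inputs} predictions, got {len(outputs)}")
    return outputs


inputs = [[5.1, 3.5, 1.4, 0.2],
          [7.7, 3.8, 6.7, 2.2]]
# Made-up response for illustration only.
fake_response = {"model_name": "iris", "outputs": [0, 2]}
print(check_inference_response(fake_response, len(inputs)))
```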


## Done
With a few lines of code, we have successfully run an entire CI pipeline with notifications.