Hugging Face 🤗 Serving#

import mlrun

Importing the Hugging Face 🤗 model serving function#

serving_function = mlrun.import_function('function.yaml')
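
If the function.yaml is not stored locally, the same serving function can typically be imported from the MLRun function hub instead; the hub item name below is an assumption and may differ between hub versions.

serving_function = mlrun.import_function('hub://hugging_face_serving')  # Hub item name assumed.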

Adding a pretrained model#

serving_function.add_model(
    'mymodel',
    class_name='HuggingFaceModelServer',
    model_path='123',  # Not used by this serving class; a placeholder is only needed to register the route.
    
    task="sentiment-analysis",
    model_class="AutoModelForSequenceClassification",
    model_name="nlptown/bert-base-multilingual-uncased-sentiment",
    tokenizer_class="AutoTokenizer",
    tokenizer_name="nlptown/bert-base-multilingual-uncased-sentiment",
)
<mlrun.serving.states.TaskStep at 0x7fc3ec3a7a50>
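
To verify what was just registered, the serving graph can be printed; a minimal sketch, assuming the standard MLRun model-object API (to_yaml) on the graph step.

# Print the serving graph (a router topology with the 'mymodel' route) as YAML.
print(serving_function.spec.graph.to_yaml())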

Testing the pipeline locally#

server = serving_function.to_mock_server()
> 2022-09-07 08:54:42,419 [info] model mymodel was loaded
> 2022-09-07 08:54:42,420 [info] Loaded ['mymodel']
result = server.test(
    '/v2/models/mymodel',
    body={"inputs": ["Nous sommes très heureux de vous présenter la bibliothèque 🤗 Transformers."]}
)
print(f"prediction: {result['outputs']}")
prediction: [{'label': '5 stars', 'score': 0.7272651791572571}]
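
The same mock server can also score a small batch of sentences in one request, which is a cheap way to sanity-check batching before deploying; this sketch reuses the server created above.

batch_result = server.test(
    '/v2/models/mymodel',
    body={"inputs": [
        "Nous sommes très heureux de vous présenter la bibliothèque 🤗 Transformers.",
        "The delivery was late and the product arrived damaged.",
    ]}
)
print(f"predictions: {batch_result['outputs']}")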

Adding a default model from 🤗#

serving_function.add_model(
    'default-model',
    class_name='HuggingFaceModelServer',
    model_path='123',  # Not used by this serving class; a placeholder is only needed to register the route.
    
    task="sentiment-analysis",
    framework='pt',  # Use `pt` for PyTorch and `tf` for TensorFlow.
)
<mlrun.serving.states.TaskStep at 0x7fc2d3472f10>
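
The new route can be tested locally before deploying, mirroring the mock-server flow above; note that building a fresh mock server downloads the default sentiment-analysis model from the Hugging Face hub on first use.

# Rebuild the mock server so it picks up the newly added 'default-model' route.
server = serving_function.to_mock_server()
result = server.test(
    '/v2/models/default-model',
    body={"inputs": ["MLRun and 🤗 Transformers work well together."]}
)
print(f"prediction: {result['outputs']}")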

Deploying the pipeline to our k8s cluster#

serving_function.deploy()
> 2022-09-07 08:54:42,487 [info] Starting remote function deploy
2022-09-07 08:54:43  (info) Deploying function
2022-09-07 08:54:43  (info) Building
2022-09-07 08:54:44  (info) Staging files and preparing base images
2022-09-07 08:54:44  (info) Building processor image
2022-09-07 08:56:29  (info) Build complete
> 2022-09-07 08:57:09,536 [info] successfully deployed function: {'internal_invocation_urls': ['nuclio-default-hugging-face-serving.default-tenant.svc.cluster.local:8080'], 'external_invocation_urls': ['default-hugging-face-serving-default.default-tenant.app.yh43.iguazio-cd1.com/']}
'http://default-hugging-face-serving-default.default-tenant.app.yh43.iguazio-cd1.com/'
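
As a quick post-deploy check, the live endpoint can be asked which models it serves; the model-listing path below follows the v2 serving protocol and is an assumption here.

# List the models served by the deployed function (assumes a /v2/models/ listing route).
serving_function.invoke(path='v2/models/', method='GET')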

Inferring sentences through the deployed model#

serving_function.invoke(
    path='v2/models/default-model/predict',
    body={"inputs": ["We are delighted that we can serve 🤗 Transformers with MLRun."]})
> 2022-09-07 08:57:09,616 [info] invoking function: {'method': 'POST', 'path': 'http://nuclio-default-hugging-face-serving.default-tenant.svc.cluster.local:8080/v2/models/default-model/predict'}
{'id': 'f7753a17-fa84-44fa-9264-1dc65172d05c',
 'model_name': 'default-model',
 'outputs': [{'label': 'POSITIVE', 'score': 0.9993784427642822}]}
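
Because the deployed function is a plain HTTP endpoint, any client can call it as well; a minimal sketch using requests, assuming the external invocation URL returned by the deploy step above.

import requests

# External URL printed by serving_function.deploy() above (assumed unchanged).
url = 'http://default-hugging-face-serving-default.default-tenant.app.yh43.iguazio-cd1.com/'
response = requests.post(
    url + 'v2/models/default-model/predict',
    json={"inputs": ["We are delighted that we can serve 🤗 Transformers with MLRun."]},
)
print(response.json())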