Speech diarization example notebook
In this notebook we will use speaker diarization to get per-speaker speech durations from a call recording.
This can be useful for quantifying participation rates in calls for things like customer service analysis.
We will demonstrate this by:
Loading in a sample call recording between multiple participants
Using a diarize() function to automatically detect speakers and estimate per-speaker talk time
Returning a dictionary of the described results and a df of errors
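To make the talk-time idea concrete, here is a minimal sketch of the aggregation step, assuming diarization yields a list of `(start, end, speaker)` segments in seconds. The segment layout, the sample values, and the helper names `talk_time_per_speaker` and `participation_share` are illustrative assumptions, not the hub function's actual output or API.

```python
from collections import defaultdict


def talk_time_per_speaker(segments):
    """Sum segment durations (in seconds) for each speaker label."""
    totals = defaultdict(float)
    for start, end, speaker in segments:
        if end < start:
            raise ValueError(f"segment ends before it starts: {(start, end, speaker)}")
        totals[speaker] += end - start
    return dict(totals)


def participation_share(totals):
    """Convert per-speaker totals into fractions of the overall talk time."""
    overall = sum(totals.values())
    if overall == 0:
        return {speaker: 0.0 for speaker in totals}
    return {speaker: seconds / overall for speaker, seconds in totals.items()}


# Hypothetical segments, made up for illustration only.
example_segments = [
    (0.0, 4.0, "Agent"),
    (4.0, 6.0, "Client"),
    (6.0, 8.0, "Agent"),
]
totals = talk_time_per_speaker(example_segments)
shares = participation_share(totals)
```

Keeping the totals and the shares as separate steps lets the same totals feed other metrics, such as the longest speaker or silence ratios, without recomputing durations.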
import os
import mlrun
# To use the `pyannote.audio` models you must pass a Huggingface token and get access to the required models. The
# token can be passed in one of the following options:
#
# * Use the parameter `access_token`.
# * Set an environment variable named "HUGGING_FACE_HUB_TOKEN".
# * If using MLRun, you can pass it as a secret named "HUGGING_FACE_HUB_TOKEN".
os.environ["HUGGING_FACE_HUB_TOKEN"] = "<add your token here>"
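A missing or placeholder token may only surface later, when the models are fetched, so it can help to check it up front. The sketch below is one way to do that; the helper name `require_hf_token` and its placeholder check are assumptions for illustration, not part of MLRun or `pyannote.audio`.

```python
import os


def require_hf_token(env=None, name="HUGGING_FACE_HUB_TOKEN"):
    """Return the token stored under `name`, or raise if it is missing or a placeholder."""
    env = os.environ if env is None else env
    token = (env.get(name) or "").strip()
    # Treat empty values and "<...>" style placeholders as "not set".
    if not token or token.startswith("<"):
        raise RuntimeError(f"Set {name} to a valid Hugging Face access token first.")
    return token
```

Calling `require_hf_token()` with no arguments reads from the real process environment; passing a plain dictionary makes the check easy to exercise in isolation.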
# Create an mlrun project
project = mlrun.get_or_create_project("diarization-test")
# Import the function from the MLRun function hub
speech_diarization = project.set_function(func="hub://speech_diarization", name="speech_diarization")
> 2023-12-05 15:28:51,758 [info] Project loaded successfully: {'project_name': 'diarization-test'}
# Set the desired run params and files
audio_files = os.path.join("test_data.wav")
device = "cpu"
speakers_labels = ["Agent", "Client"]
separate_by_channels = True
# Run the imported function with desired file/s and params
diarize_run = speech_diarization.run(
    handler="diarize",
    inputs={"data_path": audio_files},
    params={
        "device": device,
        "speakers_labels": speakers_labels,
        "separate_by_channels": separate_by_channels,
    },
    returns=["speech-diarization: file", "diarize-errors: file"],
    local=True,
)
> 2023-12-05 15:28:52,229 [info] Storing function: {'name': 'speech-diarization-diarize', 'uid': 'ec6cd014e4674966b30303ea14048acf', 'db': 'http://mlrun-api:8080'}
project | uid | iter | start | state | name | labels | inputs | parameters | results | artifacts |
---|---|---|---|---|---|---|---|---|---|---|
diarization-test | | 0 | Dec 05 15:28:52 | completed | speech-diarization-diarize | v3io_user=zeevr kind=local owner=zeevr host=jupyter-zeev-gpu-5995df47dc-rtpvr | data_path | device=cpu speakers_labels=['Agent', 'Client'] separate_by_channels=True | | speech-diarization diarize-errors |
> to track results use the .show() or .logs() methods
> 2023-12-05 15:28:53,350 [info] Run execution finished: {'status': 'completed', 'name': 'speech-diarization-diarize'}
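Once the run completes, the two returned artifacts can be inspected. As a hedged sketch: if the `speech-diarization` artifact is available as a local JSON file that maps each audio file name to a list of `[start, end, speaker]` segments (an assumption about the output layout, worth checking against your own run), it can be flattened into rows for easier review. The helper names `load_diarization` and `flatten_diarization` and the sample data are hypothetical.

```python
import json


def load_diarization(path):
    """Read a diarization result stored as JSON (assumed layout, see above)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def flatten_diarization(diarization):
    """Turn {file: [[start, end, speaker], ...]} into a flat list of row dicts."""
    rows = []
    for file_name, segments in diarization.items():
        for start, end, speaker in segments:
            rows.append(
                {
                    "file": file_name,
                    "speaker": speaker,
                    "start": float(start),
                    "end": float(end),
                    "duration": float(end) - float(start),
                }
            )
    return rows


# Made-up data in the assumed layout, for illustration only.
sample = {"test_data.wav": [[0.0, 2.5, "Agent"], [2.5, 4.0, "Client"]]}
rows = flatten_diarization(sample)
```

With a real run, `load_diarization(local_path)` would supply the dictionary instead of `sample`; where that file lives depends on the project's artifact path.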