Text to audio conversation generator

This function converts the text in a specified text file into speech with the Bark library and saves the result as an audio file.
It is designed to make it easy to generate speech from written transcripts.
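
The usage example below reads `./test_data.txt`. As a minimal sketch of preparing such a file, the snippet here writes a short two-speaker transcript using only the standard library. The `Speaker: text` line layout and the helper name `write_transcript` are assumptions made for illustration; this page does not specify the expected format, so check the function's own documentation before relying on it.

```python
from pathlib import Path


def write_transcript(path, turns):
    """Write (speaker, text) turns to a text file, one turn per line.

    Hypothetical helper: the "Speaker: text" layout is an assumption,
    not a format confirmed by the function's documentation.
    """
    lines = [f"{speaker}: {text}" for speaker, text in turns]
    target = Path(path)
    target.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return target


if __name__ == "__main__":
    # Speaker names match the keys of the "speakers" parameter.
    write_transcript(
        "./test_data.txt",
        [
            ("Agent", "Good morning, how can I help you today?"),
            ("Client", "Hi, I would like to check my order status."),
        ],
    )
```

Keeping the speaker names identical to the keys passed in `speakers` avoids a mismatch between the transcript and the run parameters.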

Example Usage:

import mlrun
# Import function 
text_to_audio_generator_function = mlrun.import_function("hub://text_to_audio_generator")
# Run the function with desired text files
function_run = text_to_audio_generator_function.run(
    handler="generate_multi_speakers_audio",
    inputs={"data_path": "./test_data.txt"},
    params={
        "output_directory": "./out",
        "speakers": {"Agent": 0, "Client": 1},
        # Bark voice presets, as logged in the run output below
        "available_voices": [
            "v2/en_speaker_0",
            "v2/en_speaker_1",
        ],
        "engine": "bark",
        "file_format": "mp3",
        # "bits_per_sample": 8,
    },
    local=True,
    returns=[
        "audio_files: path",
        "audio_files_dataframe: dataset",
        "text_to_speech_errors: file",
    ],
)
> 2023-12-04 14:08:48,769 [info] Storing function: {'name': 'text-to-audio-generator-generate-multi-speakers-audio', 'uid': 'ba017dfc11624de9afb5e148a6678a8b', 'db': 'http://mlrun-api:8080'}
torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
Generating: 100%|██████████| 1/1 [00:23<00:00, 23.74s/file]
> 2023-12-04 14:10:05,123 [info] Done (1/1)
Translations summary:
       text_file     audio_file
0  test_data.txt  test_data.mp3

project:    default
uid:        (not shown)
iter:       0
start:      Dec 04 14:08:48
state:      completed
name:       text-to-audio-generator-generate-multi-speakers-audio
labels:     v3io_user=yonis
            kind=local
            owner=yonis
            host=jupyter-yonis-7c9bdbfb4d-9g2p2
inputs:     data_path
parameters: output_directory=./out
            speakers={'Agent': 0, 'Client': 1}
            available_voices=['v2/en_speaker_0', 'v2/en_speaker_1']
            use_small_models=True
            use_gpu=False
            offload_cpu=True
            file_format=mp3
results:    (none)
artifacts:  audio_files
            audio_files_dataframe
            text_to_speech_errors

> to track results use the .show() or .logs() methods
> 2023-12-04 14:10:05,486 [info] Run execution finished: {'status': 'completed', 'name': 'text-to-audio-generator-generate-multi-speakers-audio'}
function_run.artifact("audio_files_dataframe").show()
text_file audio_file
0 test_data.txt test_data.mp3
# Play the generated audio inline in the notebook
from IPython.display import Audio

Audio("./out/test_data.mp3")
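
Before playing a result, it can help to confirm which files the run actually wrote. The following is a minimal sketch using only the standard library; `list_audio_files` is a hypothetical helper, and the `./out` directory and `mp3` extension are taken from the `output_directory` and `file_format` parameters used in the run above.

```python
from pathlib import Path


def list_audio_files(output_directory, file_format="mp3"):
    """Return the sorted audio files with the given extension in a directory."""
    directory = Path(output_directory)
    if not directory.is_dir():
        # Nothing was written (or the path is wrong): report no files.
        return []
    return sorted(directory.glob(f"*.{file_format}"))


if __name__ == "__main__":
    for audio_file in list_audio_files("./out", "mp3"):
        print(audio_file.name, audio_file.stat().st_size, "bytes")
```

An empty result suggests checking the `text_to_speech_errors` artifact returned by the run.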