Text to audio conversation generator

This function converts the text in a given text file into speech and saves the result as an audio file using the Bark library.
It is designed to make it easy to generate speech from written transcripts.
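
The example below reads its input from `./test_data.txt`. As a minimal sketch of how such a transcript could be prepared, the snippet here writes a two-speaker conversation to that path. The `<speaker>: <text>` line layout and the sample sentences are assumptions for illustration; this page does not show the contents of the input file, so check the function's documentation for the exact format it expects. The speaker names match the `speakers` parameter used in the example.

```python
from pathlib import Path

# Hypothetical two-speaker transcript. The "<speaker>: <text>" layout is an
# assumption about the expected input format, not something this page confirms.
conversation = [
    ("Agent", "Hello, thank you for calling. How can I help you today?"),
    ("Client", "Hi, I would like to check the status of my order."),
]

# Write one conversation turn per line to the path used by the example run.
data_path = Path("./test_data.txt")
data_path.write_text(
    "\n".join(f"{speaker}: {text}" for speaker, text in conversation) + "\n",
    encoding="utf-8",
)
```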

Example Usage:

import mlrun

# Import the function from the MLRun function hub
text_to_audio_generator_function = mlrun.import_function("hub://text_to_audio_generator")

# Run the function with desired text files
function_run = text_to_audio_generator_function.run(
    handler="generate_multi_speakers_audio",
    inputs={"data_path": "./test_data.txt"},
    params={
        "output_directory": "./out",
        "speakers": {"Agent": 0, "Client": 1},
        "available_voices": [
            "v2/en_speaker_0",
            "v2/en_speaker_1",
        ],
        "engine": "bark",
        "file_format": "mp3",
        # "bits_per_sample": 8,
    },
    local=True,
    returns=[
        "audio_files: path",
        "audio_files_dataframe: dataset",
        "text_to_speech_errors: file",
    ],
)

> 2023-12-04 14:08:48,769 [info] Storing function: {'name': 'text-to-audio-generator-generate-multi-speakers-audio', 'uid': 'ba017dfc11624de9afb5e148a6678a8b', 'db': 'http://mlrun-api:8080'}

torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
Generating: 100%|██████████| 1/1 [00:23<00:00, 23.74s/file]

> 2023-12-04 14:10:05,123 [info] Done (1/1)
Translations summary:
       text_file     audio_file
0  test_data.txt  test_data.mp3

project	uid	iter	start	state	name	labels	inputs	parameters	results	artifacts
default	...a6678a8b	0	Dec 04 14:08:48	completed	text-to-audio-generator-generate-multi-speakers-audio	v3io_user=yonis kind=local owner=yonis host=jupyter-yonis-7c9bdbfb4d-9g2p2	data_path	output_directory=./out speakers={'Agent': 0, 'Client': 1} available_voices=['v2/en_speaker_0', 'v2/en_speaker_1'] use_small_models=True use_gpu=False offload_cpu=True file_format=mp3		audio_files audio_files_dataframe text_to_speech_errors

> to track results use the .show() or .logs() methods or click here to open in UI

> 2023-12-04 14:10:05,486 [info] Run execution finished: {'status': 'completed', 'name': 'text-to-audio-generator-generate-multi-speakers-audio'}

function_run.artifact("audio_files_dataframe").show()

	text_file	audio_file
0	test_data.txt	test_data.mp3

from IPython.display import Audio

# Play the generated audio file inline in the notebook
Audio("./out/test_data.mp3")
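
Outside a notebook, a quick way to confirm that the run produced audio is to list the files in the output directory. The helper below is a small sketch using only the standard library; `list_audio_files` is a hypothetical name introduced here, not part of MLRun or of the hub function, and it assumes the `./out` directory and `mp3` format from the example run.

```python
from pathlib import Path


def list_audio_files(output_directory: str, file_format: str = "mp3") -> list[str]:
    """Return the sorted names of audio files with the given extension.

    Hypothetical helper for illustration; it only inspects the file system.
    """
    return sorted(p.name for p in Path(output_directory).glob(f"*.{file_format}"))


# Inspect the output directory used in the example run
print(list_audio_files("./out"))
```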
