Text to audio conversation generator
This function converts the text in a given text file into speech using the Bark library and saves the result as an audio file.
It is designed to make it easy to generate speech from written transcripts.
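The handler used in the example below, generate_multi_speakers_audio, works on a conversation transcript. The exact file layout is not described here, so the following is only a minimal sketch that assumes one "Speaker: text" turn per line, with speaker names matching the keys of the speakers parameter. The sample text and the parse_transcript helper are hypothetical and are not part of the hub function.

```python
from typing import List, Tuple

# Hypothetical sample transcript; the "Speaker: text" layout is an assumption.
SAMPLE_TRANSCRIPT = """\
Agent: Hello, thank you for calling. How can I help you today?
Client: Hi, I would like to check the status of my order.
Agent: Of course, may I have your order number?
"""


def parse_transcript(text: str, speakers: List[str]) -> List[Tuple[str, str]]:
    """Split a transcript into (speaker, utterance) turns.

    A line that does not start with a known speaker name followed by a
    colon is treated as a continuation of the previous turn (or skipped
    if there is no previous turn).
    """
    turns: List[Tuple[str, str]] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, sep, rest = line.partition(":")
        if sep and name.strip() in speakers:
            turns.append((name.strip(), rest.strip()))
        elif turns:
            # Continuation of the previous speaker's utterance.
            prev_name, prev_text = turns[-1]
            turns[-1] = (prev_name, f"{prev_text} {line}")
    return turns


turns = parse_transcript(SAMPLE_TRANSCRIPT, speakers=["Agent", "Client"])
for speaker, utterance in turns:
    print(f"{speaker}: {utterance}")
```

Splitting the transcript this way makes it easy to check, before running the function, that every turn is attributed to one of the configured speakers.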
Example Usage:
import mlrun

# Import function
text_to_audio_generator_function = mlrun.import_function("hub://text_to_audio_generator")

# Run the function with desired text files
function_run = text_to_audio_generator_function.run(
    handler="generate_multi_speakers_audio",
    inputs={"data_path": "./test_data.txt"},
    params={
        "output_directory": "./out",
        "speakers": {"Agent": 0, "Client": 1},
        "available_voices": [
            "alloy",
            "echo",
        ],
        "engine": "bark",
        "file_format": "mp3",
        # "bits_per_sample": 8,
    },
    local=True,
    returns=[
        "audio_files: path",
        "audio_files_dataframe: dataset",
        "text_to_speech_errors: file",
    ],
)
> 2023-12-04 14:08:48,769 [info] Storing function: {'name': 'text-to-audio-generator-generate-multi-speakers-audio', 'uid': 'ba017dfc11624de9afb5e148a6678a8b', 'db': 'http://mlrun-api:8080'}
torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
Generating: 100%|██████████| 1/1 [00:23<00:00, 23.74s/file]
> 2023-12-04 14:10:05,123 [info] Done (1/1)
Translations summary:
text_file audio_file
0 test_data.txt test_data.mp3
project | uid | iter | start | state | name | labels | inputs | parameters | results | artifacts |
---|---|---|---|---|---|---|---|---|---|---|
default | | 0 | Dec 04 14:08:48 | completed | text-to-audio-generator-generate-multi-speakers-audio | v3io_user=yonis kind=local owner=yonis host=jupyter-yonis-7c9bdbfb4d-9g2p2 | data_path | output_directory=./out speakers={'Agent': 0, 'Client': 1} available_voices=['v2/en_speaker_0', 'v2/en_speaker_1'] use_small_models=True use_gpu=False offload_cpu=True file_format=mp3 | | audio_files audio_files_dataframe text_to_speech_errors |
> to track results use the .show() or .logs() methods or click here to open in UI
> 2023-12-04 14:10:05,486 [info] Run execution finished: {'status': 'completed', 'name': 'text-to-audio-generator-generate-multi-speakers-audio'}
function_run.artifact("audio_files_dataframe").show()
| text_file | audio_file |
---|---|---|
0 | test_data.txt | test_data.mp3 |
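The table above can also be checked against the files on disk. The following is a minimal sketch, assuming each row has the text_file and audio_file columns shown above and that audio files are written directly into output_directory. The find_missing_audio helper is hypothetical. In a real run, the rows would come from the audio_files_dataframe artifact (for example, by converting the DataFrame with to_dict("records")); here they are written by hand so the example is self-contained.

```python
from pathlib import Path
from typing import Dict, Iterable, List


def find_missing_audio(
    records: Iterable[Dict[str, str]], output_directory: str
) -> List[str]:
    """Return the audio file names listed in `records` that do not exist on disk."""
    out_dir = Path(output_directory)
    return [
        record["audio_file"]
        for record in records
        if not (out_dir / record["audio_file"]).is_file()
    ]


# Rows mirroring the table above (written by hand for illustration).
audio_file_records = [{"text_file": "test_data.txt", "audio_file": "test_data.mp3"}]

missing = find_missing_audio(audio_file_records, "./out")
print("Missing audio files:", missing)
```

An empty list means every audio file named in the table was found in the output directory.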
# Play the generated audio file inline in the notebook
import IPython.display

IPython.display.Audio("./out/test_data.mp3")