Marvin can generate speech from text.
What it does
The speak function generates audio from text. The @speech decorator generates speech from the output of a function.
The easiest way to generate speech is to provide a string:
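A minimal sketch of the string case, assuming Marvin is installed and an OpenAI API key is configured. The wrapper name speak_greeting is hypothetical, and marvin is imported inside the function only so that loading the snippet does not require the package or trigger an API call. How the returned audio object is saved or played depends on the installed Marvin version, so that step is left out here.

```python
def speak_greeting(text: str = "I sure like being inside this fancy computer!"):
    """Generate speech for `text` and return whatever audio object Marvin gives back."""
    # Lazy import (an illustration choice): Marvin and an OpenAI key are only
    # needed when the function is actually called.
    import marvin

    # marvin.speak forwards the string to OpenAI's speech API.
    return marvin.speak(text)
```

Calling speak_greeting() with no arguments speaks the default sentence; pass any other string to speak that instead.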
How it works
Marvin passes your prompt to the OpenAI speech API, which returns an audio file.
Speaking text verbatim
Unlike the images API, OpenAI's speech API does not modify or revise your input prompt in any way. Whatever text you provide is exactly what will be spoken.
Therefore, you can use the speak function to generate speech from any string, or use the @speech decorator to generate speech from the string output of any function.
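The decorator form might look like the sketch below. It assumes the decorator is exposed as marvin.speech and can be applied without arguments; the names build_say_hello and say_hello are hypothetical. The decorated function is built inside a factory only so that the snippet can be loaded without Marvin installed; in ordinary code the decorator would sit on a module-level function.

```python
def build_say_hello():
    """Return a function whose string output is spoken instead of returned as text."""
    import marvin  # lazy import; requires Marvin and an OpenAI API key when called

    @marvin.speech
    def say_hello(name: str) -> str:
        # The returned string is passed to the speech API verbatim.
        return f"Hello, {name}! How are you doing today?"

    return say_hello
```

With this in place, build_say_hello()("Arthur") would request audio for the greeting rather than return the text.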
By default, OpenAI generates speech from the text you provide, verbatim. We can use Marvin functions to generate more interesting speech by modifying the prompt before passing it to the speech API. For example, we can use a function to generate a line of dialogue that reflects a specific intent. And because of Marvin's modular design, we can simply add a @speech decorator to the function to generate speech from its output.
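As a sketch of this stacking, the example below assumes that Marvin functions are created with a marvin.fn decorator and that marvin.speech can wrap the result; both names are assumptions about the installed version, and build_ai_say and ai_say are hypothetical. The factory wrapper exists only so the snippet loads without Marvin installed.

```python
def build_ai_say():
    """Return a function that turns an intent into spoken dialogue."""
    import marvin  # lazy import; requires Marvin and an OpenAI API key when called

    @marvin.speech  # outer: speaks the string produced by the inner function
    @marvin.fn      # inner: an LLM produces the return value from the docstring
    def ai_say(intent: str) -> str:
        """
        Given an `intent`, generate a line of dialogue that reflects the
        intent without repeating it verbatim.
        """

    return ai_say
```

Here the text that reaches the speech API is the generated line of dialogue, not the intent string passed by the caller.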
Choosing a voice
Both speak and @speech accept a voice parameter that allows you to choose from a variety of voices. You can preview the available voices in OpenAI's text-to-speech documentation.
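A short sketch of selecting a voice, assuming speak takes the same voice keyword as @speech. The wrapper name speak_with_voice is hypothetical, and "nova" is used as an example of an OpenAI voice name; check the provider's current list before relying on a particular voice.

```python
def speak_with_voice(text: str, voice: str = "nova"):
    """Generate speech for `text` using the named voice."""
    import marvin  # lazy import; requires Marvin and an OpenAI API key when called

    # The voice name is passed through unchanged to the speech API.
    return marvin.speak(text, voice=voice)
```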
You can pass parameters to the underlying API via the model_kwargs argument of speak and @speech. These parameters are passed directly to the underlying API, so you can use any parameter it supports.
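For illustration, the sketch below forwards a playback speed through model_kwargs. It assumes speak accepts model_kwargs in the same way as @speech, and it relies on speed being a parameter of OpenAI's speech endpoint; the wrapper name speak_at_speed is hypothetical.

```python
def speak_at_speed(text: str, speed: float = 1.25):
    """Generate speech for `text`, asking the API for a non-default speed."""
    import marvin  # lazy import; requires Marvin and an OpenAI API key when called

    # Entries in model_kwargs are handed to the speech API as-is, so the keys
    # must be parameters that the API itself supports.
    return marvin.speak(text, model_kwargs={"speed": speed})
```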
If you are using Marvin in an async environment, you can use speak_async (or decorate an async function with @speech) to generate speech asynchronously:
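A minimal async sketch, assuming speak_async is exposed as marvin.speak_async and takes the text as its first argument. The helper name speak_all is hypothetical; it uses asyncio.gather to issue several requests concurrently.

```python
import asyncio


async def speak_all(texts: list[str]) -> list:
    """Generate speech for every string in `texts` concurrently."""
    import marvin  # lazy import; requires Marvin and an OpenAI API key when called

    # Each speak_async call is a coroutine; gather awaits them together and
    # returns the results in the same order as the inputs.
    return await asyncio.gather(*(marvin.speak_async(t) for t in texts))
```

From synchronous code this could be driven with asyncio.run(speak_all(["Hello", "Goodbye"])); inside an existing event loop, await speak_all(...) directly.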