# Generating speech
Marvin can generate speech from text.
What it does
The `speak` function generates audio from text. The `@speech` decorator generates speech from the output of a function.
Example
The easiest way to generate speech is to provide a string:
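A minimal sketch of that call is shown here. The `generate_speech` wrapper and its `speak` parameter are hypothetical names, added only so the example can be exercised without network access; the direct form is simply `marvin.speak(text)`, which needs Marvin installed and an OpenAI API key.

```python
from typing import Any, Callable, Optional


def generate_speech(text: str, speak: Optional[Callable[[str], Any]] = None) -> Any:
    """Generate speech for `text` and return whatever the speech backend returns."""
    if speak is None:
        # Lazy import: the real call needs Marvin installed and an OpenAI API key.
        import marvin

        speak = marvin.speak
    return speak(text)


# Typical use (performs a real API request, so it is left commented out):
# audio = generate_speech("I sure like being inside this fancy computer!")
```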
How it works
Marvin passes your prompt to the OpenAI speech API, which returns an audio file.
## Speaking text verbatim
Unlike the images API, OpenAI's speech API does not modify or revise your input prompt in any way. Whatever text you provide is exactly what will be spoken.
Therefore, you can use the `speak` function to generate speech from any string, or use the `@speech` decorator to generate speech from the string output of any function.
## Generating speech
By default, OpenAI generates speech from the text you provide, verbatim. We can use Marvin functions to generate more interesting speech by modifying the prompt before passing it to the speech API. For example, we can use a function to generate a line of dialogue that reflects a specific intent. And because of Marvin's modular design, we can simply add a `@speech` decorator to the function to generate speech from its output.
```python
import marvin


@marvin.speech
@marvin.fn
def ai_say(intent: str) -> str:
    '''
    Given an `intent`, generate a line of dialogue that
    reflects the intent / tone / instruction without repeating
    it verbatim.
    '''


ai_say('hello')
# Hi there! Nice to meet you.
```
## Choosing a voice
Both `speak` and `@speech` accept a `voice` parameter that lets you choose from a variety of voices. You can preview the available voices in OpenAI's text-to-speech documentation.
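To compare voices, one option is to render the same text once per voice. The sketch below assumes `marvin.speak` accepts `voice` as a keyword argument, as described above. The helper `speak_in_voices` and its injectable `speak` parameter are hypothetical, and the voice names in the usage comment (`"alloy"`, `"nova"`) are taken from OpenAI's voice list rather than from this page.

```python
from typing import Any, Callable, Dict, Iterable, Optional


def speak_in_voices(
    text: str,
    voices: Iterable[str],
    speak: Optional[Callable[..., Any]] = None,
) -> Dict[str, Any]:
    """Generate the same text once per voice, keyed by voice name."""
    if speak is None:
        # Lazy import: the real call needs Marvin installed and an OpenAI API key.
        import marvin

        speak = marvin.speak
    return {voice: speak(text, voice=voice) for voice in voices}


# Typical use (one API request per voice, so it is left commented out):
# samples = speak_in_voices("Hello, world!", ["alloy", "nova"])
# samples["nova"].stream_to_file("hello_nova.mp3")
```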
## Saving audio files
The result of the `speak` function and `@speech` decorator is an audio stream. You can save this stream to disk like this:
```python
import marvin

audio = marvin.speak("Hello, world!")
audio.stream_to_file("hello_world.mp3")
```
## Model parameters
You can pass parameters to the underlying API via the `model_kwargs` argument of `speak` and `@speech`. These parameters are passed directly to the API, so you can use any parameter it supports.
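As an illustration, the sketch below forwards a `speed` setting through `model_kwargs`. It assumes the OpenAI speech endpoint accepts a `speed` value between 0.25 and 4.0, which is OpenAI's documented range rather than something stated on this page. The helper `speak_at_speed` and its `speak` parameter are hypothetical.

```python
from typing import Any, Callable, Optional


def speak_at_speed(
    text: str,
    speed: float = 1.0,
    speak: Optional[Callable[..., Any]] = None,
) -> Any:
    """Generate speech for `text`, forwarding `speed` via `model_kwargs`."""
    # Assumed valid range for the OpenAI `speed` parameter.
    if not 0.25 <= speed <= 4.0:
        raise ValueError(f"speed must be between 0.25 and 4.0, got {speed}")
    if speak is None:
        # Lazy import: the real call needs Marvin installed and an OpenAI API key.
        import marvin

        speak = marvin.speak
    return speak(text, model_kwargs={"speed": speed})


# Typical use (performs a real API request, so it is left commented out):
# audio = speak_at_speed("Hello, world!", speed=1.25)
```

Validating the value before the call is a design choice of this sketch; an out-of-range value would otherwise be rejected only after a request has been sent.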
## Async support
If you are using Marvin in an async environment, you can use `speak_async` (or decorate an async function with `@speech`) to generate speech asynchronously.
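One reason to use the async variant is to generate several clips concurrently. The sketch below assumes `marvin.speak_async` takes the text as its first argument, mirroring `speak`. The helper `speak_many` and its injectable `speak_async` parameter are hypothetical.

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Optional, Sequence


async def speak_many(
    texts: Sequence[str],
    speak_async: Optional[Callable[[str], Awaitable[Any]]] = None,
) -> List[Any]:
    """Generate speech for every text concurrently, preserving input order."""
    if speak_async is None:
        # Lazy import: the real call needs Marvin installed and an OpenAI API key.
        import marvin

        speak_async = marvin.speak_async
    # gather() runs the coroutines concurrently and returns results in input order.
    return await asyncio.gather(*(speak_async(text) for text in texts))


# Typical use (one API request per text, so it is left commented out):
# clips = asyncio.run(speak_many(["Hello!", "Goodbye!"]))
```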