Browse the voice library and learn about voice cloning options
While you can bring any voice to any of our models through voice cloning, we also offer select pre-cloned and pre-trained voices for each model. Voices marked as High Quality are full fine-tunes from professional voice actors and are higher quality and more stable. Other voices are zero-shot voice clones; while still high quality, they are more likely to produce artifacts during generation.
Create custom voices from audio samples for unique, branded voice experiences.
There are two ways to clone a voice. Zero-shot voice cloning requires 15-20 seconds of a high-quality recording, while professional voice cloning (or fine-tuning) requires 2-3 hours of audio of two speakers engaged in conversation.
While professional voice cloning produces higher-fidelity clones with fewer artifacts, zero-shot voice cloning usually produces a good result. Clone quality also depends on the style of the voice you are cloning: voices that are far out-of-distribution (i.e., voices whose accents or styles the model was unlikely to have encountered during initial training) are more likely to yield lower-fidelity clones or clones with more artifacts.
You can generate a zero-shot voice clone by uploading a 15-20 second voice recording through our UI.
To do so, go to the Voices tab on the left sidebar, then click the Clone Voice button. Select a gender and voice model, then upload the clip.
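Because the recommended clip length is 15-20 seconds, it can help to check a recording before uploading it. The following is a minimal sketch, not part of the product: it assumes the clip is an uncompressed WAV file (the supported upload formats are not stated here) and uses only Python's standard `wave` module. The function names and constants are hypothetical, chosen for illustration.

```python
import wave

# Bounds taken from the 15-20 second guidance above (an assumption about
# how strictly the range applies; the UI may accept other lengths).
MIN_SECONDS = 15.0
MAX_SECONDS = 20.0


def wav_duration_seconds(path: str) -> float:
    """Return the length of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        # Duration is the number of audio frames divided by frames per second.
        return wav_file.getnframes() / wav_file.getframerate()


def is_suitable_clip_length(seconds: float) -> bool:
    """Return True when the duration falls inside the recommended range."""
    return MIN_SECONDS <= seconds <= MAX_SECONDS
```

For example, `is_suitable_clip_length(wav_duration_seconds("my_voice.wav"))` reports whether a local file named `my_voice.wav` (a placeholder name) falls within the recommended range. Duration is only one factor; recording quality still matters.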
For more information on professional voice cloning, please visit the fine-tuning guide.