Key Features
- Natural Prosody: Sesame voices deliver more natural intonation, rhythm, and stress patterns in speech
- Improved Expressiveness: Better emotional range and contextual understanding
- Enhanced Pronunciation and Spelling: More accurate handling of complex words and phrases
- Seamless Transitions: Smoother flow between sentences and paragraphs
Using Sesame Voices
Sesame voices can be identified by the “Sesame” tag in the voice selection interface. Only a small number of Sesame voices are available right now, but cloning a new one is straightforward and requires roughly 8 to 20 seconds of audio. For tips on creating new Sesame voices effectively, see the Voice Cloning section.
Sesame voices are still in beta and may show instability during inference (for example, long pauses or strange conversational artifacts). We regularly release updates that improve their capabilities and performance.
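The suggested clip length for cloning (roughly 8 to 20 seconds) can be checked locally before a recording is uploaded. The following is a minimal sketch, not part of any official SDK: it assumes the clip is an uncompressed PCM WAV file and uses only Python's standard `wave` module. The function names and the `MIN_SECONDS` / `MAX_SECONDS` constants are hypothetical and exist only for this illustration.

```python
import wave

# Bounds taken from the guidance above (roughly 8 to 20 seconds of audio).
# These constant names are illustrative, not part of any platform API.
MIN_SECONDS = 8.0
MAX_SECONDS = 20.0


def clip_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        # Duration is the number of audio frames divided by the sample rate.
        return wav.getnframes() / wav.getframerate()


def is_suitable_for_cloning(path: str) -> bool:
    """Check that a clip falls inside the suggested length range."""
    return MIN_SECONDS <= clip_duration_seconds(path) <= MAX_SECONDS
```

Calling `is_suitable_for_cloning("sample.wav")` on a local recording returns `True` only when its length falls inside the suggested range. Other formats such as MP3 would need a different decoder, and length is only one factor; the Voice Cloning section covers the rest.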