Voicelab Overview

Welcome to Voicelab! Vogent Voicelab is an API for optimized inference of top text-to-speech models, such as Sesame’s CSM-1B, Dia, and Orpheus.

Why use Voicelab?

New open-source text-to-speech models come out every week, and many rank as state-of-the-art on popular benchmarks. However, most of these models are not readily usable for high-volume, low-latency inference. Some research preview models also struggle with hallucinations and inconsistent outputs. Finally, as with any model, hosting it yourself and managing compute can be a headache. Voicelab addresses these problems:
  1. Voicelab maintains a proprietary inference stack that is optimized to serve text-to-speech transformers efficiently and at scale.
  2. Voicelab post-trains select models to improve consistency and to offer high-quality professional voice clones.
  3. Voicelab manages all compute, so you pay for these models per character instead of managing GPUs.
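
Per-character billing makes it possible to estimate a job's cost from the input text alone. The snippet below is a minimal sketch of that arithmetic, not part of the Voicelab API: the function name `estimate_cost`, the placeholder rate, and the assumption that every Unicode code point (including spaces and punctuation) counts as one billable character are all hypothetical. Check the pricing documentation for actual rates and counting rules.

```python
def estimate_cost(text: str, rate_per_character: float) -> float:
    """Estimate the cost of synthesizing `text` under per-character billing.

    Assumption: each Unicode code point, including spaces and punctuation,
    counts as one billable character. The real counting rule may differ.
    """
    if rate_per_character < 0:
        raise ValueError("rate_per_character must be non-negative")
    return len(text) * rate_per_character


# Placeholder rate for illustration only; this is NOT a real Voicelab price.
HYPOTHETICAL_RATE = 0.00002

script = "Hello from Voicelab!"
cost = estimate_cost(script, HYPOTHETICAL_RATE)
print(f"{len(script)} characters, estimated cost: ${cost:.6f}")
```

Passing the rate as an argument keeps the estimate independent of any particular model, since rates may vary by model.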

Explore Voicelab

Quickstart

Get started with Voicelab in minutes, from API key to your first voice.

API Reference

Complete API documentation for Vogent Voicelab.

Models

Explore the available text-to-speech models and their capabilities.

Voices

Explore existing voices or clone your own.

Fine-tuning and Hosting

Customize models for your specific use case and hosting requirements.

On-premise Deployment

Deploy Voicelab in your own infrastructure for maximum control.

Roadmap

See which features are coming next for Voicelab.

Support

For technical support, API questions, or feature requests, please contact our support team or join our Discord community.