Vogent is a platform for building, testing, and deploying conversational voice AI agents. We provide you with all of the off-the-shelf building blocks you need, while also including our own models and abstractions to make your agents more humanlike, low-latency, and performant.
Join our Discord to get help from the Vogent team, shape our roadmap, and request early access to new features.
There are effective off-the-shelf solutions for many of the building blocks of voice agents, like live transcription engines, LLMs, and low-latency text-to-speech. Building an effective voice agent, though, involves a number of additional pieces:
Robust voice-activity detection (VAD) to detect when a user has finished speaking and when the agent should start speaking
Rigorously handling structured, large-context conversations like surveys
Detecting and navigating IVR menus and phone trees
Dealing with interruptions and overlapping speech
Managing latency and response timing to feel natural
Evaluating conversations for correctness and quality
Surfacing issues from large corpora of calls
Experimenting with new configurations and backtesting on past conversations
Voice AI evolves quickly, with better models and more effective evaluation techniques coming out every week. With Vogent, you can experiment with the latest solutions immediately, instead of having to sift through the noise and implement them yourself.
Voice AI is a growing space, and it’s exciting that there are different ways to build productive solutions; some may be better fits than others, based on your background and goals. With Vogent, we try to offer the best solution for creating rigorous, effective voice agents. Building a voice agent that wows in a demo is easy, but maintaining a voice agent that consistently produces results on millions of dials involves more sophisticated building blocks and a more transparent evaluation loop.
In addition to table-stakes features for building and managing voice agents, Vogent provides features like:
IVR/phone-tree detection and navigation models
In-house, ultra-low-latency voices
In-house, ultra-low-latency LLMs tuned on millions of conversations
Fine-tuning on call recordings and transcripts
A survey builder that uses multiple models to handle structured conversations accurately, regardless of the number of questions
Everyone’s needs are different, and no one platform can do it all. We pride ourselves on working closely with each of our customers to help them build solutions that work. If there’s anything you need to make your voice agent work, just let us know in the in-app chat widget, ping us on Discord, or reach out to us over email.