Deepgram
Speech in, speech out, over one endpoint.
A gated Deepgram proxy your agent calls without ever holding a vendor key: nova-3 transcription, Aura-2 synthesis, and a Voice Agent that listens, thinks, and speaks in a single conversation. Tiered, metered, and managed end to end.
transcription
Hear every word, with timing
nova-3 turns English audio into text you can act on — words, confidence, and timestamps — whether you post a finished file or stream frames for live partials. One route covers both, so there is no driver to install and no key to babysit.
- Streaming partials over a socket
- Batch transcripts for stored audio
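As a rough sketch of what "words, confidence, and timestamps" can look like in code, the snippet below flattens a batch transcript into word tuples and filters out low-confidence words. The field names (`results`, `channels`, `alternatives`, `words`) follow Deepgram's public pre-recorded response shape. It is an assumption that the proxy passes that shape through unchanged, and `Word`, `extract_words`, and the sample payload are hypothetical names and data made up for illustration.

```python
from typing import Any, Dict, List, NamedTuple


class Word(NamedTuple):
    text: str
    start: float       # seconds from the start of the audio
    end: float         # seconds from the start of the audio
    confidence: float  # 0.0 to 1.0


def extract_words(response: Dict[str, Any], min_confidence: float = 0.0) -> List[Word]:
    """Flatten the first alternative of the first channel into Word tuples.

    Assumes the proxy returns Deepgram's response shape unchanged.
    """
    channels = response.get("results", {}).get("channels", [])
    if not channels or not channels[0].get("alternatives"):
        return []
    raw_words = channels[0]["alternatives"][0].get("words", [])
    return [
        Word(w["word"], float(w["start"]), float(w["end"]), float(w["confidence"]))
        for w in raw_words
        if float(w["confidence"]) >= min_confidence
    ]


# Hand-written payload for this sketch; not real model output.
sample = {
    "results": {
        "channels": [
            {
                "alternatives": [
                    {
                        "transcript": "hello world",
                        "words": [
                            {"word": "hello", "start": 0.08, "end": 0.32, "confidence": 0.99},
                            {"word": "world", "start": 0.40, "end": 0.71, "confidence": 0.42},
                        ],
                    }
                ]
            }
        ]
    }
}

confident = extract_words(sample, min_confidence=0.5)
```

With the sample above, `confident` holds only the "hello" entry, since "world" falls under the 0.5 threshold. Streaming partials would likely need the same treatment per message, though their exact envelope is not described here.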
voice agent
A conversation on one socket
The Voice Agent fuses listen, think, and speak into a single connection. flux-general-en v2 handles the listen side, a Settings frame pins the linear16 or wav audio path up front, and Aura-2 answers back — no glue code between three separate services.
- flux-general-en v2 listen
- Settings frame audio config
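A Settings frame of the kind described above might be built like this. Treat it as a sketch under assumptions: the JSON layout is modeled on Deepgram's public Voice Agent Settings message, the proxy may expect different field names, the way the wav option is spelled on the wire is a guess, and `build_settings_frame` and the default voice name are placeholders chosen for illustration.

```python
import json

# The two audio paths named in the copy above.
ALLOWED_FORMATS = {"linear16", "wav"}


def build_settings_frame(
    audio_format: str = "linear16",
    sample_rate: int = 16000,
    voice: str = "aura-2-thalia-en",  # placeholder Aura-2 voice name
) -> str:
    """Return the first JSON text frame to send after the socket opens."""
    if audio_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported audio format: {audio_format!r}")
    frame = {
        "type": "Settings",
        "audio": {
            # Assumption: the proxy accepts the format name in this field.
            "input": {"encoding": audio_format, "sample_rate": sample_rate},
        },
        "agent": {
            "listen": {
                "provider": {
                    "type": "deepgram",
                    "model": "flux-general-en",
                    "version": "v2",
                }
            },
            "speak": {"provider": {"type": "deepgram", "model": voice}},
        },
    }
    return json.dumps(frame)


settings = build_settings_frame()
```

Sending this frame before any audio is what pins the audio path up front. Rejecting unknown formats on the client side surfaces a typo before the session starts rather than partway into a conversation.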
pricing
Metered per call
Deepgram is a per-call upstream, so it bills from your shared fluid wallet rather than a monthly bucket. Session caps grow with your tier — see the pricing page. Two numbers to know:
- Sessions on plus & pro: 30 min (stt & agent)
- Failed calls: $0 (never billed)
The Deepgram proxy fronts three model families behind a single application key: nova-3 for English transcription, Aura-2 for natural speech synthesis, and the Voice Agent for full conversational turns. You pick the endpoint; we handle the upstream credentials, headers, and routing.
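To make "you pick the endpoint" concrete, here is a minimal sketch of a client helper that maps each capability to a route and attaches the same application key every time. Everything specific in it is an assumption: the host `proxy.example.com`, the three paths, the `model` query values, and the `Authorization: Bearer` header are hypothetical stand-ins, not the proxy's documented interface.

```python
from typing import Dict, NamedTuple
from urllib.parse import urlencode


class Call(NamedTuple):
    url: str
    headers: Dict[str, str]


# Hypothetical routes: (scheme, path, query parameters).
ROUTES = {
    "transcribe": ("https", "/v1/deepgram/listen", {"model": "nova-3"}),
    "speak": ("https", "/v1/deepgram/speak", {"model": "aura-2"}),
    "agent": ("wss", "/v1/deepgram/agent", {}),
}


def prepare_call(capability: str, app_key: str, host: str = "proxy.example.com") -> Call:
    """Build the URL and headers for one capability using a single app key."""
    if capability not in ROUTES:
        raise ValueError(f"unknown capability: {capability!r}")
    scheme, path, params = ROUTES[capability]
    query = f"?{urlencode(params)}" if params else ""
    # Only the application key travels with the request; no vendor key is
    # held on the client side.
    return Call(f"{scheme}://{host}{path}{query}", {"Authorization": f"Bearer {app_key}"})


call = prepare_call("transcribe", "app-key-123")
```

The point of the design is that the header is identical across all three routes, so switching from transcription to synthesis or the Voice Agent changes only the capability name.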
Related
nova-3, Aura-2, Voice Agent, Single key
For more information, ask your agent.