Neuphonic
Neuphonic is a voice AI platform specializing in real-time text-to-speech synthesis with sub-25ms latency, making it suitable for conversational AI and live applications. The platform provides a cloud-hosted API, available over WebSocket streaming and Server-Sent Events (SSE), along with open-source on-device models (NeuTTS Air, NeuTTS Nano) that run without a GPU. Neuphonic supports nine languages (English, Spanish, German, French, Urdu, Japanese, Korean, Chinese, and Portuguese) and offers instant voice cloning from short audio samples. Developers can also build conversational AI agents via the Agent API, which integrates with GPT-4o and supports Model Context Protocol (MCP) servers. Authentication uses API keys: SSE requests pass the key in the X-API-KEY header, while WebSocket connections pass it as a query parameter.
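The two authentication styles described above can be sketched as follows. This is a minimal illustration, not Neuphonic's documented client: the host, the paths, the JSON body field `text`, and the query parameter name `api_key` are all placeholders assumed for the example, and only the X-API-KEY header name comes from the description above. Nothing is sent over the network; the functions only prepare the request and the URL.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical host and paths, for illustration only; consult the official docs.
API_HOST = "api.example.invalid"
SSE_PATH = "/sse/speak/en"
WS_PATH = "/speak/en"


def build_sse_request(api_key: str, text: str) -> Request:
    """Prepare (but do not send) a POST request that carries the key in a header."""
    # The body shape {"text": ...} is an assumption.
    body = json.dumps({"text": text}).encode("utf-8")
    return Request(
        f"https://{API_HOST}{SSE_PATH}",
        data=body,
        headers={"X-API-KEY": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def build_ws_url(api_key: str) -> str:
    """Build a WebSocket URL that carries the key as a query parameter."""
    # The parameter name "api_key" is an assumption; urlencode escapes the value.
    return f"wss://{API_HOST}{WS_PATH}?{urlencode({'api_key': api_key})}"
```

Because a key placed in a URL can end up in proxy or server logs, it is usually worth keeping the WebSocket URL out of application logs.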
APIs
Neuphonic TTS SSE API
Server-Sent Events endpoint for real-time text-to-speech synthesis. Accepts POST requests with text content and returns streaming audio in PCM format. Supports language selectio...
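Raw PCM has no header, so a client that wants a playable file has to add one. The sketch below wraps a sequence of PCM chunks in a WAV container using Python's standard `wave` module. The audio parameters (16-bit, mono, 22050 Hz) are assumptions chosen for illustration, not values confirmed by this listing, and the helper name `pcm_chunks_to_wav` is hypothetical.

```python
import io
import wave
from typing import Iterable


def pcm_chunks_to_wav(
    chunks: Iterable[bytes],
    sample_rate: int = 22050,  # assumed; use the rate the API actually returns
    channels: int = 1,         # assumed mono
    sample_width: int = 2,     # assumed 16-bit samples (2 bytes each)
) -> bytes:
    """Concatenate raw PCM chunks and return them as WAV file bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        for chunk in chunks:
            # Chunks are appended in arrival order; the header is fixed up on close.
            wav.writeframes(chunk)
    return buf.getvalue()
```

For example, `open("out.wav", "wb").write(pcm_chunks_to_wav(received_chunks))` would save the audio once the stream ends, where `received_chunks` stands for whatever byte chunks the client decoded from the stream.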
Neuphonic TTS WebSocket API
WebSocket endpoint for continuous, low-latency text-to-speech streaming. Enables real-time voice synthesis with sub-25ms latency, supporting multiple text chunks over a single p...
Neuphonic Voice Cloning API
REST API for creating and managing cloned voices. Accepts audio samples (MP3 or WAV, minimum 6 seconds, under 10MB) and generates a custom voice model. Supports creating, retrie...
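The sample constraints above (MP3 or WAV, at least 6 seconds, under 10MB) can be checked locally before uploading. The following is a minimal sketch with a hypothetical helper, `check_voice_sample`. It assumes that "10MB" means 10 × 2^20 bytes, identifies the format by file extension only, and measures duration only for WAV files, since the standard library cannot decode MP3.

```python
import os
import wave

MAX_BYTES = 10 * 1024 * 1024  # assumption: "10MB" read as 10 * 2**20 bytes
MIN_SECONDS = 6.0
ALLOWED_EXTENSIONS = {".mp3", ".wav"}


def check_voice_sample(path: str) -> list[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    ext = os.path.splitext(path)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format {ext!r}; expected MP3 or WAV")
    if os.path.getsize(path) >= MAX_BYTES:
        problems.append("file must be under 10MB")
    if ext == ".wav":
        try:
            with wave.open(path, "rb") as wav:
                seconds = wav.getnframes() / wav.getframerate()
        except (wave.Error, EOFError):
            problems.append("not a readable PCM WAV file")
        else:
            if seconds < MIN_SECONDS:
                problems.append(
                    f"sample is {seconds:.1f}s; at least 6 seconds required"
                )
    return problems
```

Checking MP3 duration would need a third-party decoder, so this sketch leaves that validation to the server.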
Neuphonic Agent API
REST API for creating and managing conversational AI voice agents. Agents combine Neuphonic TTS with GPT-4o for interactive voice applications and support Model Context Protocol...