D-ID
D-ID is an AI-powered platform for generating talking-head videos and interactive digital human experiences from still photos, text scripts, and audio narration. Its REST APIs let developers produce pre-rendered talking avatar videos, run real-time streaming agent conversations over WebRTC, translate videos into 100+ languages with voice cloning and lip-sync, and create custom AI agents backed by LLMs and RAG knowledge bases. The platform has generated over 150 million videos and can process tens of thousands of API requests in parallel. Use cases range from enterprise training and customer service to personalized marketing campaigns and language-learning applications.
APIs
D-ID Videos API
REST API for generating AI talking-head videos from a source image combined with a text script or audio file. Supports multiple avatar versions including V4 Expressive (full-HD ...
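As a rough illustration of the image-plus-script input described above, the following sketch builds a JSON request body for a text-driven video. The endpoint URL and the field names (`source_url`, `script`, `type`, `input`) are assumptions not taken from this listing and should be checked against D-ID's API reference; `build_talk_request` is a hypothetical helper, and no network call is made.

```python
import json

# Assumed endpoint, shown only for context; confirm it in D-ID's API reference.
TALKS_URL = "https://api.d-id.com/talks"


def build_talk_request(source_url: str, script_text: str) -> dict:
    """Build a JSON body pairing a still image with a text script.

    Field names are assumptions for illustration, not confirmed by this listing.
    """
    if not source_url.startswith(("http://", "https://")):
        raise ValueError("source_url must be an http(s) URL to a still image")
    if not script_text.strip():
        raise ValueError("script_text must not be empty")
    return {
        "source_url": source_url,
        "script": {"type": "text", "input": script_text},
    }


body = build_talk_request(
    "https://example.com/portrait.jpg",  # placeholder image URL
    "Hello from a sketch.",
)
print(json.dumps(body, indent=2))
```

Sending the body would require an HTTP client and an API key in the request headers; the exact authentication scheme is left out here because this listing does not specify it.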
D-ID Agents API
API for creating and managing interactive real-time AI agents that combine digital avatar streaming with large language models, RAG-based knowledge bases, and custom tools. Supp...
D-ID Translations API
API for translating existing videos into 100+ languages using AI-driven speech translation, voice cloning, and lip-sync technology. Enables brands and content creators to locali...