RunPod
RunPod is a managed GPU cloud and serverless inference platform offering on-demand and persistent GPU Pods, autoscaling Serverless endpoints, network volumes, container templates, and a REST + GraphQL control plane for provisioning H100, H200, B200, A100, L40S, and consumer RTX GPUs. RunPod targets AI/ML developers who need flexible, per-second-billed GPU compute for training, fine-tuning, and inference workloads.
RunPod publishes 1 API on the APIs.io network: REST API. Tagged areas include AI, Cloud, Compute, GPU, and Inference.
RunPod’s developer surface includes documentation, a developer portal, a signup flow, pricing, an engineering blog, support, a changelog, and 10 more developer resources.
APIs
RunPod REST API
The RunPod REST API programmatically manages Pods, Serverless endpoints, network volumes, templates, container registry auth, and billing. The API is the primary control plane f...
RunPod GraphQL API
The RunPod GraphQL API provides programmatic access to Pods, templates, and Serverless endpoints via GraphQL queries and mutations. It is the original control-plane interface an...
RunPod Serverless
RunPod Serverless provides pay-as-you-go inference endpoints with autoscaling workers, queue-based and load-balanced endpoint types, FlashBoot cold-start optimization, and per-s...
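As a rough illustration of how a queue-based Serverless endpoint might be called, the sketch below builds, but does not send, the HTTP requests for submitting a job and polling its status. The base URL, the /run, /runsync, and /status paths, the {"input": ...} body shape, and Bearer-token authentication are assumptions not stated in this listing; check them against RunPod's own documentation. The function names, endpoint ID, and API key are hypothetical placeholders.

```python
import json
import urllib.request

# Assumed base URL for Serverless endpoint requests (not given in this listing).
SERVERLESS_BASE = "https://api.runpod.ai/v2"


def build_run_request(endpoint_id: str, api_key: str, payload: dict,
                      sync: bool = False) -> urllib.request.Request:
    """Build (without sending) a job-submission request for a queue-based endpoint.

    The /run (asynchronous) and /runsync (blocking) paths are assumptions.
    """
    operation = "runsync" if sync else "run"
    # Assumed body shape: the worker's input is wrapped under an "input" key.
    body = json.dumps({"input": payload}).encode("utf-8")
    return urllib.request.Request(
        url=f"{SERVERLESS_BASE}/{endpoint_id}/{operation}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


def build_status_request(endpoint_id: str, api_key: str,
                         job_id: str) -> urllib.request.Request:
    """Build (without sending) a status-poll request for a submitted job."""
    return urllib.request.Request(
        url=f"{SERVERLESS_BASE}/{endpoint_id}/status/{job_id}",
        method="GET",
        headers={"Authorization": f"Bearer {api_key}"},
    )


if __name__ == "__main__":
    # Hypothetical endpoint ID and key; nothing is sent over the network here.
    req = build_run_request("my-endpoint-id", "YOUR_API_KEY", {"prompt": "Hello"})
    print(req.get_method(), req.full_url)
```

Passing either request object to urllib.request.urlopen would actually send it. Keeping construction separate from sending makes the request shape easy to inspect before any billable work is queued.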
Features
Pods provide persistent, on-demand GPU instances with SSH, JupyterLab, and VS Code access, billed per second across a wide range of NVIDIA SKUs.
Serverless provides autoscaling, queue-based inference endpoints with FlashBoot cold-start optimization and pay-per-request billing.
Network volumes provide persistent, portable storage that can be attached to Pods and Serverless workers across datacenters.
Templates are reusable Pod and endpoint configurations that bundle container images, hardware specs, and network settings.
Pre-built Serverless workers deploy open-source LLMs with vLLM in a single click.
Integrations
Pods and Serverless workers support bring-your-own containers built from any Docker image.
Hugging Face models can be deployed directly via vLLM Quick Deploy and ready-made templates.
The official Pulumi provider supports infrastructure-as-code provisioning of RunPod resources.