RunPod

RunPod is a managed GPU cloud and serverless inference platform offering on-demand and persistent GPU Pods, autoscaling Serverless endpoints, network volumes, container templates, and a REST + GraphQL control plane for provisioning H100, H200, B200, A100, L40S, and consumer RTX GPUs. RunPod targets AI/ML developers who need flexible, per-second-billed GPU compute for training, fine-tuning, and inference workloads.

3 APIs 5 Features

AI, Cloud, Compute, GPU, Inference, Machine Learning, Serverless

RunPod publishes 3 APIs on the APIs.io network: REST API, GraphQL API, and Serverless. Tagged areas include AI, Cloud, Compute, GPU, and Inference.

RunPod’s developer surface includes documentation, developer portal, signup flow, pricing, engineering blog, support, changelog, and 10 more developer resources.

APIs

RunPod REST API

The RunPod REST API programmatically manages Pods, Serverless endpoints, network volumes, templates, container registry auth, and billing. The API is the primary control plane for provisioning and operating GPU compute on RunPod.
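As a rough illustration of how a client might address this control plane, the sketch below prepares (but does not send) a request against the base URL given in this listing, `https://rest.runpod.io/v1`. The `/pods` path, the Bearer-token header, and the helper name `build_list_pods_request` are assumptions made for the example; the published OpenAPI document is the authority on actual paths and the security scheme.

```python
import json
import urllib.request

# Base URL taken from this listing's source data.
REST_BASE_URL = "https://rest.runpod.io/v1"


def build_list_pods_request(api_key: str) -> urllib.request.Request:
    """Prepare a GET request for a Pods collection without sending it.

    Assumptions (not confirmed by this listing): the collection lives at
    `/pods` and the API accepts an `Authorization: Bearer <key>` header.
    """
    return urllib.request.Request(
        url=f"{REST_BASE_URL}/pods",
        method="GET",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        },
    )


def send(request: urllib.request.Request) -> object:
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `send(build_list_pods_request(key))` would perform the network call; keeping construction and sending separate makes the request easy to inspect or log before any credentials leave the machine.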

RunPod GraphQL API

The RunPod GraphQL API provides programmatic access to Pods, templates, and Serverless endpoints via GraphQL queries and mutations. It is the original control-plane interface and is still supported alongside the REST API.
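A GraphQL call is typically a single POST carrying a `query` string and optional `variables`. The sketch below builds such a request for the base URL in this listing, `https://api.runpod.io/graphql`. The query text (`myself`, `pods`, and the selected fields), the Bearer-token header, and the helper name are hypothetical and should be checked against the GraphQL documentation before use.

```python
import json
import urllib.request

# Base URL taken from this listing's source data.
GRAPHQL_URL = "https://api.runpod.io/graphql"

# Hypothetical query: the field names are assumptions for illustration,
# not a confirmed part of the RunPod schema.
LIST_PODS_QUERY = """
query ListPods {
  myself {
    pods { id name }
  }
}
"""


def build_graphql_request(api_key, query, variables=None):
    """Prepare a GraphQL POST request without sending it.

    Assumption: the API accepts an `Authorization: Bearer <key>` header.
    """
    body = json.dumps({"query": query, "variables": variables or {}})
    return urllib.request.Request(
        url=GRAPHQL_URL,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
```

Passing the result to `urllib.request.urlopen` would execute the query; mutations would use the same envelope with a different `query` string.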

RunPod Serverless

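RunPod Serverless provides pay-as-you-go inference endpoints with autoscaling workers, queue-based and load-balanced endpoint types, FlashBoot cold-start optimization, and per-second billing. Each endpoint exposes a URL that accepts request payloads for AI model inference and compute-intensive workloads.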
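To make the idea of a per-endpoint URL concrete, the sketch below composes a request under the base URL in this listing, `https://api.runpod.ai/v2`. The `/{endpoint_id}/runsync` operation, the `{"input": ...}` envelope, the Bearer-token header, and the helper name are assumptions for illustration; the Serverless documentation defines the actual operations and payload shape.

```python
import json
import urllib.parse
import urllib.request

# Base URL taken from this listing's source data.
SERVERLESS_BASE_URL = "https://api.runpod.ai/v2"


def build_sync_request(api_key, endpoint_id, payload):
    """Prepare a synchronous inference request without sending it.

    Assumptions (not confirmed by this listing): a `runsync` operation
    under the endpoint ID, and a JSON body that wraps the caller's
    payload in an `input` key.
    """
    # Percent-encode the ID so it cannot alter the URL path.
    safe_id = urllib.parse.quote(endpoint_id, safe="")
    return urllib.request.Request(
        url=f"{SERVERLESS_BASE_URL}/{safe_id}/runsync",
        data=json.dumps({"input": payload}).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
```

A call such as `build_sync_request(key, "my-endpoint-id", {"prompt": "Hello"})` yields a request that can be inspected first and then sent with `urllib.request.urlopen`.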

Features

GPU Pods

Persistent on-demand GPU instances with SSH, JupyterLab, and VSCode access, billed per-second across a wide range of NVIDIA SKUs.
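Per-second billing means a run is charged for the seconds it uses rather than for whole hours. The sketch below works through that arithmetic with two of the hourly rates listed in this page's GPU data. It assumes the listed rates apply unchanged, ignores storage and any other charges, and rounds to the nearest cent; the rounding rule and the function name `estimate_cost` are assumptions, not documented RunPod behavior.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hourly list prices copied from this page's GPU data (USD per GPU-hour).
HOURLY_RATES_USD = {
    "NVIDIA H100 PCIe": Decimal("1.99"),
    "NVIDIA RTX 4090": Decimal("0.34"),
}

SECONDS_PER_HOUR = 3600


def estimate_cost(gpu_name: str, seconds: int, gpu_count: int = 1) -> Decimal:
    """Estimate compute cost for a run billed per second.

    Assumption: cost = hourly rate * seconds / 3600 * GPU count,
    rounded half-up to the cent.
    """
    if seconds < 0 or gpu_count < 1:
        raise ValueError("seconds must be >= 0 and gpu_count >= 1")
    total = HOURLY_RATES_USD[gpu_name] * seconds * gpu_count / SECONDS_PER_HOUR
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

Under these assumptions, a 90-minute job on one RTX 4090 comes to `estimate_cost("NVIDIA RTX 4090", 5400)`, which is $0.51, where hourly rounding would have charged for two full hours.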

Serverless Endpoints

Autoscaling, queue-based inference endpoints with FlashBoot cold-start optimization and per-second billing.

Network Volumes

Persistent, portable storage that can be attached to Pods and Serverless workers across datacenters.

Templates

Reusable Pod and endpoint configurations bundling container images, hardware specs, and network settings.

vLLM Quick Deploy

Pre-built Serverless workers for deploying open-source LLMs with vLLM in a single click.

Integrations

Docker

Bring-your-own container support for any Docker image on Pods and Serverless workers.

Hugging Face

Direct deployment of Hugging Face models via vLLM Quick Deploy and ready-made templates.

Pulumi

Infrastructure-as-code provisioning of RunPod resources via the official Pulumi provider.

Resources

GitHub Organization

Sources

aid: runpod
name: RunPod
description: RunPod is a managed GPU cloud and serverless inference platform offering on-demand and persistent
  GPU Pods, autoscaling Serverless endpoints, network volumes, container templates, and a REST + GraphQL control
  plane for provisioning H100, H200, B200, A100, L40S, and consumer RTX GPUs. RunPod targets AI/ML developers
  who need flexible, per-second-billed GPU compute for training, fine-tuning, and inference workloads.
image: https://kinlane-productions.s3.amazonaws.com/apis-json/apis-json-logo.jpg
url: https://raw.githubusercontent.com/api-evangelist/runpod/refs/heads/main/apis.yml
created: '2026-05-23'
modified: '2026-05-23'
specificationVersion: '0.19'
type: Index
access: 3rd-Party
position: Producer
tags:
- AI
- Cloud
- Compute
- GPU
- Inference
- Machine Learning
- Serverless
apis:
- aid: runpod:rest-api
  name: RunPod REST API
  description: The RunPod REST API programmatically manages Pods, Serverless endpoints, network volumes,
    templates, container registry auth, and billing. The API is the primary control plane for provisioning
    and operating GPU compute on RunPod.
  humanURL: https://docs.runpod.io/api-reference/overview
  baseURL: https://rest.runpod.io/v1
  tags:
  - Billing
  - Compute
  - GPU
  - Pods
  - REST
  - Serverless
  - Storage
  - Templates
  properties:
  - type: Documentation
    url: https://docs.runpod.io/api-reference/overview
  - type: OpenAPI
    url: https://rest.runpod.io/v1/openapi.json
- aid: runpod:graphql-api
  name: RunPod GraphQL API
  description: The RunPod GraphQL API provides programmatic access to Pods, templates, and Serverless endpoints
    via GraphQL queries and mutations. It is the original control-plane interface and is still supported
    alongside the REST API.
  humanURL: https://docs.runpod.io/sdks/graphql/configurations
  baseURL: https://api.runpod.io/graphql
  tags:
  - Compute
  - GPU
  - GraphQL
  - Pods
  - Serverless
  - Templates
  properties:
  - type: Documentation
    url: https://docs.runpod.io/sdks/graphql/configurations
- aid: runpod:serverless
  name: RunPod Serverless
  description: RunPod Serverless provides pay-as-you-go inference endpoints with autoscaling workers, queue-based
    and load-balanced endpoint types, FlashBoot cold-start optimization, and per-second billing. Each endpoint
    exposes a URL that accepts request payloads for AI model inference and compute-intensive workloads.
  humanURL: https://docs.runpod.io/serverless/overview
  baseURL: https://api.runpod.ai/v2
  tags:
  - AI
  - Autoscaling
  - GPU
  - Inference
  - Serverless
  - Workers
  properties:
  - type: Documentation
    url: https://docs.runpod.io/serverless/overview
common:
- type: Website
  url: https://runpod.io
- type: Developer
  url: https://docs.runpod.io
- type: Documentation
  url: https://docs.runpod.io
- type: Portal
  url: https://console.runpod.io
- type: SignUp
  url: https://www.runpod.io/console/signup
- type: Login
  url: https://www.runpod.io/console/signin
- type: Pricing
  url: https://www.runpod.io/pricing
- type: Blog
  url: https://blog.runpod.io
- type: StatusPage
  url: https://uptime.runpod.io
- type: TermsOfService
  url: https://www.runpod.io/legal/terms-of-service
- type: PrivacyPolicy
  url: https://www.runpod.io/legal/privacy-policy
- type: GitHubOrganization
  url: https://github.com/runpod
- type: Support
  url: https://www.runpod.io/contact
- type: ChangeLog
  url: https://docs.runpod.io/changelog
- name: RunPod Python SDK
  url: https://github.com/runpod/runpod-python
  type: SDK
- name: runpodctl CLI
  url: https://github.com/runpod/runpodctl
  type: CLI
- name: RunPod Pulumi Provider
  url: https://github.com/runpod/pulumi-runpod
  type: Pulumi
- type: Features
  data:
  - name: GPU Pods
    description: Persistent on-demand GPU instances with SSH, JupyterLab, and VSCode access, billed per-second
      across a wide range of NVIDIA SKUs.
  - name: Serverless Endpoints
    description: Autoscaling, queue-based inference endpoints with FlashBoot cold-start optimization and
      per-second billing.
  - name: Network Volumes
    description: Persistent, portable storage that can be attached to Pods and Serverless workers across
      datacenters.
  - name: Templates
    description: Reusable Pod and endpoint configurations bundling container images, hardware specs, and
      network settings.
  - name: vLLM Quick Deploy
    description: Pre-built Serverless workers for deploying open-source LLMs with vLLM in a single click.
- type: Integrations
  data:
  - name: Docker
    description: Bring-your-own container support for any Docker image on Pods and Serverless workers.
  - name: Hugging Face
    description: Direct deployment of Hugging Face models via vLLM Quick Deploy and ready-made templates.
  - name: Pulumi
    description: Infrastructure-as-code provisioning of RunPod resources via the official Pulumi provider.
- type: GPUs
  data:
  - name: NVIDIA B200
    description: Blackwell-generation flagship GPU, listed at $5.98/hr.
  - name: NVIDIA H200
    description: 141GB HBM3e GPU, listed at $3.59/hr.
  - name: NVIDIA H100 SXM
    description: 80GB Hopper GPU on SXM, listed at $2.69/hr.
  - name: NVIDIA H100 PCIe
    description: 80GB Hopper GPU on PCIe, listed at $1.99/hr.
  - name: NVIDIA A100 SXM
    description: 80GB Ampere GPU on SXM, listed at $1.39/hr.
  - name: NVIDIA A100 PCIe
    description: 80GB Ampere GPU on PCIe, listed at $1.19/hr.
  - name: NVIDIA L40S
    description: 48GB Ada Lovelace inference GPU, listed at $0.79/hr.
  - name: NVIDIA RTX 4090
    description: 24GB consumer GPU, listed at $0.34/hr.
maintainers:
- FN: Kin Lane
  email: [email protected]
  url: https://apievangelist.com