AI Gateway logo

AI Gateway

An API Evangelist landscape index of AI gateways — the LLM routers, prompt firewalls, model fallback proxies, cost-control planes, and policy engines that sit between applications and AI providers. AI gateways unify access across OpenAI, Anthropic, Google, AWS Bedrock, Azure OpenAI, and self-hosted models behind a common interface and apply caching, routing, guardrails, observability, rate limiting, budgets, RBAC, and audit controls. This index catalogs commercial SaaS gateways, open-source projects, API gateway AI plugins, and cloud-provider AI proxies, with a shared schema and vocabulary for describing model routes, fallbacks, guardrails, and budgets across vendors.

15 APIs 13 Features
AI GatewayLLM RouterLLM ProxyModel RoutingPrompt FirewallGuardrailsAI ObservabilityCost ControlsAI GovernanceAPI Gateway

APIs

Portkey

Portkey is a production-grade AI gateway and control plane that fronts 1,600+ LLMs with unified routing, fallbacks, semantic caching, guardrails, cost attribution, and prompt ma...

OpenRouter

OpenRouter is a unified inference marketplace exposing 400+ models from 60+ providers behind one OpenAI-compatible API, with automatic provider fallback, pay-as-you-go credits, ...

LiteLLM

LiteLLM (BerriAI) is an open-source LLM gateway that exposes 100+ LLM providers — OpenAI, Anthropic, Azure, Bedrock, Gemini — through a single OpenAI-compatible API. The LiteLLM...

Helicone

Helicone is an open-source AI observability and routing platform centered on requests, sessions, prompts, datasets, rate limits, and alerts. Integrates with OpenAI, Anthropic, G...

Cloudflare AI Gateway

Cloudflare AI Gateway is an edge-deployed proxy that fronts AI providers — Workers AI, Anthropic, Google Gemini, OpenAI, Replicate, and more — with caching, rate limiting, analy...

Kong AI Gateway

The Kong AI Gateway is delivered as the AI Proxy plugin for Kong Gateway, transforming and proxying requests across 16+ providers including OpenAI, Azure OpenAI, Anthropic, Amaz...

Apache APISIX AI Proxy

The Apache APISIX ai-proxy plugin streamlines integration with LLMs by converting plugin settings into the appropriate request format for OpenAI, DeepSeek, Azure OpenAI, Anthrop...

Tetrate Agent Router Service

Tetrate Agent Router Service is an Envoy AI Gateway-as-a-service from the creators of Envoy, providing an approved LLM catalog, unified model access, automatic fallback, cost ma...

NVIDIA NIM

NVIDIA NIM is a set of inference microservices for streamlined AI model deployment, prebuilt and optimized for low-latency, high-throughput inference on NVIDIA-accelerated infra...

Traefik AI Gateway

Traefik AI Gateway is an enterprise, self-hosted, Kubernetes-native AI gateway with safety and governance (NVIDIA Safety NIMs, jailbreak detection, content filtering across 22+ ...

Together AI

Together AI is a full-stack AI Native Cloud for inference, fine-tuning, and GPU clusters powered by research, exposing serverless inference, batch processing, dedicated model an...

Anyscale

Anyscale is the production-scale AI platform built on Ray by the creators of Ray, supporting LLM inference and other data-intensive AI workloads across distributed GPU clusters....

LangDB

LangDB is an enterprise AI gateway for routing and governing LLM traffic across providers, with observability, cost tracking, and policy enforcement. Public homepage was unreach...

Envoy AI Gateway

Envoy AI Gateway is an open-source extension to Envoy Proxy and Envoy Gateway, providing a Kubernetes-native AI traffic plane for routing, governing, and observing LLM calls acr...

Gentrace

Gentrace was an AI evaluation and observability product; the company has shut down and its codebase is now MIT-licensed open source on GitHub. Included here for historical compl...

Features

Provider Abstraction

A unified, typically OpenAI-compatible API surface that lets clients call any supported LLM provider without provider-specific SDK juggling.

Model Routing

Route requests to the right model and provider based on alias, header, request content, identity, time-of-day, cost, or latency.

Fallback and Failover

Automatically retry failed requests against backup providers or models when a primary upstream is degraded, rate-limited, or down.

Load Balancing and Fanout

Distribute traffic across multiple providers or replicas using weighted, priority-based, or RPM/TPM-aware load balancing.

Response Caching

Exact-match and semantic caching of model responses to cut latency and provider spend; some gateways claim 40-70 percent cost savings.

Cost Controls and Budgets

Per-user, per-team, per-key, per-project budgets, spend tracking, and hard or soft caps on token consumption.

Rate Limiting and Quotas

RPM, TPM, concurrency, and per-key quotas enforced at the gateway, decoupled from each upstream provider's limits.

Guardrails and Prompt Firewall

Prompt injection detection, jailbreak filtering, content moderation, PII redaction, and topic control applied to requests and responses.

Observability

Request, response, token, cost, latency, error, and trace data exported via OpenTelemetry, Langfuse, Phoenix, Langsmith, or built-in dashboards.

Authentication and RBAC

Virtual keys, JWT, OAuth2, SSO, and role-based access control over which clients can use which models with which budgets.

BYOK and Secret Management

Bring-your-own provider API keys, with the gateway holding and injecting them so clients never see upstream credentials.

Multi-Tenant Governance

Per-tenant isolation of keys, budgets, logs, and policies for platform teams serving multiple internal product teams.

MCP Federation

Some AI gateways also front Model Context Protocol servers, aggregating tools and exposing a single MCP endpoint to agents.

Use Cases

Provider-Agnostic LLM Access

Front many LLM providers behind one API so application teams can switch models without changing client code.

Cost Containment for AI

Apply caching, routing to cheaper models, and per-team budgets to keep generative-AI spend predictable.

Reliability and Failover

Survive single-provider outages by automatically failing over to backup models when the primary degrades.

Centralized AI Governance

Enforce content, PII, and policy controls in one place for every AI request leaving the organization.

Observability and FinOps

Attribute cost and latency to teams, projects, and users; expose token-level metrics to FinOps and SRE.

Multi-Tenant AI Platforms

Build internal AI platforms where each product team gets its own virtual keys, budgets, and logs.

Integrations

OpenAI

Front OpenAI's GPT, embeddings, and image models behind the gateway with virtual keys and budgets.

Anthropic

Route Claude requests through the gateway for fallback, caching, and central observability.

Google Gemini and Vertex AI

Proxy Google Gemini and Vertex AI calls with OpenAI-format translation where supported.

AWS Bedrock

Bridge OpenAI-format clients to Bedrock-hosted Anthropic, Mistral, Cohere, Meta, and Amazon models.

Azure OpenAI

Route to Azure-hosted OpenAI deployments with per-region failover and key rotation.

Ollama and vLLM

Front self-hosted Ollama and vLLM inference servers for hybrid cloud and on-prem inference.

OpenTelemetry

Export request, token, cost, and trace data to any OTel-compatible observability backend.

Langfuse and Phoenix

Stream prompts, completions, and evaluations to Langfuse and Arize Phoenix for prompt and model analytics.

Model Context Protocol

Some AI gateways federate MCP servers alongside LLM routes, exposing a unified agent endpoint.

Semantic Vocabularies

Ai Gateway Context

9 classes · 70 properties

JSON-LD

Resources

🔗
AI Gateway Route Schema
JSONSchema
🔗
AI Gateway Provider Schema
JSONSchema
🔗
AI Gateway Policy Schema
JSONSchema
🔗
AI Gateway Route Structure
JSONStructure
🔗
AI Gateway Provider Structure
JSONStructure
🔗
AI Gateway Policy Structure
JSONStructure
🔗
JSONLD
JSONLD
🔗
Vocabulary
Vocabulary
💻
Examples
Examples
🌐
Portal
Portal
📰
Blog
Blog

Sources

apis.yml Raw ↑
aid: ai-gateway
name: AI Gateway
description: An API Evangelist landscape index of AI gateways — the LLM routers, prompt firewalls, model fallback proxies, cost-control planes, and policy engines that sit between applications and AI
  providers. AI gateways unify access across OpenAI, Anthropic, Google, AWS Bedrock, Azure OpenAI, and self-hosted models behind a common interface and apply caching, routing, guardrails, observability,
  rate limiting, budgets, RBAC, and audit controls. This index catalogs commercial SaaS gateways, open-source projects, API gateway AI plugins, and cloud-provider AI proxies, with a shared schema and
  vocabulary for describing model routes, fallbacks, guardrails, and budgets across vendors.
url: https://raw.githubusercontent.com/api-evangelist/ai-gateway/refs/heads/main/apis.yml
humanURL: https://github.com/api-evangelist/ai-gateway
type: Index
position: Consuming
access: 3rd-Party
image: https://kinlane-productions2.s3.amazonaws.com/apis-json/apis-json-logo.jpg
tags:
- AI Gateway
- LLM Router
- LLM Proxy
- Model Routing
- Prompt Firewall
- Guardrails
- AI Observability
- Cost Controls
- AI Governance
- API Gateway
created: '2026-05-22'
modified: '2026-05-22'
specificationVersion: '0.19'
apis:
- aid: ai-gateway:portkey
  name: Portkey
  description: Portkey is a production-grade AI gateway and control plane that fronts 1,600+ LLMs with unified routing, fallbacks, semantic caching, guardrails, cost attribution, and prompt management.
    The open-source Portkey Gateway is MIT-licensed; a hosted SaaS adds governance, observability, and enterprise controls.
  humanURL: https://portkey.ai/
  baseURL: https://api.portkey.ai
  tags:
  - AI Gateway
  - LLM Router
  - Guardrails
  - Observability
  - Prompt Management
  - Open Source
  properties:
  - type: Portal
    url: https://portkey.ai/
  - type: Documentation
    url: https://portkey.ai/docs/
  - type: GitHubRepository
    url: https://github.com/Portkey-AI/gateway
  - type: GitHubOrganization
    url: https://github.com/Portkey-AI
  x-deployment:
  - cloud
  - self-host
  - opensource
  x-license: MIT
- aid: ai-gateway:openrouter
  name: OpenRouter
  description: OpenRouter is a unified inference marketplace exposing 400+ models from 60+ providers behind one OpenAI-compatible API, with automatic provider fallback, pay-as-you-go credits, custom data
    policies, and edge-routed latency optimization. It is a proprietary SaaS service.
  humanURL: https://openrouter.ai/
  baseURL: https://openrouter.ai/api/v1
  tags:
  - AI Gateway
  - LLM Marketplace
  - Multi-Provider
  - Fallback
  - Proprietary
  properties:
  - type: Portal
    url: https://openrouter.ai/
  - type: Documentation
    url: https://openrouter.ai/docs
  - type: Models
    url: https://openrouter.ai/models
  x-deployment:
  - cloud
  x-license: Proprietary
- aid: ai-gateway:litellm
  name: LiteLLM
  description: LiteLLM (BerriAI) is an open-source LLM gateway that exposes 100+ LLM providers — OpenAI, Anthropic, Azure, Bedrock, Gemini — through a single OpenAI-compatible API. The LiteLLM Proxy adds
    virtual keys, load balancing, RPM/TPM limits, spend tracking, and observability hooks for Langfuse, Phoenix, Langsmith, and OpenTelemetry. Self-hostable via Docker; enterprise support available.
  humanURL: https://www.litellm.ai/
  baseURL: https://api.litellm.ai
  tags:
  - AI Gateway
  - LLM Proxy
  - Open Source
  - Cost Tracking
  - Load Balancing
  properties:
  - type: Portal
    url: https://www.litellm.ai/
  - type: Documentation
    url: https://docs.litellm.ai/
  - type: GitHubRepository
    url: https://github.com/BerriAI/litellm
  - type: PyPI
    url: https://pypi.org/project/litellm/
  x-deployment:
  - self-host
  - opensource
  - cloud
  x-license: MIT
- aid: ai-gateway:helicone
  name: Helicone
  description: Helicone is an open-source AI observability and routing platform centered on requests, sessions, prompts, datasets, rate limits, and alerts. Integrates with OpenAI, Anthropic, Google
    Gemini, DeepSeek, Together AI, Mistral, Groq, Azure, OpenRouter, and LiteLLM. Available as managed cloud or self-hosted.
  humanURL: https://www.helicone.ai/
  baseURL: https://api.helicone.ai
  tags:
  - AI Gateway
  - Observability
  - Prompt Management
  - Open Source
  - Caching
  properties:
  - type: Portal
    url: https://www.helicone.ai/
  - type: Documentation
    url: https://docs.helicone.ai/
  - type: GitHubRepository
    url: https://github.com/Helicone/helicone
  x-deployment:
  - cloud
  - self-host
  - opensource
- aid: ai-gateway:cloudflare-ai-gateway
  name: Cloudflare AI Gateway
  description: Cloudflare AI Gateway is an edge-deployed proxy that fronts AI providers — Workers AI, Anthropic, Google Gemini, OpenAI, Replicate, and more — with caching, rate limiting, analytics, and
    request logging. Available on all Cloudflare plans.
  humanURL: https://developers.cloudflare.com/ai-gateway/
  baseURL: https://gateway.ai.cloudflare.com
  tags:
  - AI Gateway
  - Edge
  - Caching
  - Rate Limiting
  - Analytics
  properties:
  - type: Portal
    url: https://www.cloudflare.com/developer-platform/ai-gateway/
  - type: Documentation
    url: https://developers.cloudflare.com/ai-gateway/
  - type: GettingStarted
    url: https://developers.cloudflare.com/ai-gateway/get-started/
  x-deployment:
  - cloud
  x-license: Proprietary
- aid: ai-gateway:kong-ai-gateway
  name: Kong AI Gateway
  description: The Kong AI Gateway is delivered as the AI Proxy plugin for Kong Gateway, transforming and proxying requests across 16+ providers including OpenAI, Azure OpenAI, Anthropic, Amazon Bedrock,
    Gemini, Vertex AI, Cohere, Mistral, Hugging Face, Llama, xAI, Ollama, Alibaba DashScope, Cerebras, DeepSeek, Databricks, and vLLM. Supports chat, completions, embeddings, assistants, audio, image,
    video, batches, and files routes with template-based model selection.
  humanURL: https://konghq.com/products/kong-ai-gateway
  baseURL: https://konghq.com
  tags:
  - AI Gateway
  - API Gateway
  - Multi-Provider
  - Plugin
  - Kong
  properties:
  - type: Portal
    url: https://konghq.com/products/kong-ai-gateway
  - type: Documentation
    url: https://developer.konghq.com/plugins/ai-proxy/
  - type: GitHubOrganization
    url: https://github.com/Kong
  x-deployment:
  - cloud
  - self-host
  - opensource
- aid: ai-gateway:apisix-ai-proxy
  name: Apache APISIX AI Proxy
  description: The Apache APISIX ai-proxy plugin streamlines integration with LLMs by converting plugin settings into the appropriate request format for OpenAI, DeepSeek, Azure OpenAI, Anthropic, Google
    Gemini, Vertex AI, OpenRouter, AIMLAPI, and OpenAI-compatible services. Supports embedding models, observability of token usage and latency, custom endpoints, and flexible authentication. Apache
    2.0 licensed.
  humanURL: https://apisix.apache.org/
  baseURL: https://apisix.apache.org
  tags:
  - AI Gateway
  - API Gateway
  - Open Source
  - Apache
  - Plugin
  properties:
  - type: Portal
    url: https://apisix.apache.org/
  - type: Documentation
    url: https://apisix.apache.org/docs/apisix/plugins/ai-proxy/
  - type: GitHubRepository
    url: https://github.com/apache/apisix
  x-deployment:
  - self-host
  - opensource
  x-license: Apache-2.0
- aid: ai-gateway:tetrate-agent-router
  name: Tetrate Agent Router Service
  description: Tetrate Agent Router Service is an Envoy AI Gateway-as-a-service from the creators of Envoy, providing an approved LLM catalog, unified model access, automatic fallback, cost management,
    AI guardrails, and an MCP gateway for agent tool connectivity. Built on Envoy AI Gateway.
  humanURL: https://tetrate.io/products/tetrate-agent-router-service/
  baseURL: https://tetrate.io
  tags:
  - AI Gateway
  - Envoy
  - MCP Gateway
  - Guardrails
  - Multi-Provider
  properties:
  - type: Portal
    url: https://tetrate.io/products/tetrate-agent-router-service/
  - type: Documentation
    url: https://docs.tetrate.io/
  - type: GitHubOrganization
    url: https://github.com/envoyproxy
  - type: GitHubRepository
    url: https://github.com/envoyproxy/ai-gateway
  x-deployment:
  - cloud
  - self-host
  - opensource
- aid: ai-gateway:nvidia-nim
  name: NVIDIA NIM
  description: NVIDIA NIM is a set of inference microservices for streamlined AI model deployment, prebuilt and optimized for low-latency, high-throughput inference on NVIDIA-accelerated infrastructure.
    Includes TensorRT and TensorRT-LLM-backed engines and exposes stable OpenAI-compatible APIs for self-hosted and cloud deployment.
  humanURL: https://www.nvidia.com/en-us/ai/
  baseURL: https://build.nvidia.com
  tags:
  - AI Gateway
  - Inference
  - Self-Hosted
  - NVIDIA
  - GPU
  properties:
  - type: Portal
    url: https://build.nvidia.com/
  - type: Documentation
    url: https://docs.nvidia.com/nim/
  - type: GitHubOrganization
    url: https://github.com/NVIDIA
  x-deployment:
  - self-host
  - cloud
  x-license: Proprietary
- aid: ai-gateway:traefik-ai-gateway
  name: Traefik AI Gateway
  description: Traefik AI Gateway is an enterprise, self-hosted, Kubernetes-native AI gateway with safety and governance (NVIDIA Safety NIMs, jailbreak detection, content filtering across 22+
    categories), multi-LLM support via an OpenAI-compatible interface (Anthropic, Azure OpenAI, AWS Bedrock, Cohere, Gemini, Mistral, Ollama), intelligent routing, credential management, semantic
    caching with claimed 40-70 percent cost savings, PII protection via Presidio (35+ recognizers), and OpenTelemetry observability.
  humanURL: https://traefik.io/solutions/ai-gateway/
  baseURL: https://traefik.io
  tags:
  - AI Gateway
  - Kubernetes
  - Guardrails
  - Semantic Caching
  - PII Protection
  properties:
  - type: Portal
    url: https://traefik.io/solutions/ai-gateway/
  - type: Documentation
    url: https://doc.traefik.io/
  - type: GitHubOrganization
    url: https://github.com/traefik
  x-deployment:
  - self-host
  - cloud
- aid: ai-gateway:together-ai
  name: Together AI
  description: Together AI is a full-stack AI Native Cloud for inference, fine-tuning, and GPU clusters powered by research, exposing serverless inference, batch processing, dedicated model and container
    inference, GPU clusters, fine-tuning, managed storage, and code sandboxes for open-source models.
  humanURL: https://www.together.ai/
  baseURL: https://api.together.xyz
  tags:
  - Inference
  - Open Models
  - GPU
  - Multi-Provider
  - SaaS
  properties:
  - type: Portal
    url: https://www.together.ai/
  - type: Documentation
    url: https://docs.together.ai/
  - type: GitHubOrganization
    url: https://github.com/togethercomputer
  x-deployment:
  - cloud
  x-license: Proprietary
- aid: ai-gateway:anyscale
  name: Anyscale
  description: Anyscale is the production-scale AI platform built on Ray by the creators of Ray, supporting LLM inference and other data-intensive AI workloads across distributed GPU clusters. Integrates
    with vLLM and SkyRL; users bring their own models.
  humanURL: https://www.anyscale.com/
  baseURL: https://api.endpoints.anyscale.com
  tags:
  - Inference
  - Ray
  - GPU
  - Open Source
  - Self-Hosted
  properties:
  - type: Portal
    url: https://www.anyscale.com/
  - type: Documentation
    url: https://docs.anyscale.com/
  - type: GitHubOrganization
    url: https://github.com/anyscale
  - type: GitHubRepository
    url: https://github.com/ray-project/ray
  x-deployment:
  - cloud
  - self-host
- aid: ai-gateway:langdb
  name: LangDB
  description: LangDB is an enterprise AI gateway for routing and governing LLM traffic across providers, with observability, cost tracking, and policy enforcement. Public homepage was unreachable for
    direct verification during this profiling pass; see GitHub for current capabilities.
  humanURL: https://www.langdb.ai/
  baseURL: https://api.langdb.ai
  tags:
  - AI Gateway
  - LLM Router
  - Observability
  - Cost Tracking
  properties:
  - type: Portal
    url: https://www.langdb.ai/
  - type: GitHubOrganization
    url: https://github.com/langdb
  x-deployment:
  - cloud
  - self-host
- aid: ai-gateway:envoy-ai-gateway
  name: Envoy AI Gateway
  description: Envoy AI Gateway is an open-source extension to Envoy Proxy and Envoy Gateway, providing a Kubernetes-native AI traffic plane for routing, governing, and observing LLM calls across
    providers. Apache 2.0 licensed and CNCF-aligned.
  humanURL: https://aigateway.envoyproxy.io/
  baseURL: https://aigateway.envoyproxy.io
  tags:
  - AI Gateway
  - Envoy
  - Kubernetes
  - CNCF
  - Open Source
  properties:
  - type: Portal
    url: https://aigateway.envoyproxy.io/
  - type: Documentation
    url: https://aigateway.envoyproxy.io/docs/
  - type: GitHubRepository
    url: https://github.com/envoyproxy/ai-gateway
  - type: GitHubOrganization
    url: https://github.com/envoyproxy
  x-deployment:
  - self-host
  - opensource
  x-license: Apache-2.0
- aid: ai-gateway:gentrace
  name: Gentrace
  description: Gentrace was an AI evaluation and observability product; the company has shut down and its codebase is now MIT-licensed open source on GitHub. Included here for historical completeness in
    the AI gateway-adjacent observability category.
  humanURL: https://github.com/gentrace/gentrace
  tags:
  - AI Observability
  - Open Source
  - Archived
  - Evaluation
  properties:
  - type: GitHubRepository
    url: https://github.com/gentrace/gentrace
  x-deployment:
  - opensource
  x-license: MIT
  x-status: archived
common:
- type: JSONSchema
  url: https://raw.githubusercontent.com/api-evangelist/ai-gateway/refs/heads/main/json-schema/ai-gateway-route-schema.json
  title: AI Gateway Route Schema
- type: JSONSchema
  url: https://raw.githubusercontent.com/api-evangelist/ai-gateway/refs/heads/main/json-schema/ai-gateway-provider-schema.json
  title: AI Gateway Provider Schema
- type: JSONSchema
  url: https://raw.githubusercontent.com/api-evangelist/ai-gateway/refs/heads/main/json-schema/ai-gateway-policy-schema.json
  title: AI Gateway Policy Schema
- type: JSONStructure
  url: https://raw.githubusercontent.com/api-evangelist/ai-gateway/refs/heads/main/json-structure/ai-gateway-route-structure.json
  title: AI Gateway Route Structure
- type: JSONStructure
  url: https://raw.githubusercontent.com/api-evangelist/ai-gateway/refs/heads/main/json-structure/ai-gateway-provider-structure.json
  title: AI Gateway Provider Structure
- type: JSONStructure
  url: https://raw.githubusercontent.com/api-evangelist/ai-gateway/refs/heads/main/json-structure/ai-gateway-policy-structure.json
  title: AI Gateway Policy Structure
- type: JSONLD
  url: https://raw.githubusercontent.com/api-evangelist/ai-gateway/refs/heads/main/json-ld/ai-gateway-context.jsonld
- type: Vocabulary
  url: https://raw.githubusercontent.com/api-evangelist/ai-gateway/refs/heads/main/vocabulary/ai-gateway-vocabulary.yml
- type: Examples
  url: https://raw.githubusercontent.com/api-evangelist/ai-gateway/refs/heads/main/examples/
- type: Features
  data:
  - name: Provider Abstraction
    description: A unified, typically OpenAI-compatible API surface that lets clients call any supported LLM provider without provider-specific SDK juggling.
  - name: Model Routing
    description: Route requests to the right model and provider based on alias, header, request content, identity, time-of-day, cost, or latency.
  - name: Fallback and Failover
    description: Automatically retry failed requests against backup providers or models when a primary upstream is degraded, rate-limited, or down.
  - name: Load Balancing and Fanout
    description: Distribute traffic across multiple providers or replicas using weighted, priority-based, or RPM/TPM-aware load balancing.
  - name: Response Caching
    description: Exact-match and semantic caching of model responses to cut latency and provider spend; some gateways claim 40-70 percent cost savings.
  - name: Cost Controls and Budgets
    description: Per-user, per-team, per-key, per-project budgets, spend tracking, and hard or soft caps on token consumption.
  - name: Rate Limiting and Quotas
    description: RPM, TPM, concurrency, and per-key quotas enforced at the gateway, decoupled from each upstream provider's limits.
  - name: Guardrails and Prompt Firewall
    description: Prompt injection detection, jailbreak filtering, content moderation, PII redaction, and topic control applied to requests and responses.
  - name: Observability
    description: Request, response, token, cost, latency, error, and trace data exported via OpenTelemetry, Langfuse, Phoenix, Langsmith, or built-in dashboards.
  - name: Authentication and RBAC
    description: Virtual keys, JWT, OAuth2, SSO, and role-based access control over which clients can use which models with which budgets.
  - name: BYOK and Secret Management
    description: Bring-your-own provider API keys, with the gateway holding and injecting them so clients never see upstream credentials.
  - name: Multi-Tenant Governance
    description: Per-tenant isolation of keys, budgets, logs, and policies for platform teams serving multiple internal product teams.
  - name: MCP Federation
    description: Some AI gateways also front Model Context Protocol servers, aggregating tools and exposing a single MCP endpoint to agents.
- type: UseCases
  data:
  - name: Provider-Agnostic LLM Access
    description: Front many LLM providers behind one API so application teams can switch models without changing client code.
  - name: Cost Containment for AI
    description: Apply caching, routing to cheaper models, and per-team budgets to keep generative-AI spend predictable.
  - name: Reliability and Failover
    description: Survive single-provider outages by automatically failing over to backup models when the primary degrades.
  - name: Centralized AI Governance
    description: Enforce content, PII, and policy controls in one place for every AI request leaving the organization.
  - name: Observability and FinOps
    description: Attribute cost and latency to teams, projects, and users; expose token-level metrics to FinOps and SRE.
  - name: Multi-Tenant AI Platforms
    description: Build internal AI platforms where each product team gets its own virtual keys, budgets, and logs.
- type: Integrations
  data:
  - name: OpenAI
    description: Front OpenAI's GPT, embeddings, and image models behind the gateway with virtual keys and budgets.
  - name: Anthropic
    description: Route Claude requests through the gateway for fallback, caching, and central observability.
  - name: Google Gemini and Vertex AI
    description: Proxy Google Gemini and Vertex AI calls with OpenAI-format translation where supported.
  - name: AWS Bedrock
    description: Bridge OpenAI-format clients to Bedrock-hosted Anthropic, Mistral, Cohere, Meta, and Amazon models.
  - name: Azure OpenAI
    description: Route to Azure-hosted OpenAI deployments with per-region failover and key rotation.
  - name: Ollama and vLLM
    description: Front self-hosted Ollama and vLLM inference servers for hybrid cloud and on-prem inference.
  - name: OpenTelemetry
    description: Export request, token, cost, and trace data to any OTel-compatible observability backend.
  - name: Langfuse and Phoenix
    description: Stream prompts, completions, and evaluations to Langfuse and Arize Phoenix for prompt and model analytics.
  - name: Model Context Protocol
    description: Some AI gateways federate MCP servers alongside LLM routes, exposing a unified agent endpoint.
- type: Portal
  url: https://github.com/api-evangelist/ai-gateway
- type: Blog
  url: https://apievangelist.com/category/ai-gateway/
maintainers:
- FN: Kin Lane
  email: [email protected]