LiteLLM
LiteLLM is an open-source Python SDK and proxy server providing a unified OpenAI-compatible interface to 100+ LLM providers.
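The unifying convention is the model string: calls are addressed as "provider/model" (e.g. "anthropic/claude-3-opus"), and LiteLLM routes each call to the matching provider. The helper below is an illustrative sketch of that naming scheme only, not LiteLLM's actual internals:

```python
# Illustrative sketch of LiteLLM's "provider/model" naming convention.
# This helper is hypothetical, not part of the LiteLLM SDK.
def split_model(model: str) -> tuple[str, str]:
    """Split a LiteLLM-style model string into (provider, model_name)."""
    provider, sep, name = model.partition("/")
    if not sep:
        # No prefix: LiteLLM infers the provider from the model name
        # (bare names like "gpt-4o" default to OpenAI).
        return "openai", provider
    return provider, name

print(split_model("anthropic/claude-3-opus"))  # ('anthropic', 'claude-3-opus')
print(split_model("gpt-4o"))                   # ('openai', 'gpt-4o')
```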
APIs
LiteLLM Chat Completions API
Provides an OpenAI-compatible /chat/completions endpoint that routes requests to 100+ LLM providers with unified request and response formatting, streaming support, and cost tracking.
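A minimal sketch of the request body this endpoint accepts, in the standard OpenAI format. The model name and proxy URL are assumptions; any model configured on the proxy works the same way:

```python
import json

BASE_URL = "http://localhost:4000"  # assumed proxy address (default port)

def chat_body(model: str, user_message: str, stream: bool = False) -> dict:
    """Build an OpenAI-format chat completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "stream": stream,
    }

body = chat_body("gpt-4o", "Say hello.")
print(json.dumps(body, indent=2))
# POST this JSON to f"{BASE_URL}/chat/completions" with an
# "Authorization: Bearer <key>" header.
```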
LiteLLM Completions API
Provides an OpenAI-compatible /completions endpoint for text completion requests routed through the LiteLLM proxy to supported LLM providers.
LiteLLM Responses API
Provides an OpenAI-compatible /responses endpoint supporting the Responses API specification, including conversation history compression via /responses/compact.
LiteLLM Embeddings API
Provides an OpenAI-compatible /embeddings endpoint for generating text embeddings across multiple providers, including OpenAI, Cohere, HuggingFace, and Bedrock, with unified request and response formatting.
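A sketch of an /embeddings request body in the OpenAI format; the model name is an assumption, and LiteLLM maps the same shape onto the other embedding providers:

```python
import json

def embedding_body(model: str, texts: list[str]) -> dict:
    """Build an OpenAI-format embeddings request body."""
    return {"model": model, "input": texts}

body = embedding_body("text-embedding-3-small", ["hello world", "goodbye"])
print(json.dumps(body))
```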
LiteLLM Image Generation API
Provides OpenAI-compatible /images/generations, /images/edits, and /images/variations endpoints for image generation and manipulation routed through the LiteLLM proxy.
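A sketch of an /images/generations request body in the OpenAI format; the model, size, and count below are illustrative assumptions:

```python
import json

def image_gen_body(model: str, prompt: str,
                   n: int = 1, size: str = "1024x1024") -> dict:
    """Build an OpenAI-format image generation request body."""
    return {"model": model, "prompt": prompt, "n": n, "size": size}

body = image_gen_body("dall-e-3", "a watercolor fox")
print(json.dumps(body))
```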
LiteLLM Audio API
Provides OpenAI-compatible /audio/transcriptions and /audio/speech endpoints for audio transcription and text-to-speech conversion across supported providers.
LiteLLM Moderations API
Provides an OpenAI-compatible /moderations endpoint for content moderation across supported providers through the LiteLLM proxy.
LiteLLM Batches API
Provides an OpenAI-compatible /batches endpoint for batch processing operations, enabling bulk request handling across LLM providers.
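Batch input follows the OpenAI Batch API shape: an uploaded JSONL file where each line is one request. A sketch of building that file (the custom_id values are caller-chosen and the model name is an assumption):

```python
import json

def batch_line(custom_id: str, model: str, user_message: str) -> str:
    """Serialize one batch request as a JSONL line (OpenAI Batch format)."""
    return json.dumps({
        "custom_id": custom_id,
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": user_message}],
        },
    })

jsonl = "\n".join(
    batch_line(f"req-{i}", "gpt-4o-mini", q)
    for i, q in enumerate(["What is 2+2?", "Name a prime."])
)
print(jsonl)
```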
LiteLLM Files API
Provides an OpenAI-compatible /files endpoint for file management operations used in conjunction with fine-tuning and batch processing.
LiteLLM Fine-Tuning API
Provides an OpenAI-compatible /fine_tuning endpoint for model fine-tuning operations across supported providers through the LiteLLM proxy.
LiteLLM Rerank API
Provides a /rerank endpoint for document reranking operations, supporting providers like Cohere through the LiteLLM proxy with a unified interface.
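A sketch of a /rerank request body in the Cohere-style format the endpoint exposes; the model name is an assumption:

```python
import json

def rerank_body(model: str, query: str,
                documents: list[str], top_n: int) -> dict:
    """Build a Cohere-style rerank request body."""
    return {"model": model, "query": query,
            "documents": documents, "top_n": top_n}

body = rerank_body(
    "cohere/rerank-english-v3.0",
    "What is the capital of France?",
    ["Paris is the capital of France.", "Berlin is in Germany."],
    top_n=1,
)
print(json.dumps(body))
```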
LiteLLM Vector Stores API
Provides /vector_stores endpoints for creating and managing vector stores, file operations within vector stores, and search functionality for retrieval-augmented generation (RAG).
LiteLLM Anthropic Messages API
Provides Anthropic-compatible /v1/messages and /v1/messages/count_tokens endpoints for native Anthropic API format support through the LiteLLM proxy.
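A sketch of a native Anthropic-format /v1/messages request body, which the proxy accepts directly; the model name and token limit are assumptions:

```python
import json

def anthropic_body(model: str, user_message: str,
                   max_tokens: int = 256) -> dict:
    """Build an Anthropic Messages API request body."""
    return {
        "model": model,
        "max_tokens": max_tokens,  # required field in the Anthropic format
        "messages": [{"role": "user", "content": user_message}],
    }

body = anthropic_body("claude-3-5-sonnet-20241022", "Hello, Claude.")
print(json.dumps(body))
```

Note the Anthropic format makes max_tokens mandatory, unlike the OpenAI chat format where it is optional.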
LiteLLM Realtime API
Provides /realtime WebSocket endpoints for real-time model interactions with load balancing and guardrails support across providers.
LiteLLM MCP API
Provides /mcp endpoints for Model Context Protocol (MCP) integration, enabling LLMs to interact with external tools and APIs through OpenAPI specifications.
LiteLLM OCR API
Provides an /ocr endpoint for optical character recognition, enabling text extraction from images through supported providers via the LiteLLM proxy.
LiteLLM Guardrails API
Provides a /guardrails/apply_guardrail endpoint for applying configured content-filtering and safety guardrails to LLM requests and responses.
LiteLLM Evals API
Provides /evals endpoints for the Evaluations API, enabling measurement and benchmarking of model performance through the LiteLLM proxy.
LiteLLM A2A Agent Gateway API
Provides /a2a endpoints for the Agent-to-Agent (A2A) gateway, enabling agent registration, publishing, and inter-agent communication.
LiteLLM Videos API
Provides /videos endpoints for video generation and handling through supported providers like RunwayML via the LiteLLM proxy.