KAITO
KAITO (Kubernetes AI Toolchain Operator) is an open-source operator suite that automates LLM model inference, fine-tuning, and Retrieval Augmented Generation (RAG) engine deployment in Kubernetes clusters. It simplifies the process of deploying large AI models through optimized preset configurations and integrates with Karpenter for GPU node auto-provisioning.
1 APIs
0 Features
AIGPUInferenceKubernetesLLMMachine LearningOpen SourceOperatorRAG
APIs
KAITO RAGEngine API
RAGEngine exposes endpoints for managing retrieval-augmented generation services with embedded vector databases, including document indexing, retrieval, and chat completion endp...
Resources
🔗
Website
Website
🔗
Documentation
Documentation
🔗
Installation
Installation
🚀
Getting Started
Getting Started
👥
GitHub Organization
GitHub Organization
💻
Source Code
Source Code