Components Overview¶
The GenAI on EKS Starter Kit provides a curated collection of production-ready components for building and deploying GenAI applications on Kubernetes. All components are installed via the CLI: ./cli <category> <component> install.
Component Categories¶
| Category | Description | Components |
|---|---|---|
| NVIDIA Platform | NVIDIA Dynamo LLM serving platform with GPU acceleration, monitoring, and benchmarking | GPU Operator, Monitoring, Dynamo Platform, Dynamo vLLM, AIPerf Benchmark, AIConfigurator |
| AI Gateway | API gateways and proxies for LLM routing, load balancing, and rate limiting | LiteLLM, Kong |
| LLM Model | Large Language Model serving engines | vLLM, SGLang, TGI, Ollama |
| Embedding Model | Text embedding generation services | TEI (Text Embeddings Inference) |
| Guardrail | Content safety and policy enforcement | Guardrails AI |
| Observability | Monitoring, tracing, and analytics for GenAI applications | Langfuse, MLflow, Phoenix |
| GUI Application | User interfaces for interacting with LLMs | Open WebUI |
| Vector Database | High-performance vector storage for RAG applications | Qdrant, Chroma, Milvus |
| Workflow Automation | Visual workflow builders and automation platforms | n8n |
| AI Agent | Agent orchestration and task routing systems | OpenClaw |
Quick Start¶
# Install a component
./cli <category> <component> install
# Example: Install vLLM
./cli llm-model vllm install
# Uninstall a component
./cli <category> <component> uninstall
Component Index¶
NVIDIA Platform¶
- Overview - NVIDIA Dynamo Platform architecture and installation guide
- GPU Operator - NVIDIA GPU resource management
- Monitoring - Prometheus + Grafana with Dynamo dashboards
- Dynamo Platform - CRDs, Operator, etcd, NATS
- Dynamo vLLM - Aggregated/disaggregated vLLM serving
- AIPerf Benchmark - Comprehensive LLM benchmarking suite
- AIConfigurator - Auto TP/PP recommendation and SLA-driven deployment
AI Gateway¶
- LiteLLM - Universal LLM gateway with unified API
- Kong - API gateway with rate limiting and authentication
LLM Model¶
- vLLM - High-throughput LLM serving with PagedAttention
- SGLang - Efficient LLM serving with RadixAttention
- TGI - HuggingFace Text Generation Inference
- Ollama - Local LLM serving made simple
Embedding Model¶
- TEI - Text Embeddings Inference for vector generation
Guardrail¶
- Guardrails AI - LLM output validation and content safety
Observability¶
- Langfuse - LLM observability and analytics
- MLflow - ML experiment tracking and model registry
- Phoenix - LLM observability and tracing
GUI Application¶
- Open WebUI - ChatGPT-style web interface for LLMs
Vector Database¶
- Qdrant - High-performance vector search engine
- Chroma - Open-source embedding database
- Milvus - Cloud-native vector database
Workflow Automation¶
- n8n - Visual workflow automation platform
AI Agent¶
- OpenClaw - AI agent orchestration bridge server
Configuration¶
Components are configured via:
- config.json - Component-specific settings
- .env - Environment variables and credentials
- config.local.json - Local overrides (gitignored)
- .env.local - Local environment overrides (gitignored)
Configuration hierarchy (later sources override earlier ones):
Architecture Patterns¶
Typical GenAI Stack¶
┌─────────────────────────────────────────────────────────────┐
│ Open WebUI (Frontend) │
└──────────────────────┬──────────────────────────────────────┘
│
┌──────────────────────▼──────────────────────────────────────┐
│ LiteLLM (AI Gateway + Router) │
└──┬────────────────┬────────────────┬────────────────────┬───┘
│ │ │ │
▼ ▼ ▼ ▼
┌─────────┐ ┌─────────┐ ┌─────────────┐ ┌──────────────┐
│ vLLM │ │ SGLang │ │ AWS Bedrock │ │ Guardrails AI│
└─────────┘ └─────────┘ └─────────────┘ └──────────────┘
│ │
▼ ▼
┌──────────────────────────────┐
│ Langfuse (Observability) │
└──────────────────────────────┘
RAG Application Stack¶
┌─────────────────────────────────────────────────────────────┐
│ Application Layer │
└──────────────────────┬──────────────────────────────────────┘
│
┌──────────────────────▼──────────────────────────────────────┐
│ LiteLLM Gateway │
└──┬────────────────────────────────────────────────────┬─────┘
│ │
▼ ▼
┌─────────────────┐ ┌──────────────────┐
│ LLM (vLLM/SGLang)│ │ TEI (Embedding) │
└─────────────────┘ └────────┬─────────┘
│
▼
┌──────────────────┐
│ Vector DB (Qdrant)│
└──────────────────┘