Each example is a self-contained deployable pattern with its own README explaining the "why" alongside the "how." Deploy the base cluster once, then apply individual examples to explore.
Compute Patterns
| Example | Description |
|---|---|
| Graviton | ARM64 workloads on cost-effective Graviton instances |
| Spot | Fault-tolerant workloads on EC2 Spot with diverse instance families |
| GPU | GPU-accelerated ML inference (Qwen 3 on NVIDIA GPUs) |
| Neuron | ML inference on AWS Inferentia2 (DeepSeek-R1 served by vLLM) |
Cost Optimization
| Example | Description |
|---|---|
| Cost Optimization | Mixed On-Demand/Spot pools with weighted priorities and pause-pod overprovisioning |
Advanced Scheduling
Autoscaling
| Example | Description |
|---|---|
| Pod Autoscaling | HPA for CPU-based scaling + KEDA for event-driven scaling |
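
For orientation, the CPU-based half of this pattern relies on the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler: the desired count is the current count scaled by the ratio of observed to target utilization, rounded up, with no change when the ratio is within a tolerance of 1.0. The sketch below is a minimal illustration of that arithmetic, not code from the example itself; the function name and parameters are hypothetical, the 0.1 tolerance is assumed to match the upstream default, and min/max replica clamping and stabilization windows are left out.

```python
import math


def desired_replicas(current_replicas, current_cpu_pct, target_cpu_pct, tolerance=0.1):
    """Illustrative HPA-style calculation (hypothetical helper, not part of the example).

    desired = ceil(current_replicas * current_metric / target_metric),
    skipped when the ratio is within `tolerance` of 1.0.
    """
    ratio = current_cpu_pct / target_cpu_pct
    # Within tolerance: the autoscaler leaves the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


if __name__ == "__main__":
    # 4 replicas averaging 90% CPU against a 60% target
    print(desired_replicas(4, 90, 60))
```

KEDA's event-driven scaling works from external metrics such as queue depth rather than CPU, so this calculation only describes the HPA side of the example.
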
Observability
| Example | Description |
|---|---|
| Observability | CloudWatch Container Insights integration |