Examples

Each example is a self-contained deployable pattern with its own README explaining the "why" alongside the "how." Deploy the base cluster once, then apply individual examples to explore.

Compute Patterns

| Example | Description |
| --- | --- |
| Graviton | ARM64 workloads on cost-effective Graviton instances |
| Spot | Fault-tolerant workloads on EC2 Spot with diverse instance families |
| GPU | GPU-accelerated ML inference (Qwen 3 on NVIDIA GPUs) |
| Neuron | ML inference on AWS Inferentia2 (DeepSeek-R1 served by vLLM) |

Cost Optimization

| Example | Description |
| --- | --- |
| Cost Optimization | Mixed On-Demand/Spot pools with weighted priorities and pause-pod overprovisioning |

Advanced Scheduling

| Example | Description |
| --- | --- |
| Capacity Reservation | Pin workloads to On-Demand Capacity Reservations (ODCRs) |
| Static Capacity | Fixed fleet of always-on nodes using spec.replicas |
| Batch Jobs | Protect long-running jobs from eviction with do-not-disrupt |
| Disruption Budgets | Limit simultaneous node drains during consolidation |

Autoscaling

| Example | Description |
| --- | --- |
| Pod Autoscaling | HPA for CPU-based scaling + KEDA for event-driven scaling |

Observability

| Example | Description |
| --- | --- |
| Observability | CloudWatch Container Insights integration |