Spot Workloads on EKS Auto Mode

Overview
Architecture
Implementation Steps
Cleanup
Troubleshooting

Prerequisites

Cluster deployed and kubectl configured per Quick Start.

Overview

Amazon EC2 Spot Instances let you take advantage of unused EC2 capacity at steep discounts. Key benefits include:

💰 Cost Optimization

Up to 90% cost savings compared to On-Demand instances
Ideal for fault-tolerant, flexible workloads
Pay only for what you use

⚡ Scalability

Access to large-scale compute capacity
Perfect for batch processing and stateless applications
Automatic capacity rebalancing

🔄 Flexibility

Mix of instance types and sizes
Automatic instance selection based on availability
Graceful interruption handling

Architecture

This example demonstrates how to run workloads on Spot instances in EKS Auto Mode using Karpenter's spot instance management capabilities.

Key Components:

📄 NodePool Template

Defines Spot instance requirements
Defined in ../../nodepools/spot-nodepool.yaml
Supports c, m, and r instance families
ARM64 architecture for cost efficiency

🔄 Load Balancer

Application Load Balancer (ALB)
Exposes the application to external traffic

🎮 Sample Application

2048 game (sliding tile puzzle)
Stateless application ideal for spot instances
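
The NodePool requirements listed above could translate into a manifest along the following lines. This is only a sketch: the helper name `render_spot_nodepool` is hypothetical, and the field values (the `default` NodeClass, the `eks.amazonaws.com/instance-category` requirement key) are assumptions inferred from the component list, not a copy of the repository's spot-nodepool.yaml.

```shell
# Hypothetical helper: print an illustrative Spot NodePool manifest to stdout.
# Values are assumptions based on the component list, not the repository's file.
render_spot_nodepool() {
  cat <<'EOF'
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: spot-nodepool
spec:
  template:
    spec:
      nodeClassRef:            # assumed: the built-in Auto Mode NodeClass
        group: eks.amazonaws.com
        kind: NodeClass
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]
        - key: eks.amazonaws.com/instance-category
          operator: In
          values: ["c", "m", "r"]
        - key: kubernetes.io/arch
          operator: In
          values: ["arm64"]
      taints:
        - key: spot
          value: "true"
          effect: NoSchedule
EOF
}
```

To compare it against a live cluster without creating anything, one option is `render_spot_nodepool | kubectl apply --dry-run=server -f -`.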

Implementation Steps

1. Deploy Spot NodePool

Deploy the NodePool that will manage our Spot instances:

kubectl apply -f ../../nodepools/spot-nodepool.yaml

⚠️ The Spot NodePool applies the following taint so that only spot-aware workloads land on its nodes:

taints:
  - key: "spot"
    value: "true"
    effect: "NoSchedule"   # Prevents non-spot-aware pods from scheduling

Any pod that should run on Spot nodes must include a matching toleration in its specification. This keeps workloads that are not designed to handle Spot interruptions off these nodes.
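
To confirm the taint on nodes that have actually been launched, you can list Spot nodes together with their taint keys. This is a minimal sketch: the helper name `spot_node_taints` is hypothetical, and it assumes the nodes carry the `karpenter.sh/capacity-type=spot` label. Expect an empty list until a pod that tolerates the taint is waiting to be scheduled.

```shell
# Hypothetical helper: list Spot nodes together with their taint keys.
# Assumes nodes are labeled karpenter.sh/capacity-type=spot.
spot_node_taints() {
  kubectl get nodes \
    -l karpenter.sh/capacity-type=spot \
    -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key'
}
```

Run it as `spot_node_taints`; each Spot node should show `spot` among its taint keys.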

2. Deploy the 2048 Game

Deploy our spot-compatible 2048 game application:

kubectl apply -f game-2048.yaml

✅ The 2048 game deployment includes the required configuration for Spot instances:

tolerations:
  - key: "spot"     # Matches the Spot node taint
    value: "true"
    effect: "NoSchedule"   # Allows scheduling on tainted nodes

nodeSelector:
  karpenter.sh/capacity-type: spot   # Ensures pods run on spot instances

The toleration allows the pods onto the tainted Spot nodes, and the nodeSelector restricts them to Spot capacity. Both are needed: a toleration alone permits scheduling on Spot nodes but does not require it.
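
One way to check that the pods really ended up on Spot capacity is to look up the capacity-type label of each pod's node. The sketch below assumes the `game-2048-spot` namespace used in the access step; the helper name `pod_capacity_types` is hypothetical.

```shell
# Hypothetical helper: print "<pod> <node> <capacity-type>" for each pod in a namespace.
# Argument: namespace (defaults to game-2048-spot, an assumption taken from the access step).
pod_capacity_types() {
  ns="${1:-game-2048-spot}"
  kubectl get pods -n "$ns" \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.spec.nodeName}{"\n"}{end}' |
  while read -r pod node; do
    # Pods that are still Pending have no node assigned yet.
    [ -n "$node" ] || { echo "$pod <unscheduled>"; continue; }
    ctype=$(kubectl get node "$node" \
      -o jsonpath='{.metadata.labels.karpenter\.sh/capacity-type}')
    echo "$pod $node ${ctype:-unknown}"
  done
}
```

Call it as `pod_capacity_types`; every scheduled pod should report `spot` in the last column.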

3. Configure Load Balancer

Set up the Application Load Balancer using Ingress:

kubectl apply -f 2048-ingress.yaml
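
Provisioning the ALB usually takes a few minutes, so the Ingress has no hostname right after the apply. A small polling loop can wait for it. This is a sketch under assumptions: the helper name `wait_for_ingress_hostname` is hypothetical, and the default namespace and Ingress name are taken from the commands in the access step.

```shell
# Hypothetical helper: poll until the Ingress reports a load balancer hostname.
# Arguments (all optional): namespace, ingress name, max attempts, seconds between attempts.
wait_for_ingress_hostname() {
  ns="${1:-game-2048-spot}"
  name="${2:-ingress-2048}"
  attempts="${3:-30}"
  delay="${4:-10}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    host=$(kubectl get ingress "$name" -n "$ns" \
      -o jsonpath='{.status.loadBalancer.ingress[0].hostname}' 2>/dev/null)
    if [ -n "$host" ]; then
      echo "$host"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "Ingress $ns/$name has no hostname after $attempts attempts" >&2
  return 1
}
```

With the defaults, `wait_for_ingress_hostname` waits up to about five minutes and prints the hostname once it is available.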

4. Access the Application

By default, this example exposes its UI through an internal ALB, which is reachable only from inside the VPC. To access it from your laptop, use kubectl port-forward:

kubectl port-forward -n game-2048-spot svc/service-2048 8080:80
# then open http://localhost:8080

If you want to inspect the ALB DNS name directly (e.g. from a bastion or VPN):

kubectl get ingress ingress-2048 \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}' \
  -n game-2048-spot

To expose the UI publicly over HTTPS, deploy the Terraform stack with var.base_domain set to a public Route53 zone you own (see top-level README). The example will be reachable at https://2048-spot.<full_domain> once external-dns publishes the record. 🎮

Cleanup

🧹 Follow these steps to clean up all resources:

Remove the Ingress, the application, and the NodePool, in that order:

kubectl delete -f 2048-ingress.yaml
kubectl delete -f game-2048.yaml
kubectl delete -f ../../nodepools/spot-nodepool.yaml
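
The deletion order above is likely deliberate: removing the Ingress first gives the load balancer integration a chance to delete the ALB, and removing the NodePool last lets its nodes be reclaimed once the pods are gone. The same sequence can be wrapped so that it is safe to re-run. The helper name `cleanup_spot_example` is hypothetical; `--ignore-not-found` makes repeated runs succeed when a resource is already gone.

```shell
# Hypothetical helper: delete the example's manifests in order, stopping on the first failure.
# Run it from the example directory so the relative paths resolve.
cleanup_spot_example() {
  for manifest in \
    2048-ingress.yaml \
    game-2048.yaml \
    ../../nodepools/spot-nodepool.yaml
  do
    kubectl delete -f "$manifest" --ignore-not-found || return 1
  done
}
```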

Troubleshooting

🔧 Common issues and their solutions:

🎯 Spot Instance Issues

Capacity Unavailability

List Spot requests that were rejected for lack of capacity, using the AWS CLI:

aws ec2 describe-spot-instance-requests \
  --filters "Name=status-code,Values=capacity-not-available"

Check NodePool events:
```
kubectl describe nodepool spot-nodepool
```

Instance Interruptions

Monitor interruption events:

kubectl get events --field-selector reason=SpotInterruption

Review pod eviction status:
```
kubectl get pods -n game-2048-spot -o wide
```

🔄 Load Balancer Issues

ALB Configuration

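# Check ALB controller logs. This applies only if you run the AWS Load Balancer
# Controller in-cluster; EKS Auto Mode's built-in load balancing runs outside the
# cluster, so rely on the Ingress events in that case.
kubectl logs -n kube-system \
  deployment/aws-load-balancer-controller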

Ingress Status

# Check ingress status
kubectl describe ingress ingress-2048 -n game-2048-spot

💡 Tip: Use kubectl get events to monitor spot instance lifecycle events and pod rescheduling.
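
As a sketch of the tip above, events can be narrowed to one kind of object and sorted by time, which makes node replacement and pod rescheduling easier to follow. The helper name `recent_events` is hypothetical.

```shell
# Hypothetical helper: show events for one kind of object across all namespaces, oldest first.
# Argument: object kind (defaults to Node).
recent_events() {
  kind="${1:-Node}"
  kubectl get events --all-namespaces \
    --field-selector "involvedObject.kind=$kind" \
    --sort-by=.lastTimestamp
}
```

For example, `recent_events Node` shows node lifecycle events and `recent_events Pod` shows scheduling and eviction events.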
