This page is generated from skills/eks-best-practices/references/cost-optimization.md. Edit the source, not this page.

EKS Cost Optimization

Part of: eks-best-practices
Purpose: Cost optimization framework, compute/networking/storage cost strategies, observability cost management, tagging, and cost visibility tools for Amazon EKS


Table of Contents

  1. Cost Optimization Framework
  2. Compute Cost Optimization
  3. Networking Cost Optimization
  4. Storage Cost Optimization
  5. Observability Cost Optimization
  6. Tagging & Cost Visibility

Cost Optimization Framework

AWS Cloud Financial Management (CFM) organizes cost optimization into four pillars:

| Pillar | Focus | EKS Actions |
| --- | --- | --- |
| See | Measurement & accountability | Tag resources, deploy Kubecost, enable Cost Explorer |
| Save | Eliminate waste, optimize purchasing | Right-size, Spot/Graviton, consolidation |
| Plan | Forecast & budget | Track unit economics (cost per request/transaction) |
| Run | Continuous improvement | FinOps flywheel -- iterate on See/Save/Plan |

The "See" pillar comes first because you can't optimize what you can't measure. Start with tagging and cost visibility before pursuing compute or networking savings.

EKS Cost Components

| Component | Cost Driver | Optimization Lever |
| --- | --- | --- |
| EKS control plane | $0.10/hour per cluster | Fewer clusters, multi-tenant |
| EC2 instances | Instance type + hours | Right-sizing, Spot, Graviton |
| EBS volumes | Volume type + size + IOPS | gp3, right-size, cleanup unused |
| Data transfer | Cross-AZ, internet egress | Topology-aware routing, VPC endpoints |
| Load balancers | Per ALB/NLB + LCU/hour | Consolidate ingress, shared ALB |
| NAT Gateway | Per GB processed + hourly | VPC endpoints for AWS services |
| Observability | Log ingestion + metric storage | Filter, retain selectively, reduce cardinality |

Quick Wins

| Action | Typical Savings | Effort |
| --- | --- | --- |
| Switch to Graviton (arm64) | 20-40% | Low -- rebuild images for arm64 |
| Use Spot for non-critical | 60-90% | Low -- Karpenter handles fallback |
| Enable Karpenter consolidation | 20-30% | Low -- enable in NodePool |
| Right-size with VPA recommendations | 15-30% | Medium -- review and apply |
| Use gp3 instead of gp2 | 20% on EBS | Low -- update StorageClass |
| VPC endpoints for ECR/S3 | Avoids NAT data processing charges for that traffic | Low -- one-time setup |
| Topology-aware routing | 50-80% on cross-AZ | Medium -- enable topology hints |
| Reduce log verbosity in prod | 30-50% on logging | Low -- adjust log levels |

Compute Cost Optimization

Compute is typically the largest cost driver for EKS. Optimize in this order:

  1. Right-size workloads -- match requests to actual usage
  2. Reduce unused capacity -- autoscale and consolidate
  3. Optimize capacity types -- Spot, Graviton, Savings Plans

Right-Sizing Workloads

Requests should align with actual utilization. Overprovisioned requests waste capacity -- the largest factor in total cluster costs. Each container (including sidecars) should have its own requests and limits.
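
As a rough illustration of per-container sizing, the sketch below gives the application container and its sidecar separate requests and limits. The Deployment name, images, and all resource values are placeholders, not measured recommendations; real values should come from observed usage.

```yaml
# Hypothetical Deployment: names, images, and numbers are illustrative only.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: app
          image: example.com/my-app:1.0.0 # placeholder image
          resources:
            requests:
              cpu: 250m
              memory: 256Mi
            limits:
              memory: 256Mi # memory limit equal to the request; CPU limit left unset
        - name: log-forwarder # sidecar is sized independently of the app
          image: example.com/log-forwarder:1.0.0 # placeholder image
          resources:
            requests:
              cpu: 50m
              memory: 64Mi
            limits:
              memory: 64Mi
```

Leaving the CPU limit unset is one possible choice here, in line with the throttling guidance in the decision framework; teams that require CPU limits can add them per container.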

Right-sizing tools:

| Tool | Approach | Best For |
| --- | --- | --- |
| VPA (recommendation mode) | Historical usage analysis | Per-deployment recommendations |
| Goldilocks | VPA-based dashboard | Cluster-wide visibility |
| KRR (Robusta) | Prometheus-based analysis | Quick right-sizing across namespaces |
| Kubecost | Cost-aware recommendations | Tying resource changes to dollar savings |

# Deploy VPA in recommendation-only mode
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Off" # Recommendations only

# View recommendations
kubectl get vpa app-vpa -o jsonpath='{.status.recommendation.containerRecommendations[*]}' | jq

Right-sizing decision framework:

| Signal | Action |
| --- | --- |
| CPU request >> actual usage (consistently) | Reduce CPU request to P95 usage |
| Memory request >> actual usage | Reduce memory request to P99 usage + 20% buffer |
| CPU throttling observed | Increase CPU request (or remove CPU limit) |
| OOMKilled events | Increase memory limit |
| Pod pending due to resources | Scale nodes or reduce requests |

Reducing Unused Capacity

Use HPA to scale pods based on demand, then let node autoscalers remove empty or underutilized nodes. Restrictive PodDisruptionBudgets can block node scale-down -- set minAvailable well below your replica count (e.g. minAvailable: 4 for a 6-pod deployment).
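
A minimal sketch of that pairing, assuming a Deployment named my-app with the label app: my-app (both hypothetical) and a CPU utilization target chosen only for illustration:

```yaml
# HPA scales pods on CPU; the 70% target and replica bounds are assumptions.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 6
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
---
# PDB leaves room for two voluntary evictions at the 6-replica floor,
# so node scale-down is not blocked.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app
spec:
  minAvailable: 4
  selector:
    matchLabels:
      app: my-app
```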

For event-driven scaling (SQS queues, Kafka, CloudWatch metrics), use KEDA instead of HPA's built-in metrics.
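
For the SQS case, a KEDA ScaledObject might look roughly like the following. The queue URL, account ID, region, and workload name are placeholders, and the TriggerAuthentication assumes a KEDA release that supports the aws pod identity provider and an IAM role already bound to the KEDA operator.

```yaml
# Assumption: KEDA is installed and its operator has IAM access to the queue.
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: keda-aws-auth # hypothetical name
spec:
  podIdentity:
    provider: aws
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: orders-worker # hypothetical workload
spec:
  scaleTargetRef:
    name: orders-worker
  minReplicaCount: 0 # no consumers while the queue is empty
  maxReplicaCount: 20
  triggers:
    - type: aws-sqs-queue
      authenticationRef:
        name: keda-aws-auth
      metadata:
        queueURL: https://sqs.us-east-1.amazonaws.com/111122223333/orders # placeholder
        queueLength: "10" # target messages per replica
        awsRegion: us-east-1
```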

Karpenter Consolidation

Karpenter continuously monitors and bin-packs workloads onto fewer, optimally-sized instances:

# Enable consolidation in NodePool
spec:
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 30s # Fast for non-prod; use longer (e.g. 5m) in prod

Karpenter selects the most cost-effective instance from your allowed types. It replaces underutilized nodes with smaller ones and removes empty nodes automatically.

For workloads that shouldn't be interrupted (long batch jobs without checkpointing), use the karpenter.sh/do-not-disrupt: "true" annotation.
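
The annotation goes on the pod, so for a Job it belongs in the pod template. A small sketch with a hypothetical Job name and image:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: nightly-report # hypothetical name
spec:
  template:
    metadata:
      annotations:
        karpenter.sh/do-not-disrupt: "true" # Karpenter will not voluntarily disrupt the node while this pod runs
    spec:
      restartPolicy: Never
      containers:
        - name: report
          image: example.com/report:1.0.0 # placeholder image
          resources:
            requests:
              cpu: "1"
              memory: 2Gi
```

Nodes hosting such pods cannot be consolidated until the pod finishes, so the annotation is best kept to workloads that genuinely cannot restart.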

See also: Karpenter Reference for detailed NodePool configuration, consolidation tuning, and Spot handling.

Cluster Autoscaler Priority Expander

If using Cluster Autoscaler instead of Karpenter, the priority expander lets you prefer cheaper capacity:

# Priority expander ConfigMap -- higher number = higher priority, so reserved groups scale before on-demand
# (Cluster Autoscaler must run with --expander=priority)
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-autoscaler-priority-expander
  namespace: kube-system
data:
  priorities: |-
    10:
      - .*ondemand.*
    50:
      - .*reserved.*

Also consider the Kubernetes Descheduler alongside CAS -- it rebalances pod placement after scheduling to improve cluster-wide utilization, which CAS alone does not do.

Graviton (arm64) Migration

Graviton instances deliver 20-40% better price/performance than equivalent x86:

# Karpenter NodePool supporting both architectures
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64", "arm64"] # Karpenter prefers cheaper Graviton
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]

DO:

  • Build multi-arch container images (docker buildx)
  • Test on arm64 in staging before production
  • Use Karpenter -- it automatically selects the most cost-effective architecture

DON'T:

  • Assume all container images support arm64 (check base images)
  • Mix architectures within a single deployment without affinity rules
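
When an image is only published for one architecture, one simple guard is to pin the workload to matching nodes. The fragment below is a sketch for a hypothetical Deployment whose image exists only for amd64:

```yaml
# Fragment of a Deployment spec; only the scheduling-related fields are shown.
spec:
  template:
    spec:
      nodeSelector:
        kubernetes.io/arch: amd64 # remove once a multi-arch image is published
```

With a multi-arch image the selector can be dropped, which lets Karpenter choose whichever architecture is cheaper.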

Spot Instance Strategies

# Karpenter: Diversified Spot strategy
spec:
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"] # Multiple families
        - key: karpenter.k8s.aws/instance-generation
          operator: Gt
          values: ["5"] # Current gen only
        - key: karpenter.k8s.aws/instance-size
          operator: In
          values: ["large", "xlarge", "2xlarge"] # Multiple sizes

Spot suitability:

| Workload Type | Spot Suitable? | Notes |
| --- | --- | --- |
| Stateless web/API | Yes | Use with PDBs + multi-AZ |
| Batch processing | Yes | Ideal -- tolerant of interruption |
| CI/CD runners | Yes | Short-lived, easily retried |
| Development/test | Yes | Cost savings, acceptable disruption |
| Databases/stateful | No | Use On-Demand for data safety |
| Single-replica critical | No | No fallback on interruption |
| Long-running ML training | Maybe | Use checkpointing + Spot |

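Karpenter handles Spot interruptions automatically once its SQS interruption queue is configured: it receives the 2-minute notice, cordons the node, launches a replacement, and drains while respecting PDBs. EKS managed node groups (MNG) drain Spot nodes on interruption themselves; for self-managed nodes, deploy AWS Node Termination Handler.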
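
For the workloads marked "No" in the table, a pod-level selector on Karpenter's capacity-type label keeps them off Spot even when the NodePool allows both capacity types. This is a fragment for a hypothetical StatefulSet, not a complete manifest:

```yaml
# Fragment of a StatefulSet spec; only the scheduling-related fields are shown.
spec:
  template:
    spec:
      nodeSelector:
        karpenter.sh/capacity-type: on-demand # never schedule this workload on Spot
```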

Savings Plans & Reserved Instances

For stable, predictable baseline capacity, Compute Savings Plans provide up to 66% savings over On-Demand. Layer them with Spot for variable workloads:

  • Baseline (always running): Savings Plans or Reserved Instances
  • Variable (scales up/down): Spot with On-Demand fallback
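
One way to express this layering with Karpenter is a weighted pair of NodePools: a capped On-Demand pool sized to roughly match the commitment, and a lower-priority pool for everything above it. The pool names, the 64-vCPU cap, and the EC2NodeClass name default are assumptions for illustration. Savings Plans themselves apply at billing time to whatever On-Demand usage exists.

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: baseline-on-demand # hypothetical name
spec:
  weight: 50 # higher weight is considered first
  limits:
    cpu: "64" # assumption: roughly the vCPU covered by the commitment
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
---
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: variable # hypothetical name
spec:
  weight: 10 # used once the baseline pool reaches its limit
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"] # Spot first, On-Demand as fallback
```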

Downscaling Patterns

# KEDA cron-based scaling -- scale to zero at night
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: dev-app
spec:
  scaleTargetRef:
    name: dev-app
  minReplicaCount: 0
  maxReplicaCount: 5
  triggers:
    - type: cron
      metadata:
        timezone: America/New_York
        start: "0 8 * * 1-5" # Scale up at 8 AM weekdays
        end: "0 20 * * 1-5" # Scale down at 8 PM weekdays
        desiredReplicas: "3"

Once a node no longer runs any workload pods, Karpenter removes it. Combined with KEDA scaling Deployments to zero (a standalone HPA does not scale below one replica by default), the nodes behind those workloads scale to zero as well.


Networking Cost Optimization

Cross-AZ data transfer is a significant cost in multi-AZ EKS clusters. AWS charges for data crossing AZ boundaries, so keeping traffic local reduces costs.

Pod-to-Pod Traffic: Topology-Aware Routing

By default, kube-proxy distributes traffic across all pods regardless of AZ placement, causing cross-AZ charges.

Topology-aware routing (beta) allocates endpoints proportionally across zones:

apiVersion: v1
kind: Service
metadata:
  name: orders-service
  annotations:
    service.kubernetes.io/topology-mode: Auto
spec:
  selector:
    app: orders
  type: ClusterIP

Works best with evenly distributed workloads. Use with pod topology spread constraints to keep replicas balanced across zones. Hints may not be assigned when capacity fluctuates across zones (e.g., with Spot instances).
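
A sketch of the matching spread constraint, assuming the backing pods carry the label app: orders as in the Service above:

```yaml
# Fragment of the Deployment behind orders-service.
spec:
  template:
    spec:
      topologySpreadConstraints:
        - maxSkew: 1 # at most one replica of difference between zones
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway # prefer balance, but do not block scheduling
          labelSelector:
            matchLabels:
              app: orders
```

ScheduleAnyway is the softer choice; DoNotSchedule enforces the balance strictly at the risk of pending pods when a zone lacks capacity.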

Traffic Distribution (GA in K8s 1.33) is a simpler, more predictable alternative:

apiVersion: v1
kind: Service
metadata:
  name: orders-service
spec:
  trafficDistribution: PreferClose
  selector:
    app: orders
  type: ClusterIP

PreferClose routes to same-zone endpoints first, falling back to any endpoint when none are local. Can overload endpoints in high-traffic zones -- mitigate with per-zone deployments with independent HPAs, or topology spread constraints.

Service Internal Traffic Policy restricts traffic to the originating node:

spec:
  internalTrafficPolicy: Local

Use for tightly coupled services with frequent inter-communication. Requires co-located replicas via pod affinity rules -- traffic is dropped when no local endpoint exists. Cannot be combined with topology-aware routing.
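
The co-location requirement can be sketched with a pod affinity rule on the calling workload. The fragment assumes the callee pods are labeled app: orders; the caller is any hypothetical Deployment that talks to them:

```yaml
# Fragment of the calling Deployment's spec.
spec:
  template:
    spec:
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: orders # only schedule onto nodes already running an orders pod
              topologyKey: kubernetes.io/hostname
```

Because the rule is only evaluated at scheduling time, a caller can still end up without a local endpoint if the orders pod on its node is later evicted.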

Load Balancer to Pod Communication

The AWS Load Balancer Controller supports two traffic modes:

| Mode | Path | Cross-AZ Cost |
| --- | --- | --- |
| Instance mode | LB -> NodePort -> kube-proxy -> Pod | Likely cross-AZ hops |
| IP mode | LB -> Pod directly | No extra hops |

Use IP mode to eliminate data transfer charges from LB-to-Pod traffic. Ensure the LB is deployed across all subnets in your VPC.
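
With the AWS Load Balancer Controller, IP mode is selected through annotations. The sketch below shows one ALB Ingress and one NLB Service; resource names, ports, and schemes are placeholders.

```yaml
# ALB: register pod IPs as targets instead of NodePorts.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: orders # hypothetical name
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
spec:
  ingressClassName: alb
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: orders-service
                port:
                  number: 80
---
# NLB: same idea for a Service of type LoadBalancer.
apiVersion: v1
kind: Service
metadata:
  name: orders-nlb # hypothetical name
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
    service.beta.kubernetes.io/aws-load-balancer-scheme: internal
spec:
  type: LoadBalancer
  loadBalancerClass: service.k8s.aws/nlb
  selector:
    app: orders
  ports:
    - port: 80
      targetPort: 8080 # assumed container port
```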

Network Cost Quick Reference

| Strategy | Savings | Effort |
| --- | --- | --- |
| IP mode on ALB/NLB | Eliminates LB-to-Pod cross-AZ charges | Low |
| Topology-aware routing / Traffic Distribution | Reduces cross-AZ pod-to-pod traffic | Medium |
| Gateway VPC endpoints (S3, DynamoDB) | Free -- no hourly or data transfer cost | Low |
| Interface VPC endpoints (ECR, STS) | Avoids NAT Gateway data processing ($0.045/GB) | Low |
| NAT Gateway per AZ | Eliminates inter-AZ NAT traversal | Low |
| In-region ECR pulls | Free (vs cross-region data transfer) | Low |

For detailed networking configuration, see: Networking Reference | Networking -- Ingress & DNS


Storage Cost Optimization

Ephemeral Storage

| Option | Cost | Best For |
| --- | --- | --- |
| gp3 root volume | ~20% less than gp2 | Default choice for node root volumes |
| EC2 instance stores | No additional cost | Caches, scratch space, temporary data |

Instance stores are physically attached to the host -- free, but data is lost on termination. Use HostPath or the Local Persistent Volume Static Provisioner to expose them in Kubernetes.

Persistent Volumes: EBS

Start with gp3 -- 20% cheaper per GB than gp2 and allows independent IOPS/throughput scaling:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  fsType: ext4
volumeBindingMode: WaitForFirstConsumer

Migration paths from gp2 to gp3:

| Method | Downtime | Requires |
| --- | --- | --- |
| CSI Volume Snapshots (backup + restore) | Yes | EBS CSI driver |
| PVC annotation modification | No | EBS CSI driver >= v1.19 |
| VolumeAttributesClass API | No | EKS >= 1.31 + EBS CSI >= 1.35 |
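
As a sketch of the VolumeAttributesClass path, the manifests below define a gp3 class and point an existing claim at it. The class name, claim name, size, and the iops/throughput values are assumptions. The API version shown is the beta one served on Kubernetes 1.31; newer clusters may serve a later version.

```yaml
apiVersion: storage.k8s.io/v1beta1 # assumption: beta API as served on Kubernetes 1.31
kind: VolumeAttributesClass
metadata:
  name: gp3-class # hypothetical name
driverName: ebs.csi.aws.com
parameters:
  type: gp3
  iops: "3000"
  throughput: "125"
---
# Existing claim (hypothetical) originally provisioned from a gp2 StorageClass.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: gp2
  volumeAttributesClassName: gp3-class # setting this triggers the in-place modification
  resources:
    requests:
      storage: 100Gi
```

The annotation-based path works on the same claim without a new API object, for example by setting the ebs.csi.aws.com/volumeType: gp3 annotation on the PVC when the driver's volume modification feature is enabled.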

For mission-critical workloads needing >16K IOPS or >1 GiB/s throughput, use io2 Block Express (up to 256K IOPS, 4 GiB/s, 64 TiB).

Dynamically resize volumes as data grows rather than overprovisioning upfront. Use AWS Trusted Advisor or Popeye to find dangling/unused volumes.

Persistent Volumes: EFS

EFS charges only for stored data with no upfront provisioning. Use Intelligent-Tiering to automatically move infrequently accessed files to cheaper storage (up to 92% savings).

| Storage Class | Cost | Use When |
| --- | --- | --- |
| EFS Standard | Highest | Frequently accessed, multi-AZ |
| EFS Standard-IA | ~92% less | Infrequently accessed, multi-AZ |
| EFS One Zone | ~47% less than Standard | Single-AZ tolerance, frequent access |
| EFS One Zone-IA | Lowest | Single-AZ, infrequent access |

EFS lifecycle policies and Intelligent-Tiering must be configured outside the CSI driver (console or EFS API).

Persistent Volumes: FSx

| Option | Best For | Key Advantage |
| --- | --- | --- |
| FSx for Lustre | ML training, HPC, video processing | Sub-ms latency, hundreds of GB/s throughput |
| FSx for NetApp ONTAP | Multi-protocol (NFS/SMB/iSCSI) | Data tiering between SSD and capacity pool |

For FSx for Lustre, link to S3 for long-term storage -- lazy-load data into Lustre for processing, write results back to S3, then delete the filesystem.

Storage Quick Reference

| Strategy | Savings |
| --- | --- |
| gp3 over gp2 | 20% lower $/GB, independent IOPS/throughput |
| EFS Intelligent-Tiering | Up to 92% on infrequently accessed files |
| Instance store for caches | Zero additional cost (ephemeral) |
| Container image optimization | Distroless/scratch base images, multi-stage builds |
| Clean up dangling volumes | Direct savings -- Popeye or AWS Trusted Advisor |
| EBS snapshot retention policy | Avoid unbounded snapshot growth via DLM or Velero TTL |

Observability Cost Optimization

Observability costs scale with data volume. Optimize by collecting only what matters and retaining intelligently.

Logging

Control plane logs: Evaluate which log types are needed per environment. Non-production clusters may only need API server logs enabled selectively. Production clusters benefit from all types for incident investigation. EKS control plane logs are classified as Vended Logs with volume discount pricing.

| Strategy | Impact |
| --- | --- |
| Selective log types per environment | Reduce ingestion volume |
| Stream to S3 via CloudWatch subscriptions | Cheaper long-term storage |
| Forward non-critical logs directly to S3 (FluentBit) | Skip CloudWatch entirely |
| Reduce log levels (ERROR in prod, DEBUG in dev) | Significant volume reduction |
| Filter Kubernetes metadata in FluentBit | Remove unnecessary enrichment |
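
The last two rows could be sketched as a Fluent Bit filter section carried in a ConfigMap. The ConfigMap name, namespace, the kube.* tag, and the assumption that the raw line sits in the log key are all illustrative and depend on how Fluent Bit is deployed.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-filters # hypothetical name
  namespace: logging # hypothetical namespace
data:
  filters.conf: |
    # Enrich with pod/namespace names but skip labels and annotations.
    [FILTER]
        Name                kubernetes
        Match               kube.*
        Merge_Log           On
        Labels              Off
        Annotations         Off
        K8S-Logging.Exclude On

    # Drop DEBUG/TRACE lines before they are shipped (assumes the text is in the "log" key).
    [FILTER]
        Name     grep
        Match    kube.*
        Exclude  log (DEBUG|TRACE)
```

Filtering in the agent only reduces ingestion cost; lowering the application's own log level also saves the CPU spent producing those lines.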

Metrics

| Strategy | Impact |
| --- | --- |
| Monitor only what matters (work backwards from KPIs) | Fewer metrics = lower storage cost |
| Reduce cardinality (drop unnecessary labels) | Fewer unique time series |
| Tune scrape intervals (15s -> 30s/60s for non-critical) | 50-75% fewer data points |
| Use recording rules for pre-aggregation | Replace high-cardinality queries |
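
A rough Prometheus sketch of the last three rows follows. The job name, the request_id label, the dropped metric pattern, and the rule name are hypothetical; the two documents represent a scrape-config fragment and a separate rules file.

```yaml
# prometheus.yml (fragment)
global:
  scrape_interval: 30s # default for all jobs
scrape_configs:
  - job_name: kubernetes-pods # hypothetical job
    scrape_interval: 60s # non-critical targets scraped less often
    kubernetes_sd_configs:
      - role: pod
    metric_relabel_configs:
      - action: labeldrop
        regex: request_id # hypothetical high-cardinality label
      - action: drop
        source_labels: [__name__]
        regex: go_gc_.* # example of metrics nobody queries
---
# rules.yml: pre-aggregate once instead of querying raw per-container series
groups:
  - name: cost-aggregations
    rules:
      - record: namespace:container_cpu_usage_seconds:rate5m
        expr: sum by (namespace) (rate(container_cpu_usage_seconds_total[5m]))
```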

Identify high-cardinality offenders:

# Top 5 scrape targets by metric count
# (PromQL; topk_max is the MetricsQL equivalent and is not available in Prometheus)
topk(5, max_over_time(scrape_samples_scraped[1h]))

# Top 5 by churn rate (new series created per scrape)
topk(5, max_over_time(scrape_series_added[1h]))

Use Grafana Mimirtool to find metrics collected but never used in dashboards or alerts.

Traces

For high-volume services, implement sampling strategies:

  • Head-based sampling: Decide at trace start (simple, but may miss important traces)
  • Tail-based sampling: Decide after trace completes (captures errors/slow requests, more complex)

Use the ADOT Collector's tail sampling processor to retain only traces that exceed latency thresholds or contain errors.
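
A minimal collector configuration along those lines might look like this. The 500 ms threshold, the 10 s decision wait, the region, and the choice of OTLP receiver and X-Ray exporter are assumptions for illustration.

```yaml
receivers:
  otlp:
    protocols:
      grpc:
processors:
  tail_sampling:
    decision_wait: 10s # how long to buffer spans before deciding
    policies: # a trace is kept if any policy matches
      - name: keep-errors
        type: status_code
        status_code:
          status_codes: [ERROR]
      - name: keep-slow
        type: latency
        latency:
          threshold_ms: 500
exporters:
  awsxray:
    region: us-east-1 # placeholder region
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [tail_sampling]
      exporters: [awsxray]
```

Tail sampling needs every span of a trace to reach the same collector instance, which is part of the added complexity noted above.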


Tagging & Cost Visibility

Tagging Strategy

Tags are the foundation of cost allocation. Without them, you can see total spend but not who's spending it.

# Karpenter EC2NodeClass tags
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: default
spec:
  tags:
    Team: platform
    CostCenter: engineering
    Environment: production
    ManagedBy: karpenter
    # kubernetes.io/cluster/<cluster-name>: owned is added by Karpenter itself;
    # it is a restricted tag pattern and should not be set here

| Resource | Tag Source | Notes |
| --- | --- | --- |
| EC2 instances | NodePool/EC2NodeClass tags | Karpenter applies automatically |
| EBS volumes | StorageClass tags | Set tagSpecification in CSI driver |
| ALB/NLB | Service/Ingress annotations | Via AWS LBC |
| VPC endpoints | Terraform/CloudFormation | Tag at creation |
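
For the EBS and load balancer rows, the tags could be applied roughly as follows. The StorageClass and Ingress names are hypothetical and the tag values simply mirror the EC2NodeClass example; the Ingress is shown as a metadata-only fragment.

```yaml
# EBS volumes: tagSpecification_N parameters on the StorageClass
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-tagged # hypothetical name
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  tagSpecification_1: "Team=platform"
  tagSpecification_2: "CostCenter=engineering"
volumeBindingMode: WaitForFirstConsumer
---
# ALB: tags annotation read by the AWS Load Balancer Controller (fragment)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: orders # hypothetical name
  annotations:
    alb.ingress.kubernetes.io/tags: Team=platform,CostCenter=engineering
```

For NLBs created from a Service, the corresponding annotation is service.beta.kubernetes.io/aws-load-balancer-additional-resource-tags.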

AWS resource tags don't directly correlate with Kubernetes labels. Use Kubernetes labels on pods/namespaces for in-cluster cost attribution (via Kubecost), and AWS tags for billing/Cost Explorer views.

Cost Visibility Tools

| Tool | Scope | Cost | Best For |
| --- | --- | --- | --- |
| AWS Cost Explorer | Account-level | Free | High-level trends, SP/RI recommendations |
| Kubecost | Cluster-level | Free (open source) | Per-namespace/pod cost allocation |
| CloudWatch Container Insights | Cluster + pod | ~$0.30/container/month | Resource utilization monitoring |
| AWS Billing + tags | Account-level | Free | Chargeback by team/project |
| Karpenter metrics | Node-level | Free | Consolidation efficiency |

Kubecost

Kubecost provides real-time cost monitoring, namespace/label allocation, and right-sizing recommendations:

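# Add the Kubecost chart repository, then install
helm repo add kubecost https://kubecost.github.io/cost-analyzer/
helm repo update
helm install kubecost kubecost/cost-analyzer \
  --namespace kubecost --create-namespace \
  --set kubecostProductConfigs.clusterName=my-cluster \
  --set kubecostProductConfigs.cloudIntegrationSecret=cloud-integration
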
| Feature | Free | Enterprise |
| --- | --- | --- |
| Namespace/label cost allocation | Yes | Yes |
| Right-sizing recommendations | Yes | Yes |
| Idle cost detection | Yes | Yes |
| Multi-cluster | No | Yes |
| SSO/RBAC | No | Yes |
| Long-term storage | 15 days | Unlimited |

DO:

  • Use namespace-level cost allocation for multi-tenant clusters
  • Enable CUR integration for accurate AWS pricing (not list price estimates)
  • Set up idle cost alerts -- unused resources are the biggest waste
  • Use right-sizing recommendations to adjust resource requests

DON'T:

  • Rely solely on CloudWatch for K8s cost attribution -- it lacks namespace-level granularity
  • Skip resource requests on pods -- Kubecost needs requests to calculate allocation
  • Ignore shared costs (control plane, monitoring, ingress) -- allocate proportionally
