Configuration

Solution Configuration

This solution is configured through config.yaml. To use a configuration file with a different name or path, set the CDK_CONFIG_PATH environment variable to that file's path.

The following table lists the configuration parameters and their default values for the CDK template:

| Parameter | Description | Required | Default |
| --- | --- | --- | --- |
| stackName | Name of the stack. It is added as a prefix to all resource names. | Yes | sdoneks |
| modelBucketArn | ARN of the S3 bucket used for model storage. Model files must be uploaded to the bucket. This parameter applies to all runtimes. | Yes | "" |
| modelsRuntime | Defines the Stable Diffusion runtimes. At least one runtime must be defined. | Yes | See definition below |
| modelsRuntime.name | Name of the individual Stable Diffusion runtime. | Yes | sdruntime |
| modelsRuntime.namespace | Namespace of the individual Stable Diffusion runtime. The namespace is created if it does not exist. | Yes | default |
| modelsRuntime.type | Type of the individual Stable Diffusion runtime. Currently only sdwebui and comfyui are supported. | Yes | sdwebui |
| modelsRuntime.chartRepository | Overrides the default Helm chart repository. Prefix the repository with its protocol (oci:// or https://). (Default: https://aws-samples.github.io/stable-diffusion-on-eks/charts/) | No | N/A |
| modelsRuntime.chartVersion | Overrides the Helm chart version. (Default: 1.0.0) | No | N/A |
| modelsRuntime.modelFilename | (SD Web UI only) Filename of the model used by the runtime. The file must be in .ckpt or .safetensors format. Quote the filename if it consists only of digits. | No | v1-5-pruned-emaonly.safetensors |
| modelsRuntime.dynamicModel | (SD Web UI only) Allows the model to be loaded on request. | No | false |
| modelsRuntime.extraValues | Extra parameters passed to the runtime. See values definition for detail. | No | N/A |
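
As a rough illustration of how the parameters above fit together, a minimal config.yaml might look like the sketch below. It assumes that modelsRuntime is a YAML list with one entry per runtime (the table implies this but does not show the syntax). The bucket ARN is a placeholder, not a real bucket.

```yaml
# Sketch of a minimal config.yaml. Values other than the documented
# defaults are placeholders.
stackName: sdoneks                  # prefix for all resource names
modelBucketArn: arn:aws:s3:::example-model-bucket   # placeholder ARN; replace with your bucket
modelsRuntime:                      # assumed to be a list, one item per runtime
  - name: sdruntime
    namespace: default              # created if it does not exist
    type: sdwebui
    modelFilename: "v1-5-pruned-emaonly.safetensors"
    dynamicModel: false
```

Optional keys such as chartRepository, chartVersion, and extraValues are left out here, so the documented defaults would apply.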

Runtime Configuration

Stable Diffusion runtimes are deployed via Helm charts. You can configure individual runtime parameters through modelsRuntime.extraValues.

Parameters marked as Populated by CDK cannot be changed: CDK generates their values automatically and overrides any manually set values.
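
One way to read the dotted parameter names in the table below is as nested YAML keys under modelsRuntime.extraValues. The fragment here is a sketch under that assumption. The second instance type is an illustrative choice, not a recommendation from this guide.

```yaml
# Sketch: dotted parameter names written as nested keys (assumed mapping).
modelsRuntime:
  - name: sdruntime
    namespace: default
    type: sdwebui
    extraValues:
      karpenter:
        provisioner:
          instanceType:             # karpenter.provisioner.instanceType
            - "g5.xlarge"
            - "g5.2xlarge"          # illustrative addition; must be available in your region
          capacityType:
            onDemand: true          # karpenter.provisioner.capacityType.onDemand
            spot: true              # karpenter.provisioner.capacityType.spot
        nodeTemplate:
          amiFamily: Bottlerocket   # karpenter.nodeTemplate.amiFamily
      # Keys marked "Populated by CDK" (for example global.*) are omitted,
      # because CDK overrides any value set for them.
```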

| Parameter | Description | Default |
| --- | --- | --- |
| Global | | |
| global.awsRegion | AWS region where the stack resides. Cannot be changed. | Populated by CDK |
| global.stackName | Name of the CDK stack. Cannot be changed. | Populated by CDK |
| Karpenter Provisioner | | |
| karpenter.provisioner.labels | Labels applied to all nodes. Must be in key-value format. | {} |
| karpenter.provisioner.capacityType.onDemand | Allows Karpenter to launch On-Demand nodes. | true |
| karpenter.provisioner.capacityType.spot | Allows Karpenter to launch Spot nodes. When provisioner.capacityType.onDemand is also true, Karpenter prioritizes launching Spot instances. | true |
| karpenter.provisioner.instanceType | An array of instance types Karpenter can launch. Include only instance types available in the current region. | - "g5.xlarge" |
| karpenter.provisioner.extraRequirements | Additional requirements Karpenter uses when choosing an instance type. | [] |
| karpenter.provisioner.extraTaints | Provisioned nodes have the nvidia.com/gpu:NoSchedule and runtime:NoSchedule taints by default. Use this parameter to add further taints. | [] |
| karpenter.provisioner.resourceLimits | Resource limits prevent Karpenter from creating new instances once the limit is exceeded. cpu, memory and nvidia.com/gpu are supported. | nvidia.com/gpu: 100 |
| karpenter.provisioner.consolidation | Enables consolidation, which attempts to remove unneeded nodes and downsize those that cannot be removed. | true |
| Karpenter Node Template | | |
| karpenter.nodeTemplate.securityGroupSelector | Tagged security groups are attached to instances. Cannot be changed. | Populated by CDK |
| karpenter.nodeTemplate.subnetSelector | Instances are launched in tagged subnets. Cannot be changed. | Populated by CDK |
| karpenter.nodeTemplate.tags | Tags applied to all nodes. Must be in key-value format. | {} |
| karpenter.nodeTemplate.amiFamily | OS option for worker nodes. Karpenter automatically queries for the appropriate EKS-optimized AMI via AWS Systems Manager (SSM). AL2 and Bottlerocket are supported. | Bottlerocket |
| karpenter.nodeTemplate.osVolume | Controls the Elastic Block Store (EBS) volume that Karpenter attaches to provisioned nodes. See this for schema. This volume is attached to /dev/xvda. | |
| karpenter.nodeTemplate.dataVolume | Controls the Elastic Block Store (EBS) volume that Karpenter attaches to provisioned nodes. See this for schema. This volume is attached to /dev/xvdb. Required when using Bottlerocket. | |
| karpenter.nodeTemplate.userData | UserData applied to your worker nodes. See the examples here for format. | "" |
| Runtime | | |
| runtime.labels | Labels applied to all resources. Must be in key-value format. | "" |
| runtime.annotations | Annotations applied to the Stable Diffusion runtime. Must be in key-value format. | "" |
| runtime.serviceAccountName | Name of the service account used by the runtime. Cannot be changed. | Populated by CDK |
| runtime.replicas | Replica count of the runtime. | 1 |
| runtime.scaling.enabled | Enables auto scaling based on SQS queue length. | true |
| runtime.scaling.queueLength | Target value for queue length. KEDA scales the pods to ApproximateNumberOfMessages / queueLength replicas. | 10 |
| runtime.scaling.cooldownPeriod | The period (in seconds) to wait after the last trigger reported active before scaling the resource back to minReplicaCount. | 60 |
| runtime.scaling.maxReplicaCount | Passed to the HPA definition that KEDA creates for the resource; sets the maximum number of replicas of the target resource. | 20 |
| runtime.scaling.minReplicaCount | Minimum number of replicas KEDA scales the resource down to. | 0 |
| runtime.scaling.pollingInterval | Interval (in seconds) at which each trigger is checked. | 1 |
| runtime.scaling.scaleOnInFlight | When set to true, not visible (in-flight) messages are also counted in ApproximateNumberOfMessages. | false |
| runtime.scaling.extraHPAConfig | KEDA feeds values from this section directly into the HPA's behavior field. See the Kubernetes documentation for details. | {} |
| Stable Diffusion Runtime | | |
| runtime.inferenceApi.image.repository | Image repository of the runtime. | public.ecr.aws/bingjiao/sd-on-eks/sdwebui |
| runtime.inferenceApi.image.tag | Image tag of the runtime. | latest |
| runtime.inferenceApi.modelFilename | Model filename of the runtime. Cannot be changed. | Populated by CDK |
| runtime.inferenceApi.extraEnv | Extra environment variables for the runtime. Must be in Kubernetes format. | {} |
| runtime.inferenceApi.modelMountPath | Path of the model folder inside the container. | /opt/ml/code/models |
| runtime.inferenceApi.commandArguments | Additional arguments passed to the runtime. | "" |
| runtime.inferenceApi.resources | Resource requests and limits for the runtime. | |
| Queue Agent | | |
| runtime.queueAgent.image.repository | Image repository of the queue agent. | sdoneks/queue-agent |
| runtime.queueAgent.image.tag | Image tag of the queue agent. | latest |
| runtime.queueAgent.extraEnv | Extra environment variables for the queue agent. Must be in Kubernetes format. | {} |
| runtime.queueAgent.dynamicModel | Enables model switching by request. Cannot be changed. | Populated by CDK |
| runtime.queueAgent.s3Bucket | S3 bucket for generated images. Cannot be changed. | Populated by CDK |
| runtime.queueAgent.snsTopicArn | SNS topic for image generation completion notifications. Cannot be changed. | Populated by CDK |
| runtime.queueAgent.sqsQueueUrl | SQS queue URL of the job queue. Cannot be changed. | Populated by CDK |
| runtime.queueAgent.resources | Resource requests and limits for the queue agent. | |
| runtime.queueAgent.XRay.enabled | Enables the AWS X-Ray tracing agent for the queue agent. | true |
| Persistence | | |
| runtime.persistence.enabled | Enables persistence of model storage. | true |
| runtime.persistence.labels | Labels applied to the persistent volume. Must be in key-value format. | {} |
| runtime.persistence.annotations | Annotations applied to the persistent volume. Must be in key-value format. | {} |
| runtime.persistence.storageClass | Storage class for model storage. | s3-model-storage-sc |
| runtime.persistence.size | Size of the persistent volume. | 2Ti |
| runtime.persistence.accessModes | Access mode of the persistent volume. | ReadWriteMany |
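
To make the scaling parameters more concrete, the fragment below sketches an extraValues override for the runtime.scaling section. It belongs under one modelsRuntime entry in config.yaml. It assumes that the dotted names map to nested keys, and that the replica count is rounded up and clamped to the minimum and maximum bounds, which is how the Kubernetes HPA normally treats an average-value target. The message counts in the comments are illustrative.

```yaml
# Sketch of a scaling override; place under a modelsRuntime entry (assumed layout).
extraValues:
  runtime:
    scaling:
      enabled: true
      queueLength: 10        # target number of messages per replica
      minReplicaCount: 0     # scale to zero when the queue is empty
      maxReplicaCount: 20
      cooldownPeriod: 60     # seconds to wait before scaling back to minReplicaCount
      pollingInterval: 1     # seconds between trigger checks
      scaleOnInFlight: false # count only visible messages
# Worked example with these values (assumed rounding and clamping):
#   45 visible messages / queueLength 10 = 4.5, rounded up to 5 replicas
#   500 visible messages / queueLength 10 = 50, capped at maxReplicaCount = 20
```

A smaller queueLength makes the runtime scale out sooner for the same backlog, at the cost of launching more GPU nodes.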