Configuration
Solution Configuration
This solution is configured via config.yaml. To use a configuration file with a different name or path, set the CDK_CONFIG_PATH environment variable to point to that file.
The following table lists the configuration parameters and their default values for the CDK template:
| Parameter | Description | Required | Default |
|---|---|---|---|
| stackName | Name of the stack. The name is added as a prefix to all resource names. | Yes | sdoneks |
| modelBucketArn | S3 bucket for model storage. Model files should be uploaded to this bucket. This parameter applies to all runtimes. | Yes | "" |
| modelsRuntime | Defines the Stable Diffusion runtimes. At least one runtime should be defined. | Yes | See definition below |
| modelsRuntime.name | Name of an individual Stable Diffusion runtime. | Yes | sdruntime |
| modelsRuntime.namespace | Namespace of an individual Stable Diffusion runtime. The namespace is created if it does not exist. | Yes | default |
| modelsRuntime.type | Type of an individual Stable Diffusion runtime. Currently only sdwebui and comfyui are supported. | Yes | sdwebui |
| modelsRuntime.chartRepository | Overrides the default Helm chart repository. The protocol (oci:// or https://) should be included as a prefix of the repository. (Default: https://aws-samples.github.io/stable-diffusion-on-eks/charts/) | No | N/A |
| modelsRuntime.chartVersion | Overrides the version of the Helm chart. (Default: 1.0.0) | No | N/A |
| modelsRuntime.modelFilename | (For SD Web UI only) Filename of the model used by the runtime. The file should be in .ckpt or .safetensors format. Quote the filename if it consists only of digits. | No | v1-5-pruned-emaonly.safetensors |
| modelsRuntime.dynamicModel | (For SD Web UI only) Allows the model to be loaded on request. | No | false |
| modelsRuntime.extraValues | Extra parameters passed to the runtime. See values definition for detail. | No | N/A |
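
As an illustration of how the parameters above fit together, here is a minimal sketch of a config.yaml. It assumes modelsRuntime is a list with one entry per runtime (inferred from "at least one runtime should be defined"); the bucket ARN is a hypothetical placeholder, and the remaining values are the defaults from the table.

```yaml
# config.yaml: illustrative sketch, not a definitive configuration
stackName: sdoneks
# Hypothetical ARN; replace with the bucket that holds your model files
modelBucketArn: "arn:aws:s3:::example-model-bucket"
modelsRuntime:
  # Assumption: one list entry per runtime
  - name: sdruntime
    namespace: default
    type: sdwebui
    # Quoted so that a digits-only filename would still be read as a string
    modelFilename: "v1-5-pruned-emaonly.safetensors"
    dynamicModel: false
```

Optional keys such as chartRepository, chartVersion and extraValues are left out here, so the chart defaults listed in the table apply.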
Runtime Configuration
Stable Diffusion runtimes are deployed via Helm charts. You can configure individual runtime parameters via modelsRuntime.extraValues.
Parameters marked as Populated by CDK cannot be changed: their values are generated automatically by CDK, and any manually set values are overridden.
| Parameter | Description | Default |
|---|---|---|
| Global | | |
| global.awsRegion | AWS region where the stack resides. Not changeable. | Populated by CDK |
| global.stackName | Name of the CDK stack. Not changeable. | Populated by CDK |
| Karpenter Provisioner | | |
| karpenter.provisioner.labels | Labels applied to all nodes. Should be in key-value format. | {} |
| karpenter.provisioner.capacityType.onDemand | Allow Karpenter to launch On-Demand nodes. | true |
| karpenter.provisioner.capacityType.spot | Allow Karpenter to launch Spot nodes. When karpenter.provisioner.capacityType.onDemand is also true, Karpenter prioritizes launching Spot instances. | true |
| karpenter.provisioner.instanceType | An array of instance types Karpenter can launch. Include only instance types available in the current region. | - "g5.xlarge" |
| karpenter.provisioner.extraRequirements | Additional requirements for Karpenter to apply when choosing instance types. | [] |
| karpenter.provisioner.extraTaints | Provisioned nodes have nvidia.com/gpu:NoSchedule and runtime:NoSchedule taints by default. Use this parameter to add further taints. | [] |
| karpenter.provisioner.resourceLimits | Resource limits prevent Karpenter from creating new instances once the limit is exceeded. cpu, memory and nvidia.com/gpu are supported. | nvidia.com/gpu: 100 |
| karpenter.provisioner.consolidation | Enables consolidation, which attempts to remove unneeded nodes and downsize those that cannot be removed. | true |
| Karpenter Node Template | | |
| karpenter.nodeTemplate.securityGroupSelector | Tagged security groups are attached to instances. Not changeable. | Populated by CDK |
| karpenter.nodeTemplate.subnetSelector | Instances are launched in tagged subnets. Not changeable. | Populated by CDK |
| karpenter.nodeTemplate.tags | Tags applied to all nodes. Should be in key-value format. | {} |
| karpenter.nodeTemplate.amiFamily | OS option for worker nodes. Karpenter automatically queries for the appropriate EKS-optimized AMI via AWS Systems Manager (SSM). AL2 and Bottlerocket are supported. | Bottlerocket |
| karpenter.nodeTemplate.osVolume | Controls the Elastic Block Store (EBS) volumes that Karpenter attaches to provisioned nodes. See this for schema. This volume is attached to /dev/xvda. | |
| karpenter.nodeTemplate.dataVolume | Controls the Elastic Block Store (EBS) volumes that Karpenter attaches to provisioned nodes. See this for schema. This volume is attached to /dev/xvdb. Required when using Bottlerocket. | |
| karpenter.nodeTemplate.userData | UserData that is applied to your worker nodes. See the examples here for format. | "" |
| Runtime | | |
| runtime.labels | Labels applied to all resources. Should be in key-value format. | "" |
| runtime.annotations | Annotations applied to the Stable Diffusion runtime. Should be in key-value format. | "" |
| runtime.serviceAccountName | Name of the service account used by the runtime. Not changeable. | Populated by CDK |
| runtime.replicas | Replica count of the runtime. | 1 |
| runtime.scaling.enabled | Enable auto scaling based on SQS queue length. | true |
| runtime.scaling.queueLength | Target value for queue length. KEDA scales the pods to ApproximateNumberOfMessages / queueLength replicas. | 10 |
| runtime.scaling.cooldownPeriod | The period (in seconds) to wait after the last trigger reported active before scaling the resource back to minReplicaCount. | 60 |
| runtime.scaling.maxReplicaCount | Maximum number of replicas of the target resource. This setting is passed to the HPA definition that KEDA creates for the resource. | 20 |
| runtime.scaling.minReplicaCount | Minimum number of replicas KEDA will scale the resource down to. | 0 |
| runtime.scaling.pollingInterval | Interval (in seconds) at which each trigger is checked. | 1 |
| runtime.scaling.scaleOnInFlight | When set to true, in-flight (not visible) messages are counted in ApproximateNumberOfMessages. | false |
| runtime.scaling.extraHPAConfig | KEDA feeds values from this section directly to the HPA's behavior field. Follow the Kubernetes documentation for details. | {} |
| Stable Diffusion Runtime | | |
| runtime.inferenceApi.image.repository | Image repository of the runtime. | public.ecr.aws/bingjiao/sd-on-eks/sdwebui |
| runtime.inferenceApi.image.tag | Image tag of the runtime. | latest |
| runtime.inferenceApi.modelFilename | Model filename of the runtime. Not changeable. | Populated by CDK |
| runtime.inferenceApi.extraEnv | Extra environment variables for the runtime. Should be in Kubernetes format. | {} |
| runtime.inferenceApi.modelMountPath | Path of the model folder inside the container. | /opt/ml/code/models |
| runtime.inferenceApi.commandArguments | Additional arguments passed to the runtime. | "" |
| runtime.inferenceApi.resources | Resource requests and limits for the runtime. | |
| Queue Agent | | |
| runtime.queueAgent.image.repository | Image repository of the queue agent. | sdoneks/queue-agent |
| runtime.queueAgent.image.tag | Image tag of the queue agent. | latest |
| runtime.queueAgent.extraEnv | Extra environment variables for the queue agent. Should be in Kubernetes format. | {} |
| runtime.queueAgent.dynamicModel | Enable model switching on request. Not changeable. | Populated by CDK |
| runtime.queueAgent.s3Bucket | S3 bucket for generated images. Not changeable. | Populated by CDK |
| runtime.queueAgent.snsTopicArn | SNS topic for image generation completion notifications. Not changeable. | Populated by CDK |
| runtime.queueAgent.sqsQueueUrl | SQS queue URL of the job queue. Not changeable. | Populated by CDK |
| runtime.queueAgent.resources | Resource requests and limits for the queue agent. | |
| runtime.queueAgent.XRay.enabled | Enable the X-Ray tracing agent for the queue agent. | true |
| Persistence | | |
| runtime.persistence.enabled | Enable persistence of model storage. | true |
| runtime.persistence.labels | Labels applied to the persistent volume. Should be in key-value format. | {} |
| runtime.persistence.annotations | Annotations applied to the persistent volume. Should be in key-value format. | {} |
| runtime.persistence.storageClass | Storage class for model storage. | s3-model-storage-sc |
| runtime.persistence.size | Size of the persistent volume. | 2Ti |
| runtime.persistence.accessModes | Access mode of the persistent volume. | ReadWriteMany |
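
The runtime parameters above are set under modelsRuntime.extraValues in config.yaml. The following is a minimal sketch, assuming that each dotted parameter name maps to nested YAML keys (the usual Helm values convention) and that modelsRuntime is a list; the chosen values are illustrative and mostly repeat the defaults from the table.

```yaml
# Illustrative sketch: overriding chart values for one runtime
modelsRuntime:
  - name: sdruntime
    namespace: default
    type: sdwebui
    extraValues:
      karpenter:
        provisioner:
          # Only list instance types available in your region
          instanceType:
            - "g5.xlarge"
          capacityType:
            onDemand: true
            spot: true
      runtime:
        scaling:
          enabled: true
          queueLength: 10      # target number of messages per replica
          minReplicaCount: 0
          maxReplicaCount: 20  # upper bound on replicas
```

With queueLength set to 10, a queue holding about 45 messages would lead KEDA to target roughly 5 replicas (45 / 10, rounded up), bounded by maxReplicaCount. Parameters marked as Populated by CDK, such as runtime.queueAgent.sqsQueueUrl, are deliberately absent because manually set values are overridden.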