Configuration
Solution Configuration
This solution can be configured via config.yaml. If you want to use a configuration file with a different name or path, specify it in the CDK_CONFIG_PATH environment variable.
The following table lists the configuration parameters and their default values for the CDK template:
Parameter | Description | Required | Default |
---|---|---|---|
stackName | Name of the stack. The name is added as a prefix to all resource names. | Yes | sdoneks |
modelBucketArn | S3 bucket for model storage. Model files should be uploaded to the bucket. This parameter applies to all runtimes. | Yes | "" |
modelsRuntime | Defines the Stable Diffusion runtimes. At least one runtime should be defined. | Yes | See definition below |
modelsRuntime.name | Name of the individual Stable Diffusion runtime. | Yes | sdruntime |
modelsRuntime.namespace | Namespace of the individual Stable Diffusion runtime. The namespace is created if it does not exist. | Yes | default |
modelsRuntime.type | Type of the individual Stable Diffusion runtime. Currently only sdwebui and comfyui are supported. | Yes | sdwebui |
modelsRuntime.chartRepository | Overrides the default Helm chart repository. The protocol (oci:// or https://) should be added as a prefix of the repository. (Default: https://aws-samples.github.io/stable-diffusion-on-eks/charts/) | No | N/A |
modelsRuntime.chartVersion | Overrides the version of the Helm chart. (Default: 1.0.0) | No | N/A |
modelsRuntime.modelFilename | (For SD Web UI only) Filename of the model used in the runtime. The file should be in .ckpt or .safetensors format. The filename should be quoted if it contains only numbers. | No | v1-5-pruned-emaonly.safetensors |
modelsRuntime.dynamicModel | (For SD Web UI only) Switch that allows models to be loaded on request. | No | false |
modelsRuntime.extraValues | Extra parameters passed to the runtime. See values definition for details. | No | N/A |
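
As an illustration of how the parameters above fit together, a minimal config.yaml might look like the following sketch. It assumes that modelsRuntime is a list with one entry per runtime, and the bucket ARN is a hypothetical placeholder; verify both against the sample configuration shipped with the solution.

```yaml
# Sketch only: values are the documented defaults unless noted otherwise.
stackName: sdoneks
# Hypothetical bucket ARN; replace it with the bucket that holds your model files.
modelBucketArn: arn:aws:s3:::example-model-bucket
modelsRuntime:             # assumed to be a list; at least one entry is required
  - name: sdruntime
    namespace: default     # created if it does not exist
    type: sdwebui
    # Quote the filename if it consists only of numbers.
    modelFilename: v1-5-pruned-emaonly.safetensors
    dynamicModel: false
```

The optional chartRepository, chartVersion and extraValues keys are omitted here, so the defaults listed in the table would apply.
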
Runtime Configuration
Stable Diffusion runtimes are deployed via Helm charts. You can configure individual runtime parameters via modelsRuntime.extraValues.
Please note that parameters marked as Populated by CDK cannot be changed: their values are automatically generated by CDK, and any manually set values will be overridden.
Parameter | Description | Default |
---|---|---|
Global | | |
global.awsRegion | AWS region where the stack resides. Not changeable. | Populated by CDK |
global.stackName | Name of the CDK stack. Not changeable. | Populated by CDK |
Karpenter Provisioner | | |
karpenter.provisioner.labels | Labels applied to all nodes. Should be in key-value format. | {} |
karpenter.provisioner.capacityType.onDemand | Allow Karpenter to launch On-Demand nodes. | true |
karpenter.provisioner.capacityType.spot | Allow Karpenter to launch Spot nodes. When karpenter.provisioner.capacityType.onDemand is also true, Karpenter prioritizes launching Spot instances. | true |
karpenter.provisioner.instanceType | An array of instance types Karpenter can launch. Should only include instance types available in the current region. | - "g5.xlarge" |
karpenter.provisioner.extraRequirements | Additional requirements for Karpenter to apply when choosing instance types. | [] |
karpenter.provisioner.extraTaints | Provisioned nodes have nvidia.com/gpu:NoSchedule and runtime:NoSchedule taints by default. Use this parameter to add further taints. | [] |
karpenter.provisioner.resourceLimits | Resource limits prevent Karpenter from creating new instances once the limit is exceeded. cpu, memory and nvidia.com/gpu are supported. | nvidia.com/gpu: 100 |
karpenter.provisioner.consolidation | Enables consolidation, which attempts to remove unneeded nodes and downsize those that can't be removed. | true |
Karpenter Node Template | | |
karpenter.nodeTemplate.securityGroupSelector | Tagged security groups will be attached to instances. Not changeable. | Populated by CDK |
karpenter.nodeTemplate.subnetSelector | Instances will be launched in tagged subnets. Not changeable. | Populated by CDK |
karpenter.nodeTemplate.tags | Tags applied to all nodes. Should be in key-value format. | {} |
karpenter.nodeTemplate.amiFamily | OS option for worker nodes. Karpenter automatically queries for the appropriate EKS-optimized AMI via AWS Systems Manager (SSM). AL2 and Bottlerocket are supported. | Bottlerocket |
karpenter.nodeTemplate.osVolume | Controls the Elastic Block Store (EBS) volume that Karpenter attaches to provisioned nodes. See this for schema. This volume will be attached to /dev/xvda. | |
karpenter.nodeTemplate.dataVolume | Controls the Elastic Block Store (EBS) volume that Karpenter attaches to provisioned nodes. See this for schema. This volume will be attached to /dev/xvdb. Required when using Bottlerocket. | |
karpenter.nodeTemplate.userData | UserData that is applied to your worker nodes. See the examples here for format. | "" |
runtime | | |
runtime.labels | Labels applied to all resources. Should be in key-value format. | "" |
runtime.annotations | Annotations applied to the Stable Diffusion runtime. Should be in key-value format. | "" |
runtime.serviceAccountName | Name of the service account used by the runtime. Not changeable. | Populated by CDK |
runtime.replicas | Replica count of the runtime. | 1 |
runtime.scaling.enabled | Enable auto scaling based on SQS queue length. | true |
runtime.scaling.queueLength | Target value for queue length. KEDA will scale pods to ApproximateNumberOfMessages / queueLength replicas. | 10 |
runtime.scaling.cooldownPeriod | The period (in seconds) to wait after the last trigger reported active before scaling the resource back to minReplicaCount. | 60 |
runtime.scaling.maxReplicaCount | This setting is passed to the HPA definition that KEDA creates for a given resource and holds the maximum number of replicas of the target resource. | 20 |
runtime.scaling.minReplicaCount | Minimum number of replicas KEDA will scale the resource down to. | 0 |
runtime.scaling.pollingInterval | Interval (in seconds) at which each trigger is checked. | 1 |
runtime.scaling.scaleOnInFlight | When set to true, not visible (in-flight) messages are counted in ApproximateNumberOfMessages. | false |
runtime.scaling.extraHPAConfig | KEDA feeds values from this section directly to the HPA's behavior field. Follow the Kubernetes documentation for details. | {} |
Stable Diffusion Runtime | | |
runtime.inferenceApi.image.repository | Image repository of the runtime. | public.ecr.aws/bingjiao/sd-on-eks/sdwebui |
runtime.inferenceApi.image.tag | Image tag of the runtime. | latest |
runtime.inferenceApi.modelFilename | Model filename of the runtime. Not changeable. | Populated by CDK |
runtime.inferenceApi.extraEnv | Extra environment variables for the runtime. Should be in Kubernetes format. | {} |
runtime.inferenceApi.modelMountPath | Path of the model folder inside the container. | /opt/ml/code/models |
runtime.inferenceApi.commandArguments | Additional arguments passed to the runtime. | "" |
runtime.inferenceApi.resources | Resource requests and limits for the runtime. | |
Queue Agent | | |
runtime.queueAgent.image.repository | Image repository of the queue agent. | sdoneks/queue-agent |
runtime.queueAgent.image.tag | Image tag of the queue agent. | latest |
runtime.queueAgent.extraEnv | Extra environment variables for the queue agent. Should be in Kubernetes format. | {} |
runtime.queueAgent.dynamicModel | Enable model switching on request. Not changeable. | Populated by CDK |
runtime.queueAgent.s3Bucket | S3 bucket for generated images. Not changeable. | Populated by CDK |
runtime.queueAgent.snsTopicArn | SNS topic for image generation completion notifications. Not changeable. | Populated by CDK |
runtime.queueAgent.sqsQueueUrl | SQS queue URL of the job queue. Not changeable. | Populated by CDK |
runtime.queueAgent.resources | Resource requests and limits for the queue agent. | |
runtime.queueAgent.XRay.enabled | Enable the X-Ray tracing agent for the queue agent. | true |
Persistence | | |
runtime.persistence.enabled | Enable persistence of model storage. | true |
runtime.persistence.labels | Labels applied to the persistent volume. Should be in key-value format. | {} |
runtime.persistence.annotations | Annotations applied to the persistent volume. Should be in key-value format. | {} |
runtime.persistence.storageClass | Storage class for model storage. | s3-model-storage-sc |
runtime.persistence.size | Size of the persistent volume. | 2Ti |
runtime.persistence.accessModes | Access mode of the persistent volume. | ReadWriteMany |
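
To show how these runtime parameters might be overridden, the sketch below places a few of them under modelsRuntime.extraValues. It assumes that the dotted parameter names in the table map to nested YAML keys and that modelsRuntime is a list; the extra instance type and the reduced limits are illustrative values, not recommendations.

```yaml
# Sketch only: overrides a handful of chart values for one runtime.
modelsRuntime:
  - name: sdruntime
    namespace: default
    type: sdwebui
    extraValues:
      karpenter:
        provisioner:
          instanceType:
            - "g5.xlarge"
            - "g5.2xlarge"       # illustrative; must be available in your region
          capacityType:
            onDemand: true
            spot: true           # with both enabled, Spot is prioritized
          resourceLimits:
            nvidia.com/gpu: 10   # illustrative; the documented default is 100
      runtime:
        scaling:
          # With queueLength 10, a backlog of 30 messages targets 30 / 10 = 3 replicas.
          queueLength: 10
          maxReplicaCount: 5     # illustrative; the documented default is 20
          minReplicaCount: 0
```

Parameters marked as Populated by CDK, such as global.awsRegion or runtime.queueAgent.sqsQueueUrl, are left out because any manually set values would be overridden.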