Configuration

When deploying the project, you will generate a config file. This page aims to explain the options.

Before you start, please read the precautions and security pages.

The configuration lets you define which AWS resources to create. For an overview of which resources are required or optional, please refer to the resource page.

Prefix

Set a prefix for the resource names, including the CloudFormation stack name. It is useful if you plan to deploy this project multiple times in the same AWS account and region.
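As an illustration of the effect, here is a minimal sketch of how a prefix could be applied to resource names. The helper `prefixed_name` and the base name `GenAIChatBotStack` are hypothetical and only illustrate the idea; the project's actual naming logic may differ.

```python
def prefixed_name(prefix: str, base_name: str) -> str:
    """Return base_name with the deployment prefix prepended (hypothetical helper)."""
    # An empty prefix leaves the name unchanged.
    return f"{prefix}{base_name}" if prefix else base_name


# Two deployments in the same account and region end up with distinct names.
dev_stack = prefixed_name("dev", "GenAIChatBotStack")
test_stack = prefixed_name("test", "GenAIChatBotStack")
print(dev_stack, test_stack)
```

Because every name carries the prefix, the two deployments do not collide on resource names.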

Use an existing Amazon VPC

Add the project to an existing Amazon Virtual Private Cloud (Amazon VPC). Note that the VPC must have private subnets that can connect to the Internet (for example, when crawling a website to populate a RAG workspace).

If enabled, you will need to specify the VpcID, which can be found in the console or using the CLI.
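Before entering the value, it can help to check that it has the shape of a VPC ID. The sketch below assumes VPC IDs are `vpc-` followed by 8 or 17 lowercase hexadecimal characters; the function name `looks_like_vpc_id` is hypothetical and is not part of the project. It only checks the format, not whether the VPC exists.

```python
import re

# Assumption: "vpc-" followed by 8 (older) or 17 (current) hex characters.
VPC_ID_PATTERN = re.compile(r"vpc-(?:[0-9a-f]{8}|[0-9a-f]{17})")


def looks_like_vpc_id(value: str) -> bool:
    """Return True if value has the expected VPC ID format (format check only)."""
    return VPC_ID_PATTERN.fullmatch(value) is not None


print(looks_like_vpc_id("vpc-0123456789abcdef0"))
print(looks_like_vpc_id("subnet-1a2b3c4d"))
```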

If disabled, the project will create a new VPC with one NAT Gateway and VPC Flow Logs enabled.

Create KMS Customer managed Keys (CMK)

When enabled, the project will create 2 customer managed keys that will be used, when possible, to encrypt the data at rest.

Retain on Delete

When enabled, every resource storing data will be retained on delete (for example, a log group, an S3 bucket, a table...). This means that cleanup will skip the deletion of these AWS resources.

This capability is recommended to prevent data deletion.

Enable Amazon Bedrock

If Amazon Bedrock is enabled, the foundation models available in Bedrock can be used by the project. Note that the models have to be enabled before they can be used. Please refer to the models requirements for more information.

Enable Amazon Bedrock Guardrails

Amazon Bedrock Guardrails can be used to implement safeguards when using foundation models provided by Amazon Bedrock. To use this feature, you will first need to create and configure a guardrail.

At this time, the Guardrails configuration is not created by the project. Please use the console or create your own configuration using CDK.

Once it is configured, you will need to provide the ID of the guardrail and its version. If you select DRAFT as the version, the project will use the working draft, which can be changed without requiring a new deployment.
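The two values can be checked before deploying. The sketch below is only an illustration: the `GuardrailSettings` class, its field names, and the example ID are hypothetical, and it assumes a version is either `DRAFT` or a positive integer written as a string.

```python
import re
from dataclasses import dataclass

# Assumption: a published version is a positive integer such as "1" or "12".
NUMERIC_VERSION = re.compile(r"[1-9][0-9]*")


@dataclass(frozen=True)
class GuardrailSettings:
    identifier: str  # ID of the guardrail created beforehand
    version: str     # "DRAFT" or a published version number


def validate_guardrail(settings: GuardrailSettings) -> list[str]:
    """Return a list of problems; an empty list means the settings look usable."""
    errors = []
    if not settings.identifier.strip():
        errors.append("guardrail identifier is required")
    if settings.version != "DRAFT" and not NUMERIC_VERSION.fullmatch(settings.version):
        errors.append('guardrail version must be "DRAFT" or a positive integer')
    return errors


print(validate_guardrail(GuardrailSettings("example-guardrail-id", "DRAFT")))
print(validate_guardrail(GuardrailSettings("example-guardrail-id", "latest")))
```

Using a numbered version pins the behavior until the next deployment, whereas DRAFT follows whatever the working draft currently contains.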

Use Amazon SageMaker models

Enabling Amazon SageMaker will deploy a SageMaker endpoint for each model selected. For more details about this feature, please refer to the self-hosted models documentation and the models requirements.

Creating SageMaker endpoints has cost implications because they are not serverless resources. You will also need to verify the license requirements of the models you plan to use.

As a cost-saving option, the configuration allows you to run the endpoints on a schedule. For more details, please refer to the following page. Please note that attempting to re-deploy while the endpoints are not running will cause a failure.

In addition, if the model source is HuggingFace, it might require authentication. For more details, please refer to the models requirements.

Please note that, as an alternative managed by AWS, the project supports Amazon SageMaker JumpStart for the models Mistral-7B-Instruct-v0.3 and meta-LLama2-13b-chat.

Enable Retrieval-augmented generation (RAG)

Enabling this option will allow you to create workspaces, upload documents and websites. When using the Chatbot, a workspace can be used to give more information to the model.

Deploy default embedding and cross-encoder models via SageMaker

When RAG is enabled, you can enable this option to deploy an Amazon SageMaker endpoint providing re-ranking capabilities and embedding generation.

The models available when deploying the default endpoint are intfloat/multilingual-e5-large, sentence-transformers/all-MiniLM-L6-v2 and cross-encoder/ms-marco-MiniLM-L-12-v2. If enabled, please consider the cost of using SageMaker and the models requirements.

For more information about this default endpoint and how to update the models, please refer to the following page.

Select RAG Workspaces engines

Four engines are available by default (each of them is optional)

If Amazon Aurora or OpenSearch is selected, you will also need to select a default embedding model to generate the vectors. To use a serverless option, Amazon Bedrock supports the Titan embedding models.

For more details, please refer to the document retrieval page, which explains how to add additional engines.

Advanced settings

API Throttling

To protect the environment against sudden traffic increases, the project throttles incoming requests by IP using AWS WAF rate limit rules. As part of the configuration, you can select 2 thresholds:

  • Rate limit per IP on the SendQuery mutation invoking the large language models.
  • Rate limit per IP on all the GraphQL APIs.

Please note that the throttling rules are based on the IP. These limits could be an issue if your users all share the same IP.
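To see why a shared IP can be a problem, consider the toy model below. It is not how AWS WAF is implemented; it is a simplified sliding-window counter keyed by IP, with hypothetical names (`PerIpRateLimiter`, `allow`) and made-up limits, meant only to show that all users behind one address consume the same budget.

```python
from collections import defaultdict, deque


class PerIpRateLimiter:
    """Toy sliding-window limiter keyed by client IP (illustration only)."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # IP -> timestamps of accepted requests

    def allow(self, ip: str, now: float) -> bool:
        hits = self._hits[ip]
        # Forget requests that are older than the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # this IP has used up its budget for the window
        hits.append(now)
        return True


limiter = PerIpRateLimiter(limit=3, window_seconds=300)
# Four different users behind the same NAT address send one request each.
shared = [limiter.allow("203.0.113.10", now=t) for t in (0, 1, 2, 3)]
# A user on another address is unaffected.
other = limiter.allow("198.51.100.7", now=3)
print(shared, other)
```

In this model the fourth user is rejected even though each person sent a single request, which is the situation to keep in mind when choosing the thresholds for users behind a corporate proxy or VPN.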

Log retention

Defines how long the application and access logs are retained. For more information, please refer to the logging documentation.

Advanced monitoring

When enabled, it will create alarms and model metrics, and enable AWS X-Ray. For more information, please refer to the monitoring page.

Create VPC Endpoints

A VPC endpoint allows you to privately connect your VPC to supported AWS services. Enabling this option creates VPC endpoints for the following AWS services: S3, DynamoDB, Secrets Manager, SageMaker, AppSync, Lambda, SNS, Step Functions, SSM, KMS, Bedrock, Kendra, RDS, ECS, Batch, EC2.

Using this capability improves security and could reduce cost depending on your usage. However, VPC endpoints are persistent resources, so please consider their cost when using this capability.

Private Website (Front-End only)

Enabling this setting deploys the front-end inside the VPC using an internal Application Load Balancer, similar to the solution described here. This option can be relevant if you plan to access the chatbot privately (via a VPN, for example).

Please note that if it is disabled, the front-end is deployed to Amazon CloudFront.

For more information about this feature and how to set it up, please refer to the following page.

Custom Domain

When using this option, the project attaches a certificate to either the Amazon CloudFront distribution or the Application Load Balancer (if the private website is used).

To use this capability, you will first need to create the certificate and provide its ARN once it is active.

Then you will need to create a DNS record pointing to the resource. If Amazon CloudFront is used, please refer to the documentation. If a private website is used, please refer to this documentation.

Please refer to the documentation for more details.
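When providing the certificate ARN, a common pitfall is the certificate's region: Amazon CloudFront only accepts AWS Certificate Manager certificates from us-east-1, while an Application Load Balancer needs a certificate from its own region. The sketch below checks this from the ARN alone. The function names and the `private_website` and `deployment_region` parameters are hypothetical and not part of the project, and the ARN shown is a placeholder.

```python
def parse_acm_certificate_arn(arn: str) -> dict:
    """Split an ACM certificate ARN into its parts, or raise ValueError."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "acm":
        raise ValueError(f"not an ACM ARN: {arn!r}")
    if not parts[5].startswith("certificate/"):
        raise ValueError(f"not a certificate ARN: {arn!r}")
    return {
        "partition": parts[1],
        "region": parts[3],
        "account": parts[4],
        "certificate_id": parts[5].split("/", 1)[1],
    }


def certificate_region_ok(arn: str, private_website: bool, deployment_region: str) -> bool:
    """Check that the certificate lives in the region its target requires."""
    region = parse_acm_certificate_arn(arn)["region"]
    # CloudFront (public website) needs us-east-1; the load balancer
    # (private website) needs the region where it is deployed.
    required = deployment_region if private_website else "us-east-1"
    return region == required


arn = "arn:aws:acm:us-east-1:123456789012:certificate/12345678-1234-1234-1234-123456789012"
print(certificate_region_ok(arn, private_website=False, deployment_region="eu-west-1"))
print(certificate_region_ok(arn, private_website=True, deployment_region="eu-west-1"))
```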

Geo restriction

When Private Website is disabled, you can restrict access per location using the Amazon CloudFront geo restriction capability. For more details, please refer to the geo restriction page.
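The constraint above can be expressed as a small consistency check. This is a sketch under assumptions: the function name and its parameters are hypothetical, and it assumes locations are given as two-letter uppercase ISO 3166-1 alpha-2 country codes, which is the format CloudFront geo restrictions use. It checks only the shape of each code, not whether the code is assigned to a country.

```python
import re

# Assumption: two uppercase letters, for example "US" or "FR".
COUNTRY_CODE = re.compile(r"[A-Z]{2}")


def validate_geo_restriction(private_website: bool, countries: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the settings are consistent."""
    errors = []
    if private_website and countries:
        errors.append("geo restriction applies only when Private Website is disabled")
    for code in countries:
        if not COUNTRY_CODE.fullmatch(code):
            errors.append(f"not a two-letter uppercase country code: {code!r}")
    return errors


print(validate_geo_restriction(private_website=False, countries=["US", "CA"]))
print(validate_geo_restriction(private_website=True, countries=["usa"]))
```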

Cognito Federation

The project relies on Amazon Cognito and supports federation with external identity providers (using OpenID Connect (OIDC) and SAML 2.0). To enable this feature, please refer to the federation page.

This library is licensed under the MIT-0 License.