The AI/ML Security Assessment Framework is a serverless, multi-account security assessment solution for AWS AI/ML workloads. It performs 52 core security checks across Amazon Bedrock, Amazon SageMaker AI, and Amazon Bedrock AgentCore, with an optional 64-check Financial Services GenAI risk assessment, generating interactive HTML reports with findings and remediation guidance.



- (1-aiml-security-member-roles.yaml) AIMLSecurityMemberRole to all target accounts
- (2-aiml-security-codebuild.yaml) MultiAccountCodeBuildRole with cross-account access permissions
- MultiAccountListOverride
- AIMLSecurityMemberRole in each target account
- enableFinServ from the deployment parameter

sample-aiml-security-assessment/
├── aiml-security-assessment/
│ ├── functions/security/
│ │ ├── bedrock_assessments/ # Bedrock security checks (14)
│ │ ├── sagemaker_assessments/ # SageMaker security checks (25)
│ │ ├── agentcore_assessments/ # AgentCore security checks (13)
│ │ ├── finserv_assessments/ # Optional Financial Services GenAI risk checks (64)
│ │ ├── finserv_tests/ # FinServ-specific unit and coverage tests
│ │ ├── iam_permission_caching/ # AWS IAM permissions cache
│ │ ├── cleanup_bucket/ # Amazon S3 cleanup
│ │ ├── resolve_regions/ # Multi-region resolution Lambda
│ │ └── generate_consolidated_report/ # HTML/CSV report generation
│ ├── statemachine/ # AWS Step Functions definition
│ ├── images/ # SAM application images
│ ├── template.yaml # AWS SAM template (single-account)
│ ├── template-multi-account.yaml # AWS SAM template (multi-account)
│ ├── samconfig.toml # SAM deployment configuration
│ ├── envvars.json # Environment variables for local testing
│ └── testfile.json # Test event file for local invocation
├── deployment/ # AWS CloudFormation templates
├── docs/ # Documentation
│ ├── DEVELOPER_GUIDE.md # This guide
│ ├── SECURITY_CHECKS.md # Security checks reference
│ ├── TROUBLESHOOTING.md # Troubleshooting guide
│ ├── diagrams/ # Architecture diagrams
│ └── icons/ # AWS service icons
├── sample-reports/ # Sample assessment reports
│ ├── scripts/ # Screenshot capture scripts
│ ├── *.html # Sample HTML reports
│ └── *.png # Report screenshots
├── tests/ # Unit tests for assessment functions
│ └── requirements.txt # Test dependencies
├── .github/workflows/ # PR lint, test, SAM validate, and security scans
├── buildspec.yml # AWS CodeBuild orchestration
└── consolidate_html_reports.py # Multi-account report consolidation
# buildspec.yml execution flow
1. Get active accounts from AWS Organizations
2. For each account:
- Assume AIMLSecurityMemberRole
- Deploy AI/ML assessment stack through AWS SAM
- Start AWS Step Functions execution
3. Wait for completion and consolidate results
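The flow above can be sketched in Python as a loop over accounts. This is an illustration only, not the contents of buildspec.yml: `assess_account` is a hypothetical callable standing in for the assume-role, SAM deploy, and Step Functions steps, and the account dictionaries are assumed to follow the `Id`/`Status` shape returned by the AWS Organizations `ListAccounts` API. Recording a failure and continuing with the next account is a design choice of this sketch.

```python
from typing import Callable, Dict, List


def run_for_active_accounts(
    accounts: List[Dict[str, str]],
    assess_account: Callable[[str], str],
) -> Dict[str, str]:
    """Run assess_account for every ACTIVE account and collect the outcomes.

    A failure in one account is recorded and does not stop the others.
    """
    results: Dict[str, str] = {}
    for account in accounts:
        # Skip accounts that are suspended or pending closure
        if account.get("Status") != "ACTIVE":
            continue
        account_id = account["Id"]
        try:
            results[account_id] = assess_account(account_id)
        except Exception as exc:  # keep going so other accounts are still assessed
            results[account_id] = f"FAILED: {exc}"
    return results
```

Passing the per-account work in as a callable keeps the loop testable without AWS credentials.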
{
  "Comment": "AI/ML Assessment Module",
  "StartAt": "Cleanup S3 Bucket",
  "States": {
    "Cleanup S3 Bucket": {
      "Type": "Task",
      "Next": "IAM Permission Caching"
    },
    "IAM Permission Caching": {
      "Type": "Task",
      "Next": "Resolve Target Regions"
    },
    "Resolve Target Regions": {
      "Type": "Task",
      "Comment": "Resolves target regions from TARGET_REGIONS env var",
      "Next": "Scan Regions"
    },
    "Scan Regions": {
      "Type": "Map",
      "ItemsPath": "$.ResolvedRegions.regions",
      "MaxConcurrency": "${MaxRegionConcurrency}",
      "ItemProcessor": {
        "ProcessorConfig": {"Mode": "INLINE"},
        "StartAt": "Run Security Assessments",
        "States": {
          "Run Security Assessments": {
            "Type": "Parallel",
            "Branches": [
              {"StartAt": "Bedrock Security Assessment", "States": {...}},
              {"StartAt": "Sagemaker Security Assessment", "States": {...}},
              {"StartAt": "AgentCore Security Assessment", "States": {...}},
              {
                "StartAt": "FinServ Enabled?",
                "States": {
                  "FinServ Enabled?": {
                    "Type": "Choice",
                    "Comment": "Runs FinServ only when enableFinServ is true and RegionIndex is 0"
                  },
                  "FinServ Security Assessment": {"Type": "Task", "Resource": "arn:aws:states:::lambda:invoke", "End": true},
                  "FinServ Assessment Skipped": {"Type": "Pass", "End": true}
                }
              }
            ],
            "End": true
          }
        }
      },
      "Next": "Generate Consolidated Report"
    },
    "Generate Consolidated Report": {
      "Type": "Task",
      "End": true
    }
  }
}
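The Resolve Target Regions step feeds the Map state through `$.ResolvedRegions.regions`. The sketch below shows one way such a list could be built. It is not the code of the resolve_regions Lambda: the comma-separated input format, the `resolve_regions` function name, the default region, and the `Region`/`RegionIndex` item shape are all assumptions made for illustration.

```python
from typing import Dict, List


def resolve_regions(raw: str, default_region: str = "us-east-1") -> List[Dict[str, object]]:
    """Turn a comma-separated region string into a list of Map state items."""
    names: List[str] = []
    for part in raw.split(","):
        name = part.strip().lower()
        # Drop empty entries and duplicates while keeping the original order
        if name and name not in names:
            names.append(name)
    if not names:
        # Assumed fallback when no region is configured
        names = [default_region]
    return [{"Region": name, "RegionIndex": index} for index, name in enumerate(names)]
```

Giving each item an index lets a later state single out the first region iteration.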
The framework includes 52 core security checks across three AI/ML services, plus 64 optional Financial Services GenAI risk checks when EnableFinServAssessment is enabled. For the complete list of checks with descriptions, see the Security Checks Reference.
Each core service assessment AWS Lambda function:
- region_name parameter
- (Region column)

The Financial Services assessment Lambda is different. It is deployed in both SAM templates, but Step Functions invokes it only when the execution input includes "enableFinServ": "true" and only from the first region iteration (RegionIndex == 0). It receives the full TargetRegions list and emits FinServ findings with Region values so the report can display the same regional filters as the core services.
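The FinServ gating rule can be expressed as a small predicate. This is a Python restatement of the condition described above, not the Choice state itself; `should_run_finserv` is a hypothetical name, and the exact string comparison against "true" is an assumption based on the execution input shown in this guide.

```python
def should_run_finserv(execution_input: dict, region_index: int) -> bool:
    """Return True only when FinServ is enabled and this is the first region iteration."""
    # The flag is passed as the string "true", not a JSON boolean
    enabled = execution_input.get("enableFinServ") == "true"
    return enabled and region_index == 0
```

Running the assessment from a single iteration avoids emitting the same FinServ findings once per region.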
Additional Functions:
- TargetRegions parameter for the Map state

To add a new AI/ML service (for example, Amazon Comprehend or Amazon Textract):
# Example: Adding Comprehend security assessment
mkdir -p aiml-security-assessment/functions/security/comprehend_assessments
cd aiml-security-assessment/functions/security/comprehend_assessments
```python
import boto3
import os
import json
from botocore.config import Config
from botocore.exceptions import ClientError, EndpointConnectionError
from schema import create_finding

boto3_config = Config(retries=dict(max_attempts=10, mode="adaptive"))


def lambda_handler(event, context):
    """Main assessment handler for new service"""
    all_findings = []

    # Extract target region from Step Functions Map state
    region = event.get("Region", os.environ.get("AWS_REGION", "us-east-1"))

    # Verify service availability in this region
    try:
        test_client = boto3.client("comprehend", config=boto3_config, region_name=region)
        test_client.list_endpoints(MaxResults=1)
    except (EndpointConnectionError, Exception) as e:
        if "Could not connect to the endpoint URL" in str(e):
            # Service not available; return N/A finding
            ...
            return {"statusCode": 200, "body": {"message": f"Service not available in {region}"}}

    # Get cached permissions
    execution_id = event["Execution"]["Name"]
    permission_cache = get_permissions_cache(execution_id)

    # Run assessment checks (pass region to each)
    findings = check_new_service_security(permission_cache, region=region)
    all_findings.append(findings)

    # Generate and upload report (include region in S3 key)
    csv_content = generate_csv_report(all_findings)
    bucket_name = os.environ.get("AIML_ASSESSMENT_BUCKET_NAME")
    s3_url = write_to_s3(execution_id, csv_content, bucket_name, region=region)

    return {
        "statusCode": 200,
        "body": {
            "message": "New service assessment completed",
            "findings": all_findings,
            "report_url": s3_url,
        },
    }


def check_new_service_security(permission_cache, region: str = ""):
    """Implement your security checks here"""
    findings = {
        "check_name": "New Service Security Check",
        "status": "PASS",
        "details": "",
        "csv_data": [],
    }

    # Create regional client
    client = boto3.client("comprehend", config=boto3_config, region_name=region)

    # Your assessment logic here
    # Pass region= to all create_finding() calls

    return findings
```
# requirements.txt
boto3>=1.26.0
botocore>=1.29.0
```python
from enum import Enum


class SeverityEnum(str, Enum):
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"
    INFORMATIONAL = "Informational"


class StatusEnum(str, Enum):
    FAILED = "Failed"
    PASSED = "Passed"
    NA = "N/A"


def create_finding(
    check_id, finding_name, finding_details, resolution, reference, severity, status, region=""
):
    """Create standardized finding format

    Args:
        check_id: Unique check identifier (for example, SM-01, BR-01, AC-01)
        finding_name: Name of the finding
        finding_details: Detailed description
        resolution: Steps to resolve (empty string for N/A status)
        reference: Documentation URL
        severity: SeverityEnum value
        status: StatusEnum value (Failed, Passed, or N/A)
        region: AWS region where the finding was identified
    """
    return {
        "Check_ID": check_id,
        "Finding": finding_name,
        "Finding_Details": finding_details,
        "Resolution": resolution,
        "Reference": reference,
        "Severity": severity,
        "Status": status,
        "Region": region,
    }
```
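To show how the standardized finding shape lends itself to CSV output, here is a minimal sketch. The `create_finding()` body mirrors the helper above so the example is self-contained; `findings_to_csv` is a hypothetical stand-in and not the repository's own CSV report function, and the check ID `CM-01` is a made-up example value.

```python
import csv
import io

FIELDNAMES = [
    "Check_ID", "Finding", "Finding_Details", "Resolution",
    "Reference", "Severity", "Status", "Region",
]


def create_finding(check_id, finding_name, finding_details, resolution,
                   reference, severity, status, region=""):
    """Same dictionary shape as the create_finding() helper in schema.py."""
    return dict(zip(FIELDNAMES, [check_id, finding_name, finding_details, resolution,
                                 reference, severity, status, region]))


def findings_to_csv(findings):
    """Hypothetical helper: serialize a list of finding dicts to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(findings)
    return buffer.getvalue()


finding = create_finding(
    "CM-01",  # made-up check ID for illustration
    "Endpoint encryption",
    "No endpoints found",
    "",  # empty resolution for N/A status
    "https://docs.aws.amazon.com/comprehend/",
    "Informational",
    "N/A",
    region="us-east-1",
)
print(findings_to_csv([finding]))
```

Because every check returns the same keys, the CSV header can be fixed once and reused for all services.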
Add your new function to both SAM templates:
- aiml-security-assessment/template.yaml
- aiml-security-assessment/template-multi-account.yaml

ComprehendSecurityAssessmentFunction:
  Type: AWS::Serverless::Function
  Properties:
    FunctionName: !Sub 'aiml-security-${AWS::StackName}-ComprehendAssessment'
    CodeUri: functions/security/comprehend_assessments/
    Handler: app.lambda_handler
    Runtime: python3.12
    Timeout: 600
    MemorySize: 1024
    Environment:
      Variables:
        AIML_ASSESSMENT_BUCKET_NAME: !Ref AIMLAssessmentBucket
        TARGET_REGIONS: !Ref TargetRegions
    Policies:
      - S3CrudPolicy:
          BucketName: !Ref AIMLAssessmentBucket
      - Statement:
          - Sid: ComprehendReadPermissions
            Effect: Allow
            Action:
              - comprehend:List*
              - comprehend:Describe*
              - comprehend:Get*
            Resource: '*'
Add the new service to the Run Security Assessments parallel branch inside the Scan Regions Map state in aiml-security-assessment/statemachine/assessments.asl.json. Also add the function ARN substitution and LambdaInvokePolicy for the new function in both SAM templates.
{
  "Run Security Assessments": {
    "Type": "Parallel",
    "Branches": [
      {
        "StartAt": "Bedrock Security Assessment",
        "States": {"Bedrock Security Assessment": {"Type": "Task", "Resource": "arn:aws:states:::lambda:invoke", "End": true}}
      },
      {
        "StartAt": "SageMaker Security Assessment",
        "States": {"SageMaker Security Assessment": {"Type": "Task", "Resource": "arn:aws:states:::lambda:invoke", "End": true}}
      },
      {
        "StartAt": "AgentCore Security Assessment",
        "States": {"AgentCore Security Assessment": {"Type": "Task", "Resource": "arn:aws:states:::lambda:invoke", "End": true}}
      },
      {
        "StartAt": "Comprehend Security Assessment",
        "States": {"Comprehend Security Assessment": {"Type": "Task", "Resource": "arn:aws:states:::lambda:invoke", "End": true}}
      }
    ]
  }
}
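When a branch is added by hand, a mismatch between `StartAt` and the state name is an easy mistake to make. The sketch below is a hypothetical pre-deployment check, not part of the repository; it assumes the Parallel state has already been loaded into a Python dictionary (for example with `json.load`), and `find_broken_branches` is an illustrative name.

```python
def find_broken_branches(parallel_state: dict) -> list:
    """Return the StartAt names that do not match any state in their own branch."""
    broken = []
    for branch in parallel_state.get("Branches", []):
        start = branch.get("StartAt")
        # Each branch must define the state that its StartAt points to
        if start not in branch.get("States", {}):
            broken.append(start)
    return broken


example = {
    "Type": "Parallel",
    "Branches": [
        {
            "StartAt": "Comprehend Security Assessment",
            "States": {"Comprehend Security Assessment": {"Type": "Task", "End": True}},
        },
        {
            # Deliberate casing mismatch between StartAt and the state name
            "StartAt": "Sagemaker Security Assessment",
            "States": {"SageMaker Security Assessment": {"Type": "Task", "End": True}},
        },
    ],
}
print(find_broken_branches(example))
```

State names are case-sensitive, so the second branch in the example is reported.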
Add required permissions to every role that may deploy or run the new service assessment:
In deployment/1-aiml-security-member-roles.yaml:
- Effect: Allow
  Action:
    - comprehend:List*
    - comprehend:Describe*
    - comprehend:Get*
  Resource: '*'
In deployment/aiml-security-single-account.yaml (for single account mode):
- comprehend:List*
- comprehend:Describe*
- comprehend:Get*
In deployment/2-aiml-security-codebuild.yaml (for management-account multi-account mode):
- comprehend:List*
- comprehend:Describe*
- comprehend:Get*
Also add runtime permissions to the new Lambda role statements in both SAM templates if the new service function needs service-specific access at execution time.
Test your new assessment function locally:
cd aiml-security-assessment
sam build --template template.yaml
sam local invoke ComprehendSecurityAssessmentFunction --event testfile.json
- create_finding() function for consistent output

Status values:

- Passed: Resources were checked and met the assessed best practice
- Failed: Resources were checked and found non-compliant
- N/A: No resources exist to check (for example, “No notebooks found”, “No guardrails configured”)

Severity levels:

- High: Critical security issues requiring immediate attention
- Medium: Important security improvements recommended
- Low: Minor optimizations suggested
- Informational: Advisory information, no action required
- N/A: Check not applicable (typically paired with N/A status)

try:
    # Assessment logic
    result = aws_client.describe_service()
except ClientError as e:
    # Services report permission failures under different error codes
    if e.response["Error"]["Code"] in ("AccessDenied", "AccessDeniedException"):
        # Handle permission issues
        logger.warning(f"Access denied for service check: {str(e)}")
        return create_finding(
            check_id="XX-00",  # placeholder; use the ID of the check being run
            finding_name="Permission Check",
            finding_details="Insufficient permissions to assess service",
            resolution="Grant required permissions to assessment role",
            reference="https://docs.aws.amazon.com/service/permissions",
            severity="High",
            status="Failed",
        )
    else:
        # Handle other AWS errors
        logger.error(f"AWS API error: {str(e)}")
        raise
except Exception as e:
    # Handle unexpected errors
    logger.error(f"Unexpected error: {str(e)}", exc_info=True)
    raise
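AWS services do not all use the same error code for permission failures, so a check that compares against a single code can miss some of them. The helper below is a sketch, not code from the repository: `is_access_denied` and `ACCESS_DENIED_CODES` are hypothetical names, and the function takes the dictionary that botocore exposes as `ClientError.response`.

```python
# Codes commonly returned for permission failures; extend as needed per service
ACCESS_DENIED_CODES = {"AccessDenied", "AccessDeniedException", "UnauthorizedOperation"}


def is_access_denied(error_response: dict) -> bool:
    """Return True when an AWS error response describes a permission failure."""
    code = error_response.get("Error", {}).get("Code", "")
    return code in ACCESS_DENIED_CODES


# Usage inside an except block would look like: is_access_denied(e.response)
print(is_access_denied({"Error": {"Code": "AccessDeniedException", "Message": "denied"}}))
```

Keeping the codes in one set means a newly encountered code only needs to be added in a single place.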
# Test an individual SAM function
cd aiml-security-assessment
sam build --template template.yaml
sam local invoke NewServiceSecurityAssessmentFunction --event test-event.json
# Deploy to test account
sam deploy --stack-name aiml-security-test --capabilities CAPABILITY_IAM
# Execute AWS Step Functions
aws stepfunctions start-execution \
--state-machine-arn arn:aws:states:region:account:stateMachine:TestStateMachine \
--input '{"accountId":"123456789012","enableFinServ":"false"}'
For detailed troubleshooting guidance, common issues, and debugging tips, see the Troubleshooting Guide.
- MaxRegionConcurrency
- EnableFinServAssessment; the Lambda is deployed by default but invoked only when enabled

Report generation uses a single shared template (report_template.py) for both deployment modes:
aiml-security-assessment/functions/security/generate_consolidated_report/
├── app.py # Lambda handler (single-account)
├── report_template.py # Shared HTML/CSS/JS template
└── ...
consolidate_html_reports.py # CodeBuild script (multi-account)
| Component | Mode | Description |
|---|---|---|
| app.py (AWS Lambda) | mode='single' | Generates per-account HTML reports during AWS Step Functions execution |
| consolidate_html_reports.py | mode='multi' | Consolidates all account reports in AWS CodeBuild post-build phase |
Both call generate_html_report() from report_template.py with different parameters.
To update report styling, layout, or features:
- report_template.py only - changes apply to both single and multi-account reports
- python -m pytest test_generate_report.py -v
- get_html_template() - HTML/CSS/JS structure
- generate_table_rows() - Finding row generation
- generate_html_report() - Main entry point with mode parameter ('single' or 'multi')

When you modify the report template or add new features, update the sample reports and screenshots:
After making changes to report_template.py, regenerate sample reports from a fresh assessment run or from the local report test fixtures. The existing test_generate_report.py file is a pytest/unittest test module, not a standalone --mode/--output CLI.
# Generate local viewable reports from fixtures
cd aiml-security-assessment/functions/security/generate_consolidated_report
python -m pytest test_generate_report.py -k "generate_viewable_report or generate_multi_account_report" -s
The fixture reports are written under aiml-security-assessment/functions/security/generate_consolidated_report/test_reports/. Use them to validate report rendering before refreshing the canonical files in sample-reports/.
The repository includes an automated screenshot capture tool:
# Activate virtual environment
source .venv/bin/activate
# Install dependencies (first time only)
pip install -r sample-reports/dev-requirements.txt
playwright install chromium
# Capture and optimize screenshots
python sample-reports/scripts/capture_screenshots.py
What the script does:
- sample-reports/ folder

What gets generated:
The script captures 4 screenshots:
- dashboard-overview-light.png - Executive dashboard in light mode
- dashboard-overview-dark.png - Executive dashboard in dark mode
- findings-table.png - Detailed findings table with filters
- multi-account-summary.png - Multi-account consolidated view

All screenshots are automatically optimized (target: 200-300KB each, ~600KB total).
Customization:
Edit sample-reports/scripts/capture_screenshots.py to customize:
# Viewport size
VIEWPORT_WIDTH = 1440
VIEWPORT_HEIGHT = 900
# Image quality
JPEG_QUALITY = 85 # Range: 1-100
PNG_OPTIMIZE = True
# Add new screenshots to SCREENSHOTS list
SCREENSHOTS = [
{
"name": "my-screenshot",
"file": "security_assessment_single_account.html",
"description": "My Custom View",
"actions": [
{"type": "wait", "selector": ".element", "timeout": 2000},
{"type": "click", "selector": ".button"},
{"type": "scroll", "position": 500},
],
"clip": {"x": 0, "y": 0, "width": 1440, "height": 800},
}
]
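Action entries such as those in the SCREENSHOTS list above are plain dictionaries, so a typo in a key or type name is easy to miss. The following sketch validates them before a capture run. It is not part of capture_screenshots.py: `validate_actions` and `REQUIRED_KEYS` are hypothetical names, and treating `timeout` as optional for wait actions is an assumption.

```python
# Keys each action type needs, based on the action examples in this guide
REQUIRED_KEYS = {
    "wait": {"selector"},
    "click": {"selector"},
    "scroll": {"position"},
    "wait_time": {"ms"},
}


def validate_actions(actions):
    """Return a list of human-readable problems found in the action dictionaries."""
    errors = []
    for index, action in enumerate(actions):
        kind = action.get("type")
        if kind not in REQUIRED_KEYS:
            errors.append(f"action {index}: unknown type {kind!r}")
            continue
        missing = REQUIRED_KEYS[kind] - set(action)
        if missing:
            errors.append(f"action {index}: missing {sorted(missing)}")
    return errors


print(validate_actions([{"type": "click"}, {"type": "scroll", "position": 500}]))
```

An empty list means every action has a known type and the keys that type needs.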
Available action types:
- wait - Wait for selector (for example, {"type": "wait", "selector": ".metrics", "timeout": 2000})
- click - Click element (for example, {"type": "click", "selector": ".theme-toggle"})
- scroll - Scroll to position (for example, {"type": "scroll", "position": 500})
- wait_time - Wait milliseconds (for example, {"type": "wait_time", "ms": 300})

Troubleshooting:
| Issue | Solution |
|---|---|
| playwright not installed | pip install playwright && playwright install chromium |
| Sample reports not found | Run from repository root |
| Screenshots too large | Lower JPEG_QUALITY or reduce viewport size |
| Browser launch fails | Run playwright install-deps (Linux only) |
After generating new screenshots, update the README to reference them:
### Sample Assessment Reports
**Preview:**

*Executive summary with severity counts and service breakdown*

*Interactive findings table with filtering capabilities*
- dashboard-overview-light.png, not screenshot1.png
- sample-reports/ for easy organization

GitHub Actions workflows run automatically to validate code quality and security on every pull request.
| Workflow | File | What It Checks |
|---|---|---|
| Python Code Quality | .github/workflows/python-lint.yml | ruff check (lint) and ruff format --check (formatting) on changed .py files |
| Python Tests | .github/workflows/python-tests.yml | Runs upstream tests, FinServ tests, and report-pipeline tests in separate pytest sessions |
| CloudFormation Lint | .github/workflows/cfn-lint.yml | Validates deployment and SAM templates with cfn-lint |
| SAM Validate & Build | .github/workflows/sam-validate.yml | Runs sam validate --lint and sam build on SAM templates |
| ASH Security Scan | .github/workflows/ash-security-scan.yml | Scans changed files for secrets, dependency vulnerabilities, and IaC misconfigurations |
Additional workflows run post-merge or on schedule:
| Workflow | File | Trigger |
|---|---|---|
| ASH Full Repository Scan | .github/workflows/ash-full-repository-scan.yml | Push to main, monthly schedule, manual |
| Labeler | .github/workflows/label.yml | Auto-labels PRs by changed paths (bedrock, sagemaker, agentcore, deployment, docs) |
cfn-lint suppressions are configured in .cfnlintrc at the repository root for IAM actions not yet in cfn-lint’s database (for example, bedrock-agentcore actions).
Before pushing, run these checks locally to catch issues early:
# Install tools (first time only)
pip install ruff cfn-lint
pip install -r tests/requirements.txt
pip install "pydantic>=2.0.0"
# Python lint and format
ruff check aiml-security-assessment/functions/security/
ruff format --check aiml-security-assessment/functions/security/
# Unit tests. Run these as separate pytest sessions because multiple
# assessment packages use top-level app.py imports.
export AIML_ASSESSMENT_BUCKET_NAME=test-assessment-bucket
export AWS_DEFAULT_REGION=us-east-1
export AWS_ACCESS_KEY_ID=testing
export AWS_SECRET_ACCESS_KEY=testing
python -m pytest tests/ -v --tb=short
python -m pytest aiml-security-assessment/functions/security/finserv_tests/ -v --tb=short
python -m pytest tests/test_consolidate_finserv.py -v --tb=short
cd aiml-security-assessment/functions/security/generate_consolidated_report
python -m pytest test_generate_report.py -v --tb=short
cd -
# CloudFormation lint
cfn-lint deployment/*.yaml
cfn-lint aiml-security-assessment/template.yaml
cfn-lint aiml-security-assessment/template-multi-account.yaml
# SAM validate and build
cd aiml-security-assessment
sam validate --template template.yaml --lint
sam validate --template template-multi-account.yaml --lint
sam build --template template.yaml
sam build --template template-multi-account.yaml
This developer guide provides the foundation for extending the AI/ML Security Assessment Framework. As you add new AI/ML services and security checks, please update this documentation to help future contributors understand and build upon your work.