sample-aiml-security-assessment

AI/ML Security Assessment Framework - Developer Guide


Architecture Overview

The AI/ML Security Assessment Framework is a serverless, multi-account security assessment solution for AWS AI/ML workloads. It performs 52 security checks across Amazon Bedrock, Amazon SageMaker AI, and Amazon Bedrock AgentCore, generating interactive HTML reports with findings and remediation guidance.

Security Design Principles

Architecture Diagrams

Phase 1: Deployment Setup (AWS CloudFormation)

[Diagram: Deployment Phase]

Phase 2: Assessment Execution (AWS CodeBuild)

[Diagram: Execution Phase]

Service-Level Assessment Architecture

[Diagram: Service-Level Architecture]

Two-Phase Architecture

Phase 1: Infrastructure Deployment

Step 1: Member Account Roles (1-aiml-security-member-roles.yaml)

Step 2: Central Infrastructure (2-aiml-security-codebuild.yaml)

Phase 2: Assessment Execution (AWS CodeBuild Orchestration)

AWS CodeBuild Execution Flow

  1. Account Discovery: Lists active accounts from AWS Organizations
  2. Role Assumption: Assumes AIMLSecurityMemberRole in each target account
  3. AWS SAM Deployment: Deploys the AI/ML assessment stack through AWS SAM
  4. Assessment Execution: Triggers AWS Step Functions workflow in each account
  5. Results Consolidation: Collects and consolidates results from all accounts
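
The discovery and role-assumption steps above could look roughly like the following in Python. This is a minimal sketch, not the project's buildspec logic: the helper names and the session name are hypothetical, the clients are passed in (for example, `boto3.client("organizations")` and `boto3.client("sts")`), and the role ARN assumes the default `aws` partition with no IAM path.

```python
def list_active_accounts(org_client):
    """Return IDs of ACTIVE accounts using the Organizations ListAccounts paginator."""
    account_ids = []
    paginator = org_client.get_paginator("list_accounts")
    for page in paginator.paginate():
        for account in page["Accounts"]:
            # Skip suspended or closing accounts; only ACTIVE ones are assessed.
            if account["Status"] == "ACTIVE":
                account_ids.append(account["Id"])
    return account_ids


def member_role_arn(account_id, role_name="AIMLSecurityMemberRole"):
    """Build the member role ARN (assumes the 'aws' partition and no role path)."""
    return f"arn:aws:iam::{account_id}:role/{role_name}"


def assume_member_role(sts_client, account_id):
    """Assume the member role and return temporary credentials for that account."""
    response = sts_client.assume_role(
        RoleArn=member_role_arn(account_id),
        RoleSessionName=f"aiml-assessment-{account_id}",  # hypothetical session name
    )
    return response["Credentials"]
```

Passing the clients in as arguments keeps the helpers easy to exercise with stub objects before running them against a real organization.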

Project Structure

sample-aiml-security-assessment/
├── aiml-security-assessment/
│   ├── functions/security/
│   │   ├── bedrock_assessments/      # Bedrock security checks (14)
│   │   ├── sagemaker_assessments/    # SageMaker security checks (25)
│   │   ├── agentcore_assessments/    # AgentCore security checks (13)
│   │   ├── iam_permission_caching/   # AWS IAM permissions cache
│   │   ├── cleanup_bucket/           # Amazon S3 cleanup
│   │   └── generate_consolidated_report/  # HTML/CSV report generation
│   ├── statemachine/                 # AWS Step Functions definition
│   ├── images/                       # SAM application images
│   ├── template.yaml                 # AWS SAM template (single-account)
│   ├── template-multi-account.yaml   # AWS SAM template (multi-account)
│   ├── samconfig.toml                # SAM deployment configuration
│   ├── envvars.json                  # Environment variables for local testing
│   └── testfile.json                 # Test event file for local invocation
├── deployment/                       # AWS CloudFormation templates
├── docs/                             # Documentation
│   ├── DEVELOPER_GUIDE.md            # This guide
│   ├── SECURITY_CHECKS.md            # Security checks reference
│   ├── TROUBLESHOOTING.md            # Troubleshooting guide
│   ├── diagrams/                     # Architecture diagrams
│   └── icons/                        # AWS service icons
├── sample-reports/                   # Sample assessment reports
│   ├── scripts/                      # Screenshot capture scripts
│   ├── *.html                        # Sample HTML reports
│   └── *.png                         # Report screenshots
├── buildspec.yml                     # AWS CodeBuild orchestration
├── buildspec-modular-example.yml     # Modular buildspec example
└── consolidate_html_reports.py       # Multi-account report consolidation

Member Account Resources (Deployed by AWS SAM)

Assessment Execution Workflow

AWS CodeBuild Orchestration

# buildspec.yml execution flow
1. Get active accounts from AWS Organizations
2. For each account:
   - Assume AIMLSecurityMemberRole
   - Deploy AI/ML assessment stack through AWS SAM
   - Start AWS Step Functions execution
3. Wait for completion and consolidate results

AWS Step Functions (Per Module)

{
  "Comment": "AI/ML Assessment Module",
  "StartAt": "Cleanup Amazon S3 Bucket",
  "States": {
    "Cleanup Amazon S3 Bucket": {
      "Type": "Task",
      "Resource": "arn:aws:states:::lambda:invoke",
      "Next": "AWS IAM Permission Caching"
    },
    "AWS IAM Permission Caching": {
      "Type": "Task",
      "Resource": "arn:aws:states:::lambda:invoke",
      "Next": "Parallel Service Assessments"
    },
    "Parallel Service Assessments": {
      "Type": "Parallel",
      "Branches": [
        {"StartAt": "Amazon Bedrock Assessment", "States": {...}},
        {"StartAt": "Amazon SageMaker AI Assessment", "States": {...}},
        {"StartAt": "Amazon Bedrock AgentCore Assessment", "States": {...}}
      ],
      "Next": "Generate Consolidated Report"
    },
    "Generate Consolidated Report": {
      "Type": "Task",
      "Resource": "arn:aws:states:::lambda:invoke",
      "End": true
    }
  }
}
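
When editing a definition shaped like the one above, a quick structural check can catch a `Next` or `StartAt` value that points at a state that does not exist. The following is a hypothetical helper, not part of the repository. It only inspects `StartAt`, top-level `Next` fields, and `Parallel` branches, so `Choice` rules and `Catch` handlers are out of scope for this sketch.

```python
def find_dangling_transitions(definition):
    """Return (state, target) pairs whose transition target is not a defined state."""
    states = definition.get("States", {})
    problems = []

    start = definition.get("StartAt")
    if start not in states:
        problems.append(("StartAt", start))

    for name, state in states.items():
        target = state.get("Next")
        if target is not None and target not in states:
            problems.append((name, target))
        # Each Parallel branch is a self-contained state machine; check it recursively.
        for branch in state.get("Branches", []):
            problems.extend(find_dangling_transitions(branch))

    return problems
```

An empty list means every transition the helper looked at resolves to a defined state.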

Assessment Structure

The framework includes 52 security checks across three AI/ML services. For the complete list of checks with descriptions, see the Security Checks Reference.

AWS Lambda Functions

Each assessment AWS Lambda function:

  1. Receives execution context from AWS Step Functions
  2. Reads cached AWS IAM permissions from Amazon S3
  3. Performs security checks against AWS APIs
  4. Generates CSV report with findings
  5. Uploads results to Amazon S3
  6. Returns findings summary to AWS Step Functions
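
Steps 4 and 6 above can be sketched with the standard library alone. This is an illustration under assumptions rather than the repository's implementation: the column order is a guess, the keys follow the finding dictionaries produced by `create_finding()` in `schema.py`, and `summarize_findings` is a hypothetical helper name.

```python
import csv
import io

# Assumed column order; the keys match the dictionaries built by create_finding().
CSV_COLUMNS = [
    "Check_ID",
    "Finding",
    "Finding_Details",
    "Resolution",
    "Reference",
    "Severity",
    "Status",
]


def generate_csv_report(findings):
    """Serialize a list of finding dictionaries to CSV text, header row first."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=CSV_COLUMNS, extrasaction="ignore")
    writer.writeheader()
    for finding in findings:
        writer.writerow(finding)
    return buffer.getvalue()


def summarize_findings(findings):
    """Count findings per Status value, for example {"Failed": 2, "Passed": 5}."""
    summary = {}
    for finding in findings:
        status = finding.get("Status", "N/A")
        summary[status] = summary.get(status, 0) + 1
    return summary
```

Building the CSV in memory keeps the function free of file system access, which suits the Lambda environment where the result is uploaded straight to Amazon S3.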

Additional Functions:

Adding New AI/ML Service Assessments

To add a new AI/ML service (for example, Amazon Comprehend or Amazon Textract), follow the steps below:

Step 1: Create Service Assessment Function

  1. Create Function Directory (One function per service):
    # Example: Adding Comprehend security assessment
    mkdir -p aiml-security-assessment/functions/security/comprehend_assessments
    cd aiml-security-assessment/functions/security/comprehend_assessments
    
  2. Create Function Files:

    ```python
    # app.py
    import json
    import os

    import boto3
    from schema import create_finding

    # get_permissions_cache(), generate_csv_report(), and write_to_s3() are
    # helper functions that you define in this module; they are not shown here.


    def lambda_handler(event, context):
        """Main assessment handler for new service"""
        all_findings = []

        # Get cached permissions
        execution_id = event["Execution"]["Name"]
        permission_cache = get_permissions_cache(execution_id)

        # Run assessment checks
        findings = check_new_service_security(permission_cache)
        all_findings.append(findings)

        # Generate and upload report
        csv_content = generate_csv_report(all_findings)
        bucket_name = os.environ.get("AIML_ASSESSMENT_BUCKET_NAME")
        s3_url = write_to_s3(execution_id, csv_content, bucket_name)

        return {
            "statusCode": 200,
            "body": {
                "message": "New service assessment completed",
                "findings": all_findings,
                "report_url": s3_url,
            },
        }


    def check_new_service_security(permission_cache):
        """Implement your security checks here"""
        findings = {
            "check_name": "New Service Security Check",
            "status": "PASS",
            "details": "",
            "csv_data": [],
        }

        # Your assessment logic here
        # Use permission_cache to check IAM permissions
        # Use AWS SDK to check service configurations

        return findings
    ```
  3. Create Requirements File:
    # requirements.txt
    boto3>=1.26.0
    botocore>=1.29.0
    
  4. Create Schema File:

    ```python
    # schema.py
    from enum import Enum


    class SeverityEnum(str, Enum):
        HIGH = "High"
        MEDIUM = "Medium"
        LOW = "Low"
        INFORMATIONAL = "Informational"
        NA = "N/A"


    class StatusEnum(str, Enum):
        FAILED = "Failed"
        PASSED = "Passed"
        NA = "N/A"


    def create_finding(
        check_id, finding_name, finding_details, resolution, reference, severity, status
    ):
        """Create standardized finding format

        Args:
            check_id: Unique check identifier (for example, SM-01, BR-01, AC-01)
            finding_name: Name of the finding
            finding_details: Detailed description
            resolution: Steps to resolve (empty string for N/A status)
            reference: Documentation URL
            severity: SeverityEnum value
            status: StatusEnum value (Failed, Passed, or N/A)
        """
        return {
            "Check_ID": check_id,
            "Finding": finding_name,
            "Finding_Details": finding_details,
            "Resolution": resolution,
            "Reference": reference,
            "Severity": severity,
            "Status": status,
        }
    ```
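
To show how a check might use `create_finding()`, here is a minimal sketch of one hypothetical Comprehend check. Everything specific to it is an assumption for illustration: the check ID `CP-01`, the function name, the severity chosen for passing results, and the input shape (dictionaries with `EntityRecognizerArn` and an optional `VolumeKmsKeyId`, as one would expect from a Comprehend list call). The schema is repeated in trimmed form only so the block runs on its own.

```python
from enum import Enum


class SeverityEnum(str, Enum):
    HIGH = "High"
    INFORMATIONAL = "Informational"


class StatusEnum(str, Enum):
    FAILED = "Failed"
    PASSED = "Passed"


def create_finding(check_id, finding_name, finding_details, resolution, reference, severity, status):
    """Trimmed copy of the schema.py helper so this sketch is self-contained."""
    return {
        "Check_ID": check_id,
        "Finding": finding_name,
        "Finding_Details": finding_details,
        "Resolution": resolution,
        "Reference": reference,
        "Severity": severity,
        "Status": status,
    }


def check_recognizer_volume_encryption(recognizers):
    """Emit one finding per recognizer, failing those without a volume KMS key."""
    findings = []
    for recognizer in recognizers:
        arn = recognizer["EntityRecognizerArn"]
        encrypted = bool(recognizer.get("VolumeKmsKeyId"))
        findings.append(
            create_finding(
                check_id="CP-01",  # hypothetical ID following the SM-01/BR-01 pattern
                finding_name="Comprehend recognizer volume encryption",
                finding_details=f"{arn}: KMS key {'set' if encrypted else 'not set'}",
                resolution="" if encrypted else "Recreate the recognizer with a KMS key for volume encryption",
                reference="https://docs.aws.amazon.com/comprehend/",  # placeholder reference
                severity=SeverityEnum.INFORMATIONAL if encrypted else SeverityEnum.HIGH,
                status=StatusEnum.PASSED if encrypted else StatusEnum.FAILED,
            )
        )
    return findings
```

Because both enums also inherit from `str`, their members compare equal to the plain strings (`StatusEnum.FAILED == "Failed"`), so the resulting dictionaries can be written to CSV or JSON without extra conversion.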

Step 2: Update AWS SAM Template

Add your new function to aiml-security-assessment/template.yaml:

  ComprehendSecurityAssessmentFunction:
    Type: AWS::Serverless::Function
    Properties:
      FunctionName: !Sub 'ComprehendSecurityAssessment-${AWS::AccountId}'
      CodeUri: functions/security/comprehend_assessments/
      Handler: app.lambda_handler
      Runtime: python3.12
      Timeout: 600
      MemorySize: 1024
      Environment:
        Variables:
          AIML_ASSESSMENT_BUCKET_NAME: !Ref AIMLAssessmentBucket
      Policies:
        - S3CrudPolicy:
            BucketName: !Ref AIMLAssessmentBucket
        - Statement:
            - Sid: ComprehendReadPermissions
              Effect: Allow
              Action:
                - comprehend:List*
                - comprehend:Describe*
                - comprehend:Get*
              Resource: '*'

Step 3: Update AWS Step Functions Definition

Add the new service as an additional branch of the Parallel state in aiml-security-assessment/statemachine/assessments.asl.json:

{
  "Parallel Service Assessments": {
    "Type": "Parallel",
    "Branches": [
      {
        "StartAt": "Bedrock Security Assessment",
        "States": {"Bedrock Security Assessment": {"Type": "Task", "Resource": "arn:aws:states:::lambda:invoke", "End": true}}
      },
      {
        "StartAt": "SageMaker Security Assessment",
        "States": {"SageMaker Security Assessment": {"Type": "Task", "Resource": "arn:aws:states:::lambda:invoke", "End": true}}
      },
      {
        "StartAt": "AgentCore Security Assessment",
        "States": {"AgentCore Security Assessment": {"Type": "Task", "Resource": "arn:aws:states:::lambda:invoke", "End": true}}
      },
      {
        "StartAt": "Comprehend Security Assessment",
        "States": {"Comprehend Security Assessment": {"Type": "Task", "Resource": "arn:aws:states:::lambda:invoke", "End": true}}
      }
    ]
  }
}
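
If you prefer to script this edit, the branch can be appended programmatically. The helper below is a hypothetical sketch that operates on the Parallel state as a Python dictionary. Like the snippet above, it omits the task `Parameters` (such as the Lambda function reference), which a real definition would still need.

```python
import copy


def add_assessment_branch(parallel_state, state_name):
    """Return a copy of a Parallel state with one more single-task branch appended."""
    updated = copy.deepcopy(parallel_state)  # leave the caller's definition untouched
    existing = [branch["StartAt"] for branch in updated["Branches"]]
    if state_name in existing:
        raise ValueError(f"Branch already exists: {state_name}")

    updated["Branches"].append(
        {
            "StartAt": state_name,
            "States": {
                state_name: {
                    "Type": "Task",
                    "Resource": "arn:aws:states:::lambda:invoke",
                    "End": True,
                }
            },
        }
    )
    return updated
```

Rejecting duplicate branch names up front avoids running the same assessment twice in one execution.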

Step 4: Update AWS IAM Permissions

Add the required read-only permissions to the member role template:

In deployment/1-aiml-security-member-roles.yaml:

- Effect: Allow
  Action:
    - comprehend:List*
    - comprehend:Describe*
    - comprehend:Get*
  Resource: '*'

In deployment/aiml-security-single-account.yaml (for single account mode):

- comprehend:List*
- comprehend:Describe*
- comprehend:Get*
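
Before deploying, it can help to confirm that the wildcard patterns above cover the API calls your checks make. The sketch below is a simplified, hypothetical helper: it matches action names case-insensitively against `Allow` patterns only and ignores `Deny` statements, conditions, resources, and service control policies. It uses `fnmatch`, which also interprets `[...]` sequences that IAM does not, so keep patterns to `*` and `?`.

```python
from fnmatch import fnmatchcase


def action_is_allowed(action, allowed_patterns):
    """Return True if the action matches any pattern (IAM actions are case-insensitive)."""
    action = action.lower()
    return any(fnmatchcase(action, pattern.lower()) for pattern in allowed_patterns)


def missing_actions(required_actions, allowed_patterns):
    """Return the required actions that no allowed pattern covers."""
    return [
        action
        for action in required_actions
        if not action_is_allowed(action, allowed_patterns)
    ]
```

For example, with the three Comprehend patterns from this step, `missing_actions(["comprehend:ListEndpoints", "comprehend:DeleteEndpoint"], patterns)` reports only the delete action, which is the intended outcome for a read-only assessment role.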

Step 5: Test Locally

Test your new assessment function locally:

cd aiml-security-assessment
sam build
sam local invoke ComprehendSecurityAssessmentFunction --event testfile.json
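
The handler shown in Step 1 reads `event["Execution"]["Name"]`, so a local test event needs at least that field. The sketch below writes such a minimal event. The helper names, the default file name `local-event.json`, and the execution name are hypothetical, and the repository's own testfile.json may carry additional fields.

```python
import json
from pathlib import Path


def build_test_event(execution_name="local-test-001"):
    """Build a minimal event carrying the execution name the handler reads."""
    return {"Execution": {"Name": execution_name}}


def write_test_event(path="local-event.json", execution_name="local-test-001"):
    """Write the event as pretty-printed JSON and return the event dictionary."""
    event = build_test_event(execution_name)
    Path(path).write_text(json.dumps(event, indent=2) + "\n", encoding="utf-8")
    return event
```

The resulting file can then be passed to `sam local invoke` with the `--event` option.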

Assessment Best Practices

1. Security Check Implementation

2. Performance Optimization

3. Error Handling

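import logging

from botocore.exceptions import ClientError

from schema import create_finding

logger = logging.getLogger(__name__)


def check_service(aws_client):
    try:
        # Assessment logic (describe_service is a placeholder for a real API call)
        result = aws_client.describe_service()
    except ClientError as e:
        # Services report denied calls as "AccessDenied" or "AccessDeniedException"
        if e.response["Error"]["Code"] in ("AccessDenied", "AccessDeniedException"):
            # Handle permission issues
            logger.warning(f"Access denied for service check: {str(e)}")
            return create_finding(
                check_id="XX-00",  # Placeholder; use your check's identifier
                finding_name="Permission Check",
                finding_details="Insufficient permissions to assess service",
                resolution="Grant required permissions to assessment role",
                reference="https://docs.aws.amazon.com/service/permissions",
                severity="High",
                status="Failed",
            )
        else:
            # Handle other AWS errors
            logger.error(f"AWS API error: {str(e)}")
            raise
    except Exception as e:
        # Handle unexpected errors
        logger.error(f"Unexpected error: {str(e)}", exc_info=True)
        raise

    # Evaluate result and return the corresponding finding here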

Testing Your Extensions

1. Local Testing

# Test individual function
cd aiml-security-assessment
sam build
sam local invoke ComprehendSecurityAssessmentFunction --event testfile.json

2. Integration Testing

# Deploy to test account
sam deploy --stack-name aiml-security-test --capabilities CAPABILITY_IAM

# Execute AWS Step Functions
aws stepfunctions start-execution \
  --state-machine-arn arn:aws:states:region:account:stateMachine:TestStateMachine \
  --input '{"accountId":"123456789012"}'
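
After starting an execution, an integration test usually has to wait for it to finish. A minimal polling sketch, assuming a Step Functions client such as `boto3.client("stepfunctions")` is passed in; the function name, poll interval, and poll limit are hypothetical choices.

```python
import time

# Statuses after which a standard execution no longer changes.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "TIMED_OUT", "ABORTED"}


def wait_for_execution(sfn_client, execution_arn, poll_seconds=15, max_polls=120, sleep=time.sleep):
    """Poll DescribeExecution until a terminal status is reached or polls run out."""
    for _ in range(max_polls):
        response = sfn_client.describe_execution(executionArn=execution_arn)
        status = response["status"]
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"Execution still running after {max_polls} polls: {execution_arn}")
```

The `sleep` parameter exists so tests can substitute a no-op and run instantly. In real use the default `time.sleep` applies.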

3. Multi-Account Testing

  1. Deploy member roles to test accounts using AWS CloudFormation StackSets
  2. Deploy central infrastructure with test parameters
  3. Monitor AWS CodeBuild logs for deployment and execution status
  4. Verify results in central Amazon S3 bucket

Monitoring and Debugging

For detailed troubleshooting guidance, common issues, and debugging tips, see the Troubleshooting Guide.

Development Roadmap

Current Status

Potential Additions

Development Pattern

Report Generation Architecture

Shared Template Module

Report generation uses a single shared template (report_template.py) for both deployment modes:

aiml-security-assessment/functions/security/generate_consolidated_report/
├── app.py              # Lambda handler (single-account)
├── report_template.py  # Shared HTML/CSS/JS template
└── ...

consolidate_html_reports.py  # CodeBuild script (multi-account)

How It Works

| Component | Mode | Description |
| --- | --- | --- |
| app.py (AWS Lambda) | mode='single' | Generates per-account HTML reports during AWS Step Functions execution |
| consolidate_html_reports.py | mode='multi' | Consolidates all account reports in AWS CodeBuild post-build phase |

Both call generate_html_report() from report_template.py with different parameters.
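
The idea of one template serving both modes can be illustrated with a deliberately simplified sketch. The real signature of `generate_html_report()` is not shown in this guide, so the example uses separate hypothetical names (`render_report`, `render_rows`) and a hypothetical `Account_ID` field. It only demonstrates switching layout on a `mode` argument.

```python
from html import escape


def render_rows(findings, include_account):
    """Render one table row per finding, adding an account column in multi mode."""
    rows = []
    for finding in findings:
        cells = [finding["Check_ID"], finding["Finding"], finding["Severity"], finding["Status"]]
        if include_account:
            cells.insert(0, finding.get("Account_ID", ""))
        # Escape every cell so finding text cannot inject markup into the report.
        rows.append("<tr>" + "".join(f"<td>{escape(cell)}</td>" for cell in cells) + "</tr>")
    return "\n".join(rows)


def render_report(findings, mode="single"):
    """Build a minimal HTML report; 'multi' mode adds per-account information."""
    if mode not in ("single", "multi"):
        raise ValueError(f"Unsupported mode: {mode}")
    title = "Security Assessment" if mode == "single" else "Multi-Account Security Assessment"
    rows = render_rows(findings, include_account=(mode == "multi"))
    return f"<html><body><h1>{title}</h1><table>\n{rows}\n</table></body></html>"
```

Keeping the mode switch inside one function is what lets a styling change reach both report types at once.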

Modifying the Report Template

To update report styling, layout, or features:

  1. Edit report_template.py only - changes apply to both single and multi-account reports
  2. Run tests: python test_generate_report.py
  3. Key functions:
    • get_html_template() - HTML/CSS/JS structure
    • generate_table_rows() - Finding row generation
    • generate_html_report() - Main entry point with mode parameter (‘single’ or ‘multi’)

Documentation and Screenshots

Updating Sample Reports

When you modify the report template or add new features, update the sample reports and screenshots:

1. Generate New Sample Reports

After making changes to report_template.py, regenerate sample reports:

# Single-account mode
python test_generate_report.py --mode single --output sample-reports/security_assessment_single_account.html

# Multi-account mode
python test_generate_report.py --mode multi --output sample-reports/security_assessment_multi_account.html

2. Capture Screenshots

The repository includes an automated screenshot capture tool:

# Activate virtual environment
source .venv/bin/activate

# Install dependencies (first time only)
pip install -r sample-reports/dev-requirements.txt
playwright install chromium

# Capture and optimize screenshots
python sample-reports/scripts/capture_screenshots.py

What the script does:

What gets generated:

The script captures 4 screenshots:

All screenshots are automatically optimized (target: 200-300KB each, ~600KB total).

Customization:

Edit sample-reports/scripts/capture_screenshots.py to customize:

# Viewport size
VIEWPORT_WIDTH = 1440
VIEWPORT_HEIGHT = 900

# Image quality
JPEG_QUALITY = 85  # Range: 1-100
PNG_OPTIMIZE = True

# Add new screenshots to SCREENSHOTS list
SCREENSHOTS = [
    {
        "name": "my-screenshot",
        "file": "security_assessment_single_account.html",
        "description": "My Custom View",
        "actions": [
            {"type": "wait", "selector": ".element", "timeout": 2000},
            {"type": "click", "selector": ".button"},
            {"type": "scroll", "position": 500},
        ],
        "clip": {"x": 0, "y": 0, "width": 1440, "height": 800},
    }
]

Available action types:

Troubleshooting:

| Issue | Solution |
| --- | --- |
| playwright not installed | pip install playwright && playwright install chromium |
| Sample reports not found | Run from repository root |
| Screenshots too large | Lower JPEG_QUALITY or reduce viewport size |
| Browser launch fails | Run playwright install-deps (Linux only) |

3. Update README

After generating new screenshots, update the README to reference them:

### Sample Assessment Reports

**Preview:**

![Executive Dashboard](sample-reports/dashboard-overview-light.png)
*Executive summary with severity counts and service breakdown*

![Findings Table](sample-reports/findings-table.png)
*Interactive findings table with filtering capabilities*

Documentation Best Practices

CI/CD Workflows

GitHub Actions workflows run automatically to validate code quality and security on every pull request.

PR Checks

| Workflow | File | What It Checks |
| --- | --- | --- |
| Python Code Quality | .github/workflows/python-lint.yml | ruff check (lint) and ruff format --check (formatting) on changed .py files |
| CloudFormation Lint | .github/workflows/cfn-lint.yml | Validates deployment and SAM templates with cfn-lint |
| SAM Validate & Build | .github/workflows/sam-validate.yml | Runs sam validate --lint and sam build on SAM templates |
| ASH Security Scan | .github/workflows/ash-security-scan.yml | Scans changed files for secrets, dependency vulnerabilities, and IaC misconfigurations |

Additional workflows run post-merge or on schedule:

| Workflow | File | Trigger |
| --- | --- | --- |
| ASH Full Repository Scan | .github/workflows/ash-full-repository-scan.yml | Push to main, monthly schedule, manual |
| Labeler | .github/workflows/label.yml | Auto-labels PRs by changed paths (bedrock, sagemaker, agentcore, deployment, docs) |

cfn-lint suppressions are configured in .cfnlintrc at the repository root for IAM actions not yet in cfn-lint’s database (for example, bedrock-agentcore actions).

Running Checks Locally

Before pushing, run these checks locally to catch issues early:

# Install tools (first time only)
pip install ruff cfn-lint pytest boto3 pydantic

# Python lint and format
ruff check aiml-security-assessment/functions/security/
ruff format --check aiml-security-assessment/functions/security/

# Unit tests (213 tests, ~5 seconds, no AWS credentials needed)
python -m pytest tests/ -v

# CloudFormation lint
cfn-lint deployment/*.yaml
cfn-lint aiml-security-assessment/template.yaml
cfn-lint aiml-security-assessment/template-multi-account.yaml

# SAM validate and build
cd aiml-security-assessment
sam validate --template template.yaml --lint
sam build --template template.yaml

Support and Resources

Documentation


This developer guide provides the foundation for extending the AI/ML Security Assessment Framework. As you add new AI/ML services and security checks, please update this documentation to help future contributors understand and build upon your work.