Deployment Guide

This guide covers different deployment scenarios for the Universal Blockchain Node Runner, from development to production environments.

Prerequisites
Quick Start
Deployment Modes
Deployment Scenarios
Best Practices
Post-Deployment
Maintenance
Destroying a Stack
Troubleshooting

Prerequisites

Required Tools

AWS CLI (v2.x or later)
```
aws --version
```
Install: https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html
Node.js (v20.x or later)
```
node --version
```
Install: https://nodejs.org/
Git
```
git --version
```
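
The individual version checks above can be combined into one pre-flight check. The sketch below is an illustration rather than part of the project: `missing_tools` and `major_version` are hypothetical helper names, and `jq` is included only because later commands in this guide rely on it.

```shell
# Print the name of every command in "$@" that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Extract the major version from a string such as "v20.11.0" or "20.11.0".
major_version() {
  version=${1#v}
  printf '%s\n' "${version%%.*}"
}

missing=$(missing_tools aws node git jq)
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing >&2
fi

# Warn if the installed Node.js is older than v20 (only when node exists).
if command -v node >/dev/null 2>&1; then
  if [ "$(major_version "$(node --version)")" -lt 20 ]; then
    echo "Node.js v20.x or later is required" >&2
  fi
fi
```

The script only reports problems; it does not install anything or stop the shell.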

AWS Account Setup

AWS Account: Active AWS account with appropriate permissions
IAM Permissions: To perform the deployment, your IAM user/role needs:
- CloudFormation full access
- EC2 full access
- IAM role creation
- S3 bucket access
- CloudWatch access
- Auto Scaling (for HA deployments)
- Elastic Load Balancing (for HA deployments)
AWS CLI Configuration:
```
aws configure
```
Provide:
- AWS Access Key ID
- AWS Secret Access Key
- Default region
- Output format (json recommended)
Verify Configuration:
```
aws sts get-caller-identity
```

Quick Start

1. Clone and Install

# Clone repository
git clone <repository-url>
cd aws-blockchain-node-runners

# Install dependencies
npm install

2. Bootstrap CDK

First-time setup in each account/region:

npx cdk bootstrap aws://ACCOUNT-ID/REGION

Example:

npx cdk bootstrap aws://123456789012/us-east-1
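
To avoid typing the account ID by hand, the bootstrap target can be assembled from values you already have. This is a minimal sketch; `bootstrap_target` is a hypothetical helper, not a CDK command.

```shell
# Build the "aws://ACCOUNT-ID/REGION" target that `cdk bootstrap` expects.
bootstrap_target() {
  printf 'aws://%s/%s\n' "$1" "$2"
}

bootstrap_target 123456789012 us-east-1   # prints aws://123456789012/us-east-1

# Possible usage with live values (requires configured AWS credentials):
#   ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
#   npx cdk bootstrap "$(bootstrap_target "$ACCOUNT_ID" us-east-1)"
```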

3. Configure Environment

# Copy sample configuration
cp node_modules/aws-bnr-blueprint-dummy/samples/.env-testnet .env

# Edit with your details
nano .env

Minimum required changes:

AWS_ACCOUNT_ID="your-account-id"
AWS_REGION="your-region"

Tip: Run aws sts get-caller-identity to confirm your account ID. The deployment region is always taken from AWS_REGION in your .env — it overrides your AWS CLI profile default, so you can deploy to any region regardless of your profile configuration.

Tip: If deployment fails because the instance type is not available in the default AZ, set AWS_AZ to a specific availability zone where your instance type is supported. For example, add AWS_AZ="us-east-1a" to your .env file. You can check which AZs support your instance type with:
aws ec2 describe-instance-type-offerings --location-type availability-zone --filters Name=instance-type,Values=<type> --region <region>
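
Before deploying, the two minimum settings can be sanity-checked so that a leftover placeholder such as "your-account-id" is caught early. The sketch below is an assumption-laden illustration: `env_value` and `check_env_file` are hypothetical helpers, and the parser only handles plain `KEY="value"` lines without trailing comments.

```shell
# Read the value of KEY ($2) from a dotenv-style file ($1).
# Handles KEY="value" and KEY=value; the last assignment wins.
env_value() {
  sed -n "s/^$2=//p" "$1" | tail -n 1 | tr -d '"'
}

# Return 0 if the file sets a 12-digit AWS_ACCOUNT_ID and a non-empty AWS_REGION.
check_env_file() {
  account=$(env_value "$1" AWS_ACCOUNT_ID)
  region=$(env_value "$1" AWS_REGION)
  printf '%s\n' "$account" | grep -Eq '^[0-9]{12}$' \
    || { echo "AWS_ACCOUNT_ID must be 12 digits" >&2; return 1; }
  [ -n "$region" ] || { echo "AWS_REGION is not set" >&2; return 1; }
}

# Example (not run here): check_env_file .env && npx cdk synth
```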

4. Deploy

# Preview changes
npx cdk synth

# Backup .env file with stack name (for future reference)
STACK_NAME=$(npx cdk synth --quiet 2>&1 | grep "Stack created:" | awk '{print $3}')
cp .env .env-${STACK_NAME}

# Deploy stack
npx cdk deploy --json --outputs-file deploy-output-${STACK_NAME}.json

# Approve changes when prompted
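
If the "Stack created:" line is missing from the synth output, `STACK_NAME` ends up empty and the backup is written to a file named `.env-`. The sketch below wraps the same extraction in a function and guards against that case. `extract_stack_name` is a hypothetical helper; unlike the pipeline above it prints only the first matching line.

```shell
# Read `cdk synth` output on stdin and print the stack name from the
# first "Stack created: <name>" line (third whitespace-separated field).
extract_stack_name() {
  awk '/Stack created:/ { print $3; exit }'
}

# Usage (not run here):
#   STACK_NAME=$(npx cdk synth --quiet 2>&1 | extract_stack_name)
#   if [ -z "$STACK_NAME" ]; then
#     echo "Could not determine stack name; not copying .env" >&2
#   else
#     cp .env ".env-${STACK_NAME}"
#   fi
```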

IMPORTANT: File Naming Convention

After deployment, you'll have two files per deployment:

.env-{stack-name} - Configuration backup (for reference)
deploy-output-{stack-name}.json - Deployment outputs (required for operations)

Examples:

.env-solana-mainnet-beta-agave-rpc-base
deploy-output-solana-mainnet-beta-agave-rpc-base.json

Why backup .env files:

Reference for what was deployed
Useful for redeployment or troubleshooting
Documents configuration decisions
Not required for healthcheck (info extracted from stack name and logs)

For multiple deployments:

# List all deployments
ls deploy-output-*.json

# List all configuration backups
ls .env-*

# Each pair corresponds to a unique deployment

Note: The stack name is automatically generated in the format ${protocol}-${network}-${clientConfig}. Version numbers, file extensions, and special characters are removed to reduce variability and allow version updates without changing the stack name.
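
Because both files share the stack name, each outputs file can be matched to its configuration backup by stripping the prefix and extension. A minimal sketch of that pairing follows; `stack_name_from_output` and `list_deployments` are hypothetical helpers that rely only on the naming convention described above.

```shell
# Derive the stack name from a deploy-output-{stack-name}.json file name.
stack_name_from_output() {
  name=$(basename "$1" .json)
  printf '%s\n' "${name#deploy-output-}"
}

# Print each deployment's stack name next to its .env backup (if present).
list_deployments() {
  for f in deploy-output-*.json; do
    [ -e "$f" ] || continue            # no deployments in this directory
    stack=$(stack_name_from_output "$f")
    if [ -f ".env-$stack" ]; then
      printf '%s\t%s\n' "$stack" ".env-$stack"
    else
      printf '%s\t%s\n' "$stack" "(no .env backup)"
    fi
  done
}
```

Running `list_deployments` in the project directory prints one line per deployment and flags any outputs file that has no matching backup.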

5. Verify Deployment

# Set the deployment file (replace {stack-name} with your actual stack name)
export DEPLOY_FILE="deploy-output-{stack-name}.json"

# Get stack outputs
jq . "$DEPLOY_FILE"

# Get instance ID (single-node)
export INSTANCE_ID=$(jq -r '..|.InstanceId? | select(. != null)' "$DEPLOY_FILE")
echo "INSTANCE_ID=$INSTANCE_ID"

# Connect to instance (single-node)
# AWS_REGION must be set in your shell; the shell does not read it from .env
aws ssm start-session --target "$INSTANCE_ID" --region "$AWS_REGION"

Deployment Modes

Single-Node Deployment

Use Cases:

Development and testing
Personal blockchain node
Low-traffic applications
Cost-sensitive deployments

Architecture:

┌─────────────────────────────────────┐
│           VPC (Default)             │
│  ┌───────────────────────────────┐  │
│  │      Public Subnet            │  │
│  │  ┌─────────────────────────┐  │  │
│  │  │   EC2 Instance          │  │  │
│  │  │   - Blockchain Node     │  │  │
│  │  │   - EBS Volumes         │  │  │
│  │  │   - CloudWatch Agent    │  │  │
│  │  └─────────────────────────┘  │  │
│  └───────────────────────────────┘  │
└─────────────────────────────────────┘

Configuration:

DEPLOYMENT_MODE="single-node"
INSTANCE_TYPE="m6a.2xlarge"

Characteristics:

Single point of failure
Lower cost
Simpler management
Includes CloudWatch dashboard
Direct instance access

High Availability (HA) Deployment

Use Cases:

Production workloads
High-traffic applications
Mission-critical services
Redundancy requirements

Architecture:

┌─────────────────────────────────────────────────┐
│              VPC (Default)                      │
│  ┌───────────────────────────────────────────┐  │
│  │    Application Load Balancer              │  │
│  └────────────────┬──────────────────────────┘  │
│                   │                             │
│  ┌────────────────┴──────────────────────────┐  │
│  │        Auto Scaling Group                 │  │
│  │  ┌──────────┐  ┌──────────┐  ┌──────────┐ │  │
│  │  │ Node 1   │  │ Node 2   │  │ Node N   │ │  │
│  │  │ (Primary)│  │ (Replica)│  │ (Replica)│ │  │
│  │  └──────────┘  └──────────┘  └──────────┘ │  │
│  └───────────────────────────────────────────┘  │
└─────────────────────────────────────────────────┘

Configuration:

DEPLOYMENT_MODE="ha-nodes"
HA_NUMBER_OF_NODES="3"
HA_ALB_HEALTHCHECK_PORT="8545"
HA_ALB_HEALTHCHECK_PATH="/health"
HA_ALB_HEALTHCHECK_GRACE_PERIOD_MIN="60"
HA_ALB_HEALTHCHECK_INTERVAL_SEC="30"
HA_ALB_HEALTHCHECK_TIMEOUT_SEC="5"
HA_ALB_HEALTHCHECK_HEALTHY_THRESHOLD="3"
HA_ALB_HEALTHCHECK_UNHEALTHY_THRESHOLD="2"
HA_NODES_HEARTBEAT_DELAY_MIN="10"
HA_ALB_DEREGISTRATION_DELAY_SEC="30"
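
The interval and threshold values interact: a target changes state only after the required number of consecutive checks, so detection time is roughly interval times threshold. The sketch below works this out for the sample values above. It is plain arithmetic under that simplifying assumption, and `detection_seconds` is a hypothetical helper.

```shell
# Approximate seconds for the load balancer to change a target's state:
# seconds between checks x consecutive checks required.
detection_seconds() {
  echo $(( $1 * $2 ))
}

INTERVAL=30   # HA_ALB_HEALTHCHECK_INTERVAL_SEC
TIMEOUT=5     # HA_ALB_HEALTHCHECK_TIMEOUT_SEC

echo "healthy after ~$(detection_seconds "$INTERVAL" 3)s"     # 3 = healthy threshold
echo "unhealthy after ~$(detection_seconds "$INTERVAL" 2)s"   # 2 = unhealthy threshold

# The timeout is expected to be shorter than the interval.
[ "$TIMEOUT" -lt "$INTERVAL" ] || echo "timeout should be < interval" >&2
```

With the sample values this prints about 90 seconds to become healthy and about 60 seconds to be marked unhealthy, not counting the separate grace period for node initialization.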

Characteristics:

High availability
Auto-scaling capability
Load balancing
Higher cost
No default dashboard (create custom)
Graceful node replacement

Deployment Scenarios

For specific deployment scenarios and configuration examples, refer to the protocol-specific documentation:

Dummy Protocol: See blueprints/dummy/README.md for testing and development scenarios
Future Protocols: Each protocol will include deployment scenarios in its README

Sample configurations for each protocol are available in the blueprint package's samples/ directory at node_modules/aws-bnr-blueprint-{protocol}/samples/.

Best Practices

Security

Use IAM Roles: Never use long-term credentials

# Attach role to EC2 instances (done automatically)
# Use AWS Systems Manager Session Manager for access

Secrets Management: Store sensitive data in AWS Secrets Manager

# Create secret
aws secretsmanager create-secret \
  --name my-protocol-secret \
  --secret-string '{"key":"value"}'

# Reference in .env
PROTOCOL_SECRET_ARN="arn:aws:secretsmanager:..."

Network Security: Minimize exposed ports
- Only open required ports in security groups
- Use private subnets for production (requires VPC configuration)
- Enable VPC Flow Logs
Default network placement (by design): Single-node instances and HA Auto Scaling Group instances are deployed into public subnets of the default VPC and receive public IPs. This is intentional — blockchain nodes need direct inbound P2P connectivity, and a public-subnet layout avoids NAT gateway cost/complexity. The security posture relies on the security group:
- P2P ports are intentionally open to 0.0.0.0/0 (required for peer discovery).
- RPC / WebSocket / metrics ports are marked public: false and are restricted to the VPC CIDR — they are not internet-reachable by default (in HA mode this is further governed by HA_ALB_INTERNET_FACING / HA_ALB_ALLOWED_CIDR, which default to internal/VPC-only).
- Egress is effectively unrestricted (all TCP/UDP to 0.0.0.0/0) because nodes must reach arbitrary peers across the internet.
If you require defense-in-depth beyond the security group (e.g. instances in private subnets with a NAT gateway for egress, P2P via an EIP/NAT), deploy into a custom VPC configured that way rather than the default VPC.
Encryption: Enable encryption at rest
- EBS volumes encrypted by default
- Use KMS for additional control

Performance

Right-Size Instances: Start with recommended types

# Check protocol's package.json for recommendations
cat node_modules/aws-bnr-blueprint-{protocol}/package.json | jq '."aws-blockchain-node-runner".defaultInstanceTypes'

Optimize Storage:
- Use gp3 for cost-effective performance
- Use io2 for high performance, but only if you require persistence
- Use Instance Store if you need high performance and can tolerate its ephemeral nature
- Monitor IOPS and throughput metrics
Enable Snapshots: Significantly reduces sync time
```
SNAPSHOT_ENABLED="true"
SNAPSHOT_DOWNLOAD_URL="https://..."
```
Large Snapshots: If the compressed archive plus extracted data exceeds available disk space (common with multi-TB snapshots on instance-store volumes), configure a staging volume to hold the archive during download:
```
SNAPSHOT_STAGING_VOL_SIZE="5000"  # Size in GiB, ~1.1x compressed archive size
```
This creates a temporary gp3 EBS volume that is automatically deleted after extraction. See Snapshot Staging Guide for volume sizing guidance and cost analysis.
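
The "~1.1x" sizing rule from the comment above can be computed with integer arithmetic. This is a sketch only: `staging_vol_size_gib` is a hypothetical helper, and the Snapshot Staging Guide remains the reference for sizing.

```shell
# Suggest a staging volume size in GiB: 1.1x the compressed archive size,
# rounded up, using integer arithmetic only.
staging_vol_size_gib() {
  echo $(( ($1 * 11 + 9) / 10 ))
}

staging_vol_size_gib 4500   # prints 4950
```

By the same rule, the sample value of 5000 GiB corresponds to a compressed archive of roughly 4545 GiB.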
Enable Traffic Shaping (RPC nodes only): Reduces data transfer costs by up to 85%
```
TRAFFIC_SHAPING_ENABLED="true"
TRAFFIC_SHAPING_RATE_MBIT="40"
TRAFFIC_SHAPING_CHECK_INTERVAL_SEC="60"
TRAFFIC_SHAPING_MAX_BLOCKS_BEHIND="10"
```
Important: Only use on RPC nodes. Do not use on validator/consensus nodes. See Traffic Shaping Guide for detailed information and cost analysis.
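
For a rough sense of scale, a rate in Mbit/s can be converted into the maximum volume of data that could be transferred while that rate is in force. The sketch below is plain arithmetic, not project code: `max_transfer_gb` is a hypothetical helper, it assumes the limit applies continuously, and it uses decimal units (1 GB = 1000 MB).

```shell
# Upper bound on data transferred (GB) at a sustained rate.
# $1 = rate in Mbit/s, $2 = number of days
max_transfer_gb() {
  echo $(( $1 * 86400 * $2 / 8 / 1000 ))
}

max_transfer_gb 40 30   # prints 12960 (about 13 TB over 30 days)
```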
Monitor Performance: Use CloudWatch metrics
- CPU utilization
- Disk I/O
- Network throughput
- Protocol-specific metrics
- Traffic shaping metrics (if enabled): c1_blocks_behind

Cost Optimization

Use Appropriate Instance Types:
- Development: t3.medium, t3.large
- Production: m6a.2xlarge, m6a.4xlarge
- High-performance: i4i.2xlarge, i4i.4xlarge
Optimize Storage:
- Use gp3 instead of io1/io2 when possible
- Right-size IOPS (don't over-provision)

Use ARM Instances: Graviton (ARM) instance types are often around 20% cheaper than comparable x86 types

INSTANCE_TYPE="m6g.2xlarge"
CPU_TYPE="ARM_64"

Schedule Non-Production: Stop instances when not needed
```
# Use AWS Instance Scheduler or Lambda
```

Monitor Costs: Set up billing alerts

aws budgets create-budget \
  --account-id 123456789012 \
  --budget file://budget.json
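
The command above expects a `budget.json` file. A minimal sketch of one follows; the budget name and the 100 USD monthly limit are placeholders chosen for illustration, not values from this project.

```shell
# Write a minimal monthly cost budget definition for `aws budgets create-budget`.
# The name and the 100 USD limit are placeholders; adjust both.
cat > budget.json <<'EOF'
{
  "BudgetName": "blockchain-node-monthly",
  "BudgetLimit": { "Amount": "100", "Unit": "USD" },
  "TimeUnit": "MONTHLY",
  "BudgetType": "COST"
}
EOF
```

A budget on its own does not send alerts. Notifications are attached separately, for example with the `--notifications-with-subscribers` option of the same command.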

Reliability

Use HA Mode for Production:

DEPLOYMENT_MODE="ha-nodes"
HA_NUMBER_OF_NODES="3"

Configure Health Checks Properly:
- Appropriate grace period for node initialization
- Reasonable interval and timeout
- Correct health check endpoint
Set Up Monitoring:
- CloudWatch dashboards
- CloudWatch alarms
- SNS notifications
Implement Backup Strategy:
- Keep .env configuration files backed up
- Document deployment settings
- Use blockchain snapshot downloads for data recovery
Plan for Updates:
- Test updates on testnet first
- Use rolling updates for HA deployments
- Have rollback plan

Post-Deployment

Verify Deployment

Check Stack Status:

aws cloudformation describe-stacks \
  --stack-name YourStackName \
  --query 'Stacks[0].StackStatus'

Get Outputs:

aws cloudformation describe-stacks \
  --stack-name YourStackName \
  --query 'Stacks[0].Outputs'
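
When scripting around the status query above, it helps to reduce the many CloudFormation status values to a few cases. The sketch below is one possible grouping, with `classify_stack_status` as a hypothetical helper: it treats only `CREATE_COMPLETE` and `UPDATE_COMPLETE` as ready, and anything that is neither ready nor in progress as not ready.

```shell
# Classify a CloudFormation StackStatus value.
# Prints "ready", "in-progress", or "not-ready".
classify_stack_status() {
  case "$1" in
    CREATE_COMPLETE|UPDATE_COMPLETE) echo "ready" ;;
    *_IN_PROGRESS)                   echo "in-progress" ;;
    *)                               echo "not-ready" ;;
  esac
}

# Usage (not run here; --output text removes the surrounding quotes):
#   STATUS=$(aws cloudformation describe-stacks --stack-name YourStackName \
#     --query 'Stacks[0].StackStatus' --output text)
#   classify_stack_status "$STATUS"
```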

Connect to Instance (single-node):

# Get instance ID from outputs (DEPLOY_FILE as set in Quick Start step 5)
export INSTANCE_ID=$(jq -r '..|.InstanceId? | select(. != null)' "$DEPLOY_FILE")
echo "INSTANCE_ID=$INSTANCE_ID"

aws ssm start-session --target "$INSTANCE_ID" --region "$AWS_REGION"

Check Node Status:

Option 1: View logs in CloudWatch (recommended):

# View node service logs
# The inner double quotes make CloudWatch match the term literally, including the dot
aws logs tail /aws/ec2/blockchain-nodes/systemd-services --follow --filter-pattern '"node.service"'

# View for specific instance (DEPLOY_FILE as set in Quick Start step 5)
export INSTANCE_ID=$(jq -r '..|.InstanceId? | select(. != null)' "$DEPLOY_FILE")
aws logs tail /aws/ec2/blockchain-nodes/systemd-services --follow --log-stream-names "$INSTANCE_ID" --filter-pattern '"node.service"'

Option 2: Connect via SSM:

# Check service status
sudo systemctl status node

# View logs directly
sudo journalctl -u node -f

Test RPC Endpoint:

Note: By default, security groups restrict RPC access to within the VPC IP range. To test the endpoint:

a. From within the VPC (recommended - via SSM Session Manager):

# Get instance ID from deploy outputs
export INSTANCE_ID=$(jq -r '..|.InstanceId? | select(. != null)' "$DEPLOY_FILE")

# Connect to instance
aws ssm start-session --target $INSTANCE_ID --region $AWS_REGION

# Test locally
curl http://localhost:8545

b. From outside the VPC (requires security group modification):

# Temporarily add your IP to security group
aws ec2 authorize-security-group-ingress \
  --group-id sg-xxxxx \
  --protocol tcp \
  --port 8545 \
  --cidr your-ip/32

# Test from your machine
curl http://instance-ip:8545  # Single-node
curl http://alb-dns-name:8545  # HA

# Remove the rule after testing
aws ec2 revoke-security-group-ingress \
  --group-id sg-xxxxx \
  --protocol tcp \
  --port 8545 \
  --cidr your-ip/32

Configure Monitoring

View CloudWatch Logs:

Cloud-init output (deployment logs):

# View deployment logs
aws logs tail /aws/ec2/blockchain-nodes/cloud-init-output --follow

# View for specific instance
export INSTANCE_ID=$(jq -r '..|.InstanceId? | select(. != null)' "$DEPLOY_FILE")
aws logs tail /aws/ec2/blockchain-nodes/cloud-init-output --follow --log-stream-names $INSTANCE_ID

Systemd service logs (node.service, syncchecker.service, net-rules.service):

Note: Ubuntu's rsyslog automatically forwards all systemd service logs to /var/log/syslog, which is collected by CloudWatch agent.

# View all systemd service logs
aws logs tail /aws/ec2/blockchain-nodes/systemd-services --follow

# View specific service logs for specific instance
export INSTANCE_ID=$(jq -r '..|.InstanceId? | select(. != null)' "$DEPLOY_FILE")
aws logs tail /aws/ec2/blockchain-nodes/systemd-services --follow --log-stream-names "$INSTANCE_ID" --filter-pattern '"node.service"'
aws logs tail /aws/ec2/blockchain-nodes/systemd-services --follow --log-stream-names "$INSTANCE_ID" --filter-pattern '"syncchecker.service"'
aws logs tail /aws/ec2/blockchain-nodes/systemd-services --follow --log-stream-names "$INSTANCE_ID" --filter-pattern '"net-rules.service"'

# View all logs for specific instance (no service filter)
aws logs tail /aws/ec2/blockchain-nodes/systemd-services --follow --log-stream-names $INSTANCE_ID

Note: All systemd service logs are available in CloudWatch Logs. You can also connect via SSM and use journalctl if needed.

Set Up Alarms:

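# Scope the alarm to one instance; add --alarm-actions <sns-topic-arn> to receive notifications
aws cloudwatch put-metric-alarm \
  --alarm-name high-cpu \
  --alarm-description "Alert when CPU exceeds 80%" \
  --metric-name CPUUtilization \
  --namespace AWS/EC2 \
  --dimensions Name=InstanceId,Value="$INSTANCE_ID" \
  --statistic Average \
  --period 300 \
  --threshold 80 \
  --comparison-operator GreaterThanThreshold \
  --evaluation-periods 2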

Maintenance

Updates

Update Node Version (requires stack replacement):
```
# Update .env
CLIENT_VERSION="v1.15.0"

# Destroy existing stack
npx cdk destroy

# Deploy new stack with updated version
npx cdk deploy --json --outputs-file deploy-output.json
```
Note: Version updates require instance replacement. For single-node deployments, this causes downtime. For HA deployments, use rolling updates (see below).
Update Configuration (non-instance changes):
```
# Modify .env (e.g., HA health check settings)
# Deploy changes
npx cdk deploy --json --outputs-file deploy-output.json
```
Note: Some configuration changes (like health check settings) can be updated without destroying the stack. Instance-level changes require replacement.
Rolling Updates (HA only):
- For HA deployments, instance replacements happen automatically as rolling updates
- New instances launched with updated configuration
- Health checks verify new instances are healthy
- Old instances terminated after deregistration delay
- No downtime during the update process

Scaling

Vertical Scaling (change instance type - requires stack replacement):
```
# Update .env
INSTANCE_TYPE="m6a.4xlarge"

# Destroy existing stack
npx cdk destroy

# Deploy with new instance type
npx cdk deploy --json --outputs-file deploy-output.json
```
Note: Changing instance type requires instance replacement. For single-node, this causes downtime. For HA, rolling updates minimize downtime.
Horizontal Scaling (HA only - no downtime):
```
# Update .env
HA_NUMBER_OF_NODES="5"

# Deploy (no destroy needed)
npx cdk deploy --json --outputs-file deploy-output.json
```
Note: Horizontal scaling in HA mode does not require stack destruction and causes no downtime.

Storage Scaling (live volume expansion):

# Increase volume size (can be done live)
aws ec2 modify-volume --volume-id vol-xxxxx --size 8000

# Wait for modification to complete
aws ec2 describe-volumes-modifications --volume-id vol-xxxxx

# Connect to instance and extend filesystem
export INSTANCE_ID=$(cat deploy-output.json | jq -r '..|.InstanceId? | select(. != null)')
aws ssm start-session --target $INSTANCE_ID --region $AWS_REGION

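# Confirm the device name first; on Nitro-based instances EBS volumes usually appear as /dev/nvme*n1
lsblk

sudo resize2fs /dev/xvdg  # For ext4 (takes the block device)
# OR
sudo xfs_growfs /data  # For xfs (takes the mount point)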

Note: Storage can be expanded without destroying the stack or replacing instances.
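
Which grow command applies depends on the filesystem on the data volume. The sketch below selects the command from the filesystem type. `grow_fs_command` is a hypothetical helper, and the device and mount point are the example values used above, which may differ on your instance.

```shell
# Print the command that grows a filesystem after its EBS volume is resized.
# $1 = filesystem type, $2 = block device, $3 = mount point
grow_fs_command() {
  case "$1" in
    ext2|ext3|ext4) echo "sudo resize2fs $2" ;;
    xfs)            echo "sudo xfs_growfs $3" ;;
    *)              echo "unsupported filesystem: $1" >&2; return 1 ;;
  esac
}

# On the instance, the type can be read with findmnt, e.g.:
#   grow_fs_command "$(findmnt -n -o FSTYPE /data)" /dev/xvdg /data
```

The function only prints the command so that it can be reviewed before it is run.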

Backup and Recovery

Note: EBS snapshots are not recommended for blockchain nodes due to the large data size and slow lazy-loading performance. Instead, use blockchain-specific snapshot downloads from external sources (configured via SNAPSHOT_DOWNLOAD_URL).

For disaster recovery:

Re-deploy from Configuration: Keep your .env file backed up
Use Blockchain Snapshots: Download fresh blockchain data from trusted snapshot providers
Document Configuration: Maintain documentation of your deployment settings

Monitoring and Alerting

Regular Health Checks:
- Review CloudWatch dashboards daily
- Check alarm status
- Review logs for errors
Performance Monitoring:
- Track sync status
- Monitor resource utilization
- Identify bottlenecks
Cost Monitoring:
- Review AWS Cost Explorer
- Check for unexpected charges
- Optimize resource usage

Destroying a Stack

To remove a deployed node and all associated AWS resources:

npx cdk destroy <stack-name>

The AI-driven workflow (@deploy) covers teardown as part of the session. Use the command above if you've exited the AI session and want to clean up manually.

Troubleshooting

See Troubleshooting Guide for detailed troubleshooting steps.
