
Sample AWS Blockchain Node Runner app for BNB Smart Chain (BSC) Nodes

Contributed by
@StayHungryStayFoolish, @frbrkoala

BNB Smart Chain (BSC) is a blockchain that supports EVM-compatible smart contracts and protocols. It uses the Proof of Staked Authority (PoSA) consensus mechanism, a hybrid that combines Proof of Authority (PoA) and Delegated Proof of Stake (DPoS).

This blueprint is designed to assist in deploying a single node or a Highly Available (HA) BNB Smart Chain (BSC) Fullnode on AWS. It is intended for use in development, testing, or Proof of Concept purposes.

Overview of Deployment Architectures

Single Node setup

Single Nodes Deployment

  1. The AWS Cloud Development Kit (CDK) is used to deploy a single node. The CDK application stores assets like scripts and config files in an S3 bucket and copies them to the EC2 instance when launching a BSC Node.
  2. A single RPC BSC Fullnode is deployed within the Default VPC and continuously synchronizes with the rest of the nodes on the BSC Blockchain Network through an Internet Gateway.
  3. The BSC node is accessed by dApps or development tools internally. The JSON RPC API is not exposed to the Internet to protect the node from unauthorized access. dApps need to handle user authentication and API protection, like in this example for dApps on AWS.
  4. The BSC node sends various monitoring metrics for both EC2 and the BSC client to Amazon CloudWatch.

Highly Available setup

Highly Available Nodes Deployment

  1. The CDK is used to deploy a highly available (HA) architecture. An S3 bucket is used to store the user data script and other scripts and configuration files required when launching EC2 instances as BSC Nodes.
  2. A set of RPC BSC Fullnodes is deployed within an Auto Scaling Group in the Default VPC and continuously synchronizes with the rest of the nodes on the BSC Blockchain Network through an Internet Gateway.
  3. The BSC nodes are accessed by dApps or development tools internally through an Application Load Balancer. The JSON RPC API is not exposed to the Internet to protect the nodes from unauthorized access. dApps need to handle user authentication and API protection, like in this example for dApps on AWS.
  4. The BSC nodes send various monitoring metrics for both EC2 and the BSC nodes to Amazon CloudWatch.

Additional materials

Well-Architected Checklist

This is the Well-Architected checklist for BSC nodes implementation of the AWS Blockchain Node Runner app. This checklist takes into account questions from the AWS Well-Architected Framework which are relevant to this workload. Please feel free to add more checks from the framework if required for your workload.

| Pillar | Control | Question/Check | Remarks |
|---|---|---|---|
| Security | Network protection | Are there unnecessary open ports in security groups? | Please note that ports 30303 (TCP/UDP) for BSC are open to the public to support P2P protocols. |
| | | Traffic inspection | Traffic protection is not used in the solution. AWS Web Application Firewall (WAF) could be implemented for traffic over HTTP(S), and AWS Shield provides Distributed Denial of Service (DDoS) protection. Additional charges will apply. |
| | Compute protection | Reduce attack surface | This solution uses the Amazon Linux 2 AMI (Amazon Linux 2 AMI (HVM) - Kernel 5.10). You may choose to run hardening scripts on it. |
| | | Enable people to perform actions at a distance | This solution uses AWS Systems Manager for terminal sessions, not SSH ports. |
| | Data protection at rest | Use encrypted Amazon Elastic Block Store (Amazon EBS) volumes | This solution uses encrypted Amazon EBS volumes. |
| | Data protection in transit | Use TLS | The AWS Application Load Balancer currently uses an HTTP listener. Create an HTTPS listener with a self-signed certificate if TLS is desired. |
| | Authorization and access control | Use instance profile with Amazon Elastic Compute Cloud (Amazon EC2) instances | This solution uses an AWS Identity and Access Management (AWS IAM) role instead of an IAM user. |
| | | Following principle of least privilege access | In all node types, the root user is not used (a special user "bcuser" is used instead). |
| | Application security | Security focused development practices | cdk-nag is being used with appropriate suppressions. |
| Cost optimization | Service selection | Use cost effective resources | 1/ We use Graviton-based binaries to improve costs for compute. We recommend using the m7g.4xlarge EC2 instance type to optimize computational costs. 2/ Cost-effective EBS gp3 volumes are used instead of io2. |
| | Cost awareness | Estimate costs | A single RPC node with m7g.4xlarge and an EBS gp3 volume of about 4000 GB (10000 IOPS, 700 MB/s throughput) with On-Demand pricing will cost around US$854.54 per month in the US East (N. Virginia) region. With the more cost-optimal 3-year EC2 Instance Savings Plan, the cost goes down to US$594.15. To create your own estimate use AWS Pricing Calculator. |
| Reliability | Resiliency implementation | Withstand component failures | This solution uses AWS Application Load Balancer with RPC nodes for high availability. Newly provisioned BSC nodes triggered by Auto Scaling get up and running in about 300 minutes. |
| | Data backup | How is data backed up? | Considering blockchain data is replicated by nodes automatically and BSC nodes sync from start within an hour, we don't use any additional mechanisms to back up the data. |
| | Resource monitoring | How are workload resources monitored? | Resources are being monitored using Amazon CloudWatch dashboards. Amazon CloudWatch custom metrics are being pushed via CloudWatch Agent. |
| Performance efficiency | Compute selection | How is compute solution selected? | Compute solution is selected based on best price-performance, i.e. AWS Graviton-based Amazon EC2 instances. |
| | Storage selection | How is storage solution selected? | Storage solution is selected based on best price-performance, i.e. gp3 Amazon EBS volumes with optimal IOPS and throughput. |
| | Architecture selection | How is the best performance architecture selected? | We used a combination of recommendations from the BSC community and our own testing. |
| Operational excellence | Workload health | How is health of workload determined? | Health of workload is determined via AWS Application Load Balancer Target Group Health Checks, on port 8545. |
| Sustainability | Hardware & services | Select most efficient hardware for your workload | The solution uses AWS Graviton-based Amazon EC2 instances, which offer the best performance per watt of energy use in Amazon EC2. |

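As a quick check of the cost figures in the checklist, the relative saving of the Savings Plan price over On-Demand pricing can be computed from the two monthly estimates. This is a minimal sketch, assuming awk is available; savings_percent is a hypothetical helper name and is not part of the repository.

```shell
# Percentage saved when moving from a full monthly price to a discounted one.
# Usage: savings_percent <full_price_usd> <discounted_price_usd>
savings_percent() {
    awk -v full="$1" -v disc="$2" \
        'BEGIN { printf "%.1f\n", (full - disc) / full * 100 }'
}

# Figures taken from the cost estimate in the checklist.
savings_percent 854.54 594.15   # prints 30.5
```

The same helper can be reused with numbers from your own AWS Pricing Calculator estimate.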
Recommended Infrastructure

| Usage pattern | Ideal configuration | Primary option on AWS | Config reference |
|---|---|---|---|
| 1/ Fullnode | 16 vCPU, 64 GB RAM, Data volume: EBS gp3 4TB, 10K IOPS, 700 MB/s throughput | m7g.4xlarge, EBS gp3 volume of about 4000 GB (10000 IOPS, 700 MB/s throughput) | .env-sample-full |

Setup Instructions

Setup Cloud9

We will use AWS Cloud9 to execute the subsequent commands. Follow the instructions in Cloud9 Setup.

Clone this repository and install dependencies

git clone https://github.com/aws-samples/aws-blockchain-node-runners.git
cd aws-blockchain-node-runners
npm install

Deploy the HA Nodes

  1. Make sure you are in the root directory of the cloned repository

  2. If you have deleted the default VPC or don't have one, create it:

    aws ec2 create-default-vpc

    NOTE: You may see the following error if the default VPC already exists: "An error occurred (DefaultVpcAlreadyExists) when calling the CreateDefaultVpc operation: A Default VPC already exists for this account in this region." That means you can just continue with the following steps.

  3. Configure the CDK app

    Create your own copy of .env file and edit it to update with your AWS Account ID, AWS Region, and optionally the BSC SNAPSHOTS URI:

    # Make sure you are in aws-blockchain-node-runners/lib/bsc
    cd lib/bsc
    pwd
    cp ./sample-configs/.env-sample-full .env
    nano .env

    IMPORTANT:

    1. By default we use the latest Geth Fullnode snapshot from 48 Club. If you want to set your own BSC_SNAPSHOTS_URI, check this GitHub repository: https://github.com/48Club/bsc-snapshots, and use the Geth full node link.
  4. Deploy common components such as IAM role

    pwd
    # Make sure you are in aws-blockchain-node-runners/lib/bsc
    npx cdk deploy bsc-common

    IMPORTANT: All AWS CDK v2 deployments use dedicated AWS resources to hold data during deployment. Therefore, your AWS account and Region must be bootstrapped to create these resources before you can deploy. If you haven't already bootstrapped, issue the following command:

    cdk bootstrap aws://ACCOUNT-NUMBER/REGION
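Before running the deploy commands above, it can help to confirm that the .env file actually has values for the settings you edited. The following is a minimal sketch, not part of the repository: check_env_file is a hypothetical helper, and the key names AWS_ACCOUNT_ID and AWS_REGION are assumed to match the ones used in your copy of .env-sample-full.

```shell
# Report which required keys are missing or empty in an env file.
# Usage: check_env_file <path> <KEY>...
# Returns 0 when every key has a value, 1 otherwise.
check_env_file() {
    env_file="$1"; shift
    missing=0
    for key in "$@"; do
        # A key counts as set when a line "KEY=value" (optionally quoted)
        # carries at least one non-quote, non-space character as its value.
        if ! grep -Eq "^${key}=[\"']?[^\"'[:space:]]" "$env_file"; then
            echo "missing or empty: ${key}"
            missing=1
        fi
    done
    return "$missing"
}

# Example (run from aws-blockchain-node-runners/lib/bsc):
# check_env_file .env AWS_ACCOUNT_ID AWS_REGION
```

The check only looks for non-empty values; it does not validate that the account ID or region is correct.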

Option 1: Single RPC Node

  1. The initial deployment of a BSC Fullnode and the download of its snapshot typically take about 2-3 hours. The Fullnode uses snapshot data, and downloading and decompressing the data takes time. You can grab a cup of coffee☕️ and patiently wait during this process. After deployment, you'll need to wait for the node to synchronize with the BSC Blockchain Network (next step).

       pwd
    # Make sure you are in aws-blockchain-node-runners/lib/bsc
    npx cdk deploy bsc-single-node --json --outputs-file single-node-deploy.json
  2. After the node is initialized from the snapshot, you need to wait another half a day to a day for the initial synchronization process to complete. The time depends on how fresh the snapshot was. You can use Amazon CloudWatch to track the progress. There is a script that publishes CloudWatch metrics every 5 minutes, where you can watch sync distance for the consensus client and blocks behind for the execution client. When the node is fully synced, those two metrics should show 0. To see them:

    • Navigate to CloudWatch service (make sure you are in the region you have specified for AWS_REGION)
    • Open Dashboards and select bsc-single-node-<node_configuration>-<your_bsc_network>-<ec2_instance_id> from the list of dashboards.

Alternatively, you can manually check Geth Syncing Status. Run the following query from within the same VPC and against the private IP of the single RPC node you deployed:

   INSTANCE_ID=$(cat single-node-deploy.json | jq -r '..|.singleinstanceid? | select(. != null)')
# Use the same region you specified for AWS_REGION in your .env file
NODE_INTERNAL_IP=$(aws ec2 describe-instances --instance-ids $INSTANCE_ID --query 'Reservations[*].Instances[*].PrivateIpAddress' --output text --region $AWS_REGION)

curl http://$NODE_INTERNAL_IP:8545 -X POST -H "Content-Type: application/json" \
--data '{"jsonrpc":"2.0","method":"eth_syncing","params":[],"id":1}'

It will return false if the node is in sync. If eth_syncing returns anything other than false it has not finished syncing. Generally, if syncing is still ongoing, eth_syncing will return block info that looks as follows:

{
"jsonrpc": "2.0",
"id": 1,
"result": {
"currentBlock": "0x211f0d8",
"healedBytecodeBytes": "0x0",
"healedBytecodes": "0x0",
"healedTrienodeBytes": "0x0",
"healedTrienodes": "0x0",
"healingBytecode": "0x0",
"healingTrienodes": "0x0",
"highestBlock": "0x2123bff",
"startingBlock": "0x20910d7",
"syncedAccountBytes": "0x0",
"syncedAccounts": "0x0",
"syncedBytecodeBytes": "0x0",
"syncedBytecodes": "0x0",
"syncedStorage": "0x0",
"syncedStorageBytes": "0x0"
}
}
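
The currentBlock and highestBlock fields in the response above are hex-encoded block numbers, so the remaining distance is their difference. A minimal sketch of that arithmetic follows; blocks_behind is a hypothetical helper name, and the two values would typically be extracted from the response with a tool such as jq (for example with the filters .result.currentBlock and .result.highestBlock).

```shell
# Print how many blocks the node still has to import.
# Usage: blocks_behind <currentBlock_hex> <highestBlock_hex>
blocks_behind() {
    # Shell arithmetic understands the 0x prefix, so no manual hex decoding is needed.
    echo $(( $2 - $1 ))
}

# Values taken from the sample eth_syncing response.
blocks_behind 0x211f0d8 0x2123bff   # prints 19239
```
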
  3. Once the initial synchronization is done, you should be able to access the RPC API of that node from within the same VPC. The RPC port is not exposed to the Internet. Run the following query against the private IP of the single RPC node you deployed:

       INSTANCE_ID=$(cat single-node-deploy.json | jq -r '..|.singleinstanceid? | select(. != null)')
    NODE_INTERNAL_IP=$(aws ec2 describe-instances --instance-ids $INSTANCE_ID --query 'Reservations[*].Instances[*].PrivateIpAddress' --output text)

    # We query token balance of one of the system contracts: https://bscscan.com/address/0x0000000000000000000000000000000000001006
    curl http://$NODE_INTERNAL_IP:8545 -X POST -H "Content-Type: application/json" \
    --data '{"method":"eth_getBalance","params":["0x0000000000000000000000000000000000001006", "latest"],"id":1,"jsonrpc":"2.0"}'

    You will get a response similar to this:

       {"jsonrpc":"2.0","id":1,"result":"0x3635c9adc5dea00000"}

Option 2: Highly Available RPC Nodes

  1. The initial deployment of a BSC Fullnode and the download of its snapshot typically take about 2-3 hours. The Fullnode uses snapshot data, and downloading and decompressing the data takes time. You can grab a cup of coffee☕️ and patiently wait during this process. After deployment, you'll need to wait another half a day to a day for your nodes to synchronize with the BSC Blockchain Network, depending on how fresh the snapshot was.

       pwd
    # Make sure you are in aws-blockchain-node-runners/lib/bsc
    npx cdk deploy bsc-ha-nodes --json --outputs-file ha-nodes-deploy.json
  2. Give the new RPC nodes a few hours to initialize, and then run the following query against the load balancer in front of the RPC nodes:

       export RPC_ALB_URL=$(cat ha-nodes-deploy.json | jq -r '..|.alburl? | select(. != null)')
    echo $RPC_ALB_URL

    Periodically check Geth Syncing Status. Run the following query from within the same VPC against the load balancer fronting your nodes:

    curl http://$RPC_ALB_URL:8545 -X POST -H "Content-Type: application/json" \
    --data '{"jsonrpc":"2.0","method":"eth_syncing","params":[],"id":1}'

    It will return false if the node is in sync. If eth_syncing returns anything other than false it has not finished syncing. Generally, if syncing is still ongoing, eth_syncing will return block info that looks as follows:

    {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
    "currentBlock": "0x211f0d8",
    "healedBytecodeBytes": "0x0",
    "healedBytecodes": "0x0",
    "healedTrienodeBytes": "0x0",
    "healedTrienodes": "0x0",
    "healingBytecode": "0x0",
    "healingTrienodes": "0x0",
    "highestBlock": "0x2123bff",
    "startingBlock": "0x20910d7",
    "syncedAccountBytes": "0x0",
    "syncedAccounts": "0x0",
    "syncedBytecodeBytes": "0x0",
    "syncedBytecodes": "0x0",
    "syncedStorage": "0x0",
    "syncedStorageBytes": "0x0"
    }
    }
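
The periodic check described above can be wrapped in a small polling loop. This is a minimal sketch under stated assumptions: is_synced and wait_until_synced are hypothetical helper names, the loop must run from within the same VPC, and the response is matched with grep rather than parsed as JSON. Because requests go through the load balancer, consecutive calls may be answered by different nodes, so a single false only tells you about the node that served that request.

```shell
# Return 0 when the endpoint reports eth_syncing == false, 1 otherwise.
# Usage: is_synced <rpc_url>
is_synced() {
    response=$(curl -s "$1" -X POST -H "Content-Type: application/json" \
        --data '{"jsonrpc":"2.0","method":"eth_syncing","params":[],"id":1}') || return 1
    # A synced node answers with "result":false; anything else means still syncing.
    echo "$response" | grep -Eq '"result"[[:space:]]*:[[:space:]]*false'
}

# Poll until the endpoint reports that it is in sync.
# Usage: wait_until_synced <rpc_url> [interval_seconds]
wait_until_synced() {
    until is_synced "$1"; do
        echo "still syncing, retrying in ${2:-300}s"
        sleep "${2:-300}"
    done
    echo "node is in sync"
}

# Example:
# wait_until_synced "http://$RPC_ALB_URL:8545" 600
```
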

NOTE: By default and for security reasons the load balancer is available only from within the default VPC in the region where it is deployed. It is not available from the Internet and is not open for external connections. Before opening it up please make sure you protect your RPC APIs.

  3. Once the initial synchronization is done, you should be able to access the RPC API of the nodes from within the same VPC through the load balancer. The RPC port is not exposed to the Internet. Run the following query against the load balancer fronting your nodes:

       export RPC_ALB_URL=$(cat ha-nodes-deploy.json | jq -r '..|.alburl? | select(. != null)')
    echo $RPC_ALB_URL

    # We query token balance of one of the system contracts: https://bscscan.com/address/0x0000000000000000000000000000000000001006
    curl http://$RPC_ALB_URL:8545 -X POST -H "Content-Type: application/json" \
    --data '{"method":"eth_getBalance","params":["0x0000000000000000000000000000000000001006", "latest"],"id":1,"jsonrpc":"2.0"}'

    You will get a response similar to this:

       {"jsonrpc":"2.0","id":1,"result":"0x3635c9adc5dea00000"}

Cleaning up and undeploying everything

  1. Undeploy HA Nodes, Single Nodes and Common stacks
# Setting the AWS account id and region in case local .env file is lost
export AWS_ACCOUNT_ID=<your_target_AWS_account_id>
export AWS_REGION=<your_target_AWS_region>

pwd
# Make sure you are in aws-blockchain-node-runners/lib/bsc

# Undeploy Single Node
cdk destroy bsc-single-node

# Undeploy HA Nodes
cdk destroy bsc-ha-nodes

# Delete all common components like IAM role and Security Group
cdk destroy bsc-common
  2. Follow the steps to delete the Cloud9 instance in Cloud9 Setup

FAQ

  1. How to check the logs of the clients running on my sync node?

Please enter the AWS Management Console - EC2 Instances, choose the correct region, copy the instance ID you need to query.

Note: In this tutorial we chose not to use SSH and use Session Manager instead. That allows you to log all sessions in AWS CloudTrail to see who logged into the server and when. If you receive an error similar to SessionManagerPlugin is not found, install Session Manager plugin for AWS CLI

pwd
# Make sure you are in aws-blockchain-node-runners/lib/bsc

export INSTANCE_ID="i-**************"
echo "INSTANCE_ID=" $INSTANCE_ID
aws ssm start-session --target $INSTANCE_ID --region $AWS_REGION
sudo su ec2-user
sudo journalctl -o cat -fu bsc
  2. How to check the logs from the EC2 user-data script?

Please enter the AWS Management Console - EC2 Instances, choose the correct region, copy the instance ID you need to query.

pwd
# Make sure you are in aws-blockchain-node-runners/lib/bsc

export INSTANCE_ID="i-**************"
echo "INSTANCE_ID=" $INSTANCE_ID
aws ssm start-session --target $INSTANCE_ID --region $AWS_REGION
sudo cat /var/log/cloud-init-output.log
  3. How can I check the BSC service log on EC2?

Please enter the AWS Management Console - EC2 Instances, choose the correct region, copy the instance ID you need to query.

pwd
# Make sure you are in aws-blockchain-node-runners/lib/bsc

export INSTANCE_ID="i-**************"
echo "INSTANCE_ID=" $INSTANCE_ID

aws ssm start-session --target $INSTANCE_ID --region $AWS_REGION
cd /data
cat bsc.log
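
On a long-running node the log file can grow large, so printing it in full with cat may not be practical. The following is a minimal sketch of a narrower view that shows only recent lines mentioning errors or warnings; recent_log_issues is a hypothetical helper name, and the match on "error" and "warn" is an assumption about how the client labels such lines.

```shell
# Show the most recent log lines that mention errors or warnings.
# Usage: recent_log_issues <log_file> [line_count]
recent_log_issues() {
    # Look only at the last N lines (200 by default), case-insensitively.
    tail -n "${2:-200}" "$1" | grep -Ei 'error|warn'
}

# Example on the node:
# recent_log_issues /data/bsc.log 500
```
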
  4. How can I restart the BSC service?

Please enter the AWS Management Console - EC2 Instances, choose the correct region, copy the instance ID you need to query.

pwd
# Make sure you are in aws-blockchain-node-runners/lib/bsc

export INSTANCE_ID="i-**************"
echo "INSTANCE_ID=" $INSTANCE_ID

aws ssm start-session --target $INSTANCE_ID --region $AWS_REGION
sudo systemctl restart bsc

NOTE: You can also try the following commands to obtain more information:

  • Check the BSC service status
    • sudo systemctl status bsc
  • View BSC service configuration
    • cat /etc/systemd/system/bsc.service
  5. Where can I find more information about the BSC RPC API?

Please refer to JSON-RPC API METHODS for the full list. The following are some commonly used API methods:

  • eth_blockNumber
  • eth_getBalance
  • eth_accounts
  • eth_call
  • eth_estimateGas
  • eth_signTransaction
  • eth_sendTransaction
  • eth_getBlockByHash
  • eth_getBlockByNumber
  • eth_getTransactionByHash
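
All of the methods listed above share the same JSON-RPC 2.0 request shape used by the curl examples in this document; only the method name and params change. A minimal sketch of a payload builder follows. rpc_payload is a hypothetical helper name, and it does no escaping or validation of its arguments.

```shell
# Build a JSON-RPC 2.0 request body.
# Usage: rpc_payload <method> [params_json] [id]
rpc_payload() {
    # params defaults to an empty array and id defaults to 1.
    printf '{"jsonrpc":"2.0","method":"%s","params":%s,"id":%s}\n' \
        "$1" "${2:-[]}" "${3:-1}"
}

rpc_payload eth_blockNumber
rpc_payload eth_getBalance '["0x0000000000000000000000000000000000001006","latest"]'

# Example against a node, from within the same VPC:
# curl "http://$NODE_INTERNAL_IP:8545" -X POST -H "Content-Type: application/json" \
#     --data "$(rpc_payload eth_blockNumber)"
```
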

Upgrades

When nodes need to be upgraded or downgraded, use a blue/green deployment pattern. This is not yet automated and contributions are welcome!