Network architecture
Network Architecture
Section titled “Network Architecture”This document describes the network isolation layer for the AgentCore Runtime.
VPC Layout
Section titled “VPC Layout”The Runtime runs inside a VPC with 2 Availability Zones:
┌─────────────────── VPC (10.0.0.0/16) ───────────────────┐│ ││ ┌─ Public Subnets ──┐ ┌─ Private Subnets ─────────┐ ││ │ NAT Gateway ──────┼─→ │ AgentCore Runtime (ENIs) │ ││ │ (→ IGW → GitHub) │ │ SG: egress 443 only │ ││ └────────────────────┘ └───────────────────────────┘ ││ ││ VPC Endpoints: S3, DynamoDB (gw), ECR API, ECR Docker, ││ CloudWatch Logs, Secrets Manager, ││ Bedrock Runtime, STS, X-Ray (interface) │└───────────────────────────────────────────────────────────┘
Outside VPC: Orchestrator Lambda, API Lambdas, API Gateway- Public subnets — Host the NAT Gateway and Internet Gateway. No compute resources.
- Private subnets (with egress) — Host the AgentCore Runtime ENIs. All outbound traffic goes through VPC endpoints or the NAT Gateway.
- Single NAT Gateway — Provides internet egress (HTTPS only) for GitHub API access. Deployed in one AZ to minimize cost.
Egress Policy
Section titled “Egress Policy”The Runtime security group enforces HTTPS-only egress (TCP 443 to 0.0.0.0/0):
- AWS services — Reached via VPC endpoints (no NAT traffic, no internet traversal).
- GitHub API — Reached via NAT Gateway → Internet Gateway → github.com:443.
- All other traffic — Blocked by default (no other egress rules).
Domain-level filtering is enforced by the DNS Firewall (see below).
VPC Endpoints
Section titled “VPC Endpoints”| Endpoint | Type | Purpose |
|---|---|---|
| S3 | Gateway | ECR image layers, artifact storage |
| DynamoDB | Gateway | Task state tables |
| ECR API | Interface | Container image metadata |
| ECR Docker | Interface | Container image pull |
| CloudWatch Logs | Interface | Runtime application and flow logs |
| Secrets Manager | Interface | GitHub token retrieval |
| Bedrock Runtime | Interface | Model invocation |
| STS | Interface | Temporary credential retrieval for AWS SDK calls |
| X-Ray | Interface | Distributed tracing via OpenTelemetry/ADOT |
Gateway endpoints are free. Interface endpoints have per-hour and per-GB costs.
Flow Logs
Section titled “Flow Logs”VPC flow logs are enabled for all traffic (ACCEPT + REJECT) and sent to CloudWatch Logs with 30-day retention. This satisfies the AwsSolutions-VPC7 cdk-nag rule and provides audit visibility into network activity.
What is NOT in the VPC
Section titled “What is NOT in the VPC”The following resources remain outside the VPC (public Lambda execution):
- Orchestrator Lambda — Invokes the AgentCore Runtime API (not the Runtime itself).
- API handler Lambdas — Serve the REST API behind API Gateway.
- API Gateway — Public-facing REST API with Cognito auth.
These do not need VPC access and would incur unnecessary cold-start latency and ENI costs if placed in a VPC.
DNS Firewall
Section titled “DNS Firewall”Route 53 Resolver DNS Firewall provides domain-level egress filtering for the agent VPC. Only domains on the allowlist can be resolved; all other DNS queries are logged (observation mode) or blocked (enforcement mode).
How it works
Section titled “How it works”The DNS Firewall evaluates DNS queries at the VPC Resolver level using a rule group with three rules, evaluated in priority order:
- Priority 100 — ALLOW platform baseline domains. Always-allowed domains required for core agent operations: GitHub (
github.com,api.github.com,*.githubusercontent.com), npm (registry.npmjs.org,*.npmjs.org), PyPI (pypi.org,*.pypi.org,files.pythonhosted.org), and AWS services (*.amazonaws.com). - Priority 200 — ALLOW additional domains. Aggregated from Blueprint
networking.egressAllowlistvalues. Empty by default. - Priority 300 — ALERT or BLOCK all other domains. In observation mode (default), non-allowlisted queries are logged with an ALERT action. In enforcement mode, they are blocked with a NODATA response.
Observation vs enforcement mode
Section titled “Observation vs enforcement mode”The construct deploys in observation mode by default (observationMode: true). In this mode, DNS Firewall logs all queries but does not block anything, allowing safe analysis of real traffic before switching to enforcement.
Rollout process:
- Deploy with
observationMode: true— DNS queries are logged (ALERT) but not blocked. - Analyze CloudWatch DNS query logs over 1-2 weeks of real usage.
- Add any missing domains to the platform baseline or Blueprint
egressAllowlist. - Switch to
observationMode: false— non-allowlisted domains are blocked (NODATA).
Query logging
Section titled “Query logging”DNS query logs are sent to a dedicated CloudWatch Logs log group with 30-day retention. Logs capture every DNS query from the VPC, including the queried domain, source IP, and the firewall action taken (ALLOW, ALERT, or BLOCK).
Fail-open mode
Section titled “Fail-open mode”The DNS Firewall is configured with FirewallFailOpen: ENABLED. If the DNS Firewall service experiences a transient issue, DNS queries are allowed through rather than blocked. This prevents a DNS Firewall outage from killing running agent sessions (which can last up to 8 hours).
Per-repo egressAllowlist
Section titled “Per-repo egressAllowlist”The Blueprint construct supports a networking.egressAllowlist prop:
new Blueprint(this, 'MyRepoBlueprint', { repo: 'org/my-repo', repoTable: repoTable.table, networking: { egressAllowlist: ['npm.internal.example.com', '*.private-registry.io'], },});Important: Per-repo egressAllowlist values are aggregated into the platform-wide DNS Firewall policy. They document intent and feed the allowlist, but they do not provide per-session isolation. All agent sessions in the VPC share the same DNS Firewall rules.
Limitations
Section titled “Limitations”- VPC-wide policy, not per-session — All agent sessions share one VPC and DNS Firewall rule group. AgentCore Runtime has no per-session network configuration. Per-repo
egressAllowlistentries are union-ed into the platform allowlist. - DNS-only — DNS Firewall intercepts DNS queries. A direct connection to an IP address (e.g.
curl https://1.2.3.4/) bypasses DNS and is not blocked. This is acceptable for the “confused agent” threat model (the agent uses domain names) but not for a sophisticated adversary. - Wildcard scope —
*.amazonaws.comis broad but necessary for VPC endpoint private DNS. GitHub wildcards (*.githubusercontent.com) include GitHub Pages, which is a potential exfiltration vector. Narrowing may be considered after analyzing query logs. - Missing ecosystems — The platform baseline covers npm and PyPI. Go (
proxy.golang.org), Rust (crates.io,static.crates.io), and OS packages (dl-cdn.alpinelinux.org) may need to be added based on observation mode logs.
Cost Impact
Section titled “Cost Impact”Estimated monthly cost of the network and edge security layer (~$145-150/month):
| Resource | Estimated Cost |
|---|---|
| NAT Gateway (1× fixed + data) | ~$32/month |
| Interface endpoints (7× $0.01/hr/AZ × 2 AZs) | ~$102/month |
| Flow logs (CloudWatch ingestion) | ~$3/month |
| DNS Firewall (queries) | <$1/month |
| DNS query log group (CloudWatch ingestion) | ~$1-3/month |
| WAFv2 Web ACL (3 rules + requests) | ~$6/month |
Security Considerations
Section titled “Security Considerations”- Defense in depth — Multiple layers restrict egress: security group (HTTPS-only), DNS Firewall (domain allowlist with observation or enforcement mode), and VPC endpoints (AWS service traffic stays on-network). See the DNS Firewall section for details and limitations.
- AWS service isolation — VPC endpoints keep AWS API traffic on the AWS network, reducing exposure.
- Audit trail — Flow logs record IP-level network activity; DNS query logs record domain-level resolution activity. Together they provide comprehensive egress audit visibility.
- Remaining gap — DNS Firewall does not prevent direct IP-based connections. A connection to
https://1.2.3.4/bypasses DNS resolution entirely. The security group still allows TCP 443 to0.0.0.0/0. This gap is acceptable for the “confused agent” threat model but not for a “sophisticated adversary” threat model. AWS Network Firewall (SNI-based filtering) would close this gap at significantly higher cost (~$274/month/endpoint). - Single NAT Gateway availability risk — The NAT Gateway is deployed in a single AZ to minimize cost (~$32/month vs ~$64/month for two). If that AZ experiences an outage, all agent sessions lose internet egress (GitHub API access). For a platform where sessions run up to 8 hours, losing egress mid-session means the agent cannot push code or create PRs. Mitigation options: (a) Accept the risk for cost-sensitive deployments (single-developer or small-team usage). (b) Add a second NAT Gateway in the other AZ for production deployments — the additional ~$32/month is justified by the availability improvement. (c) Use a NAT instance (cheaper, but operational overhead). The
Blueprintconstruct or stack props should allow configuring single vs. multi-AZ NAT (default: single for cost; opt-in to multi-AZ for production).