Cost model

This document provides an order-of-magnitude cost model for the platform. Cost efficiency is a first-class design principle (see ARCHITECTURE.md). The model covers infrastructure baseline costs, per-task variable costs, and cost attribution guidance.

Detailed cost management (per-user budgets, cost attribution dashboards, token budget enforcement) builds on this baseline analysis and focuses on the dominant cost drivers.

These costs are incurred regardless of task volume:

| Component | Estimated cost | Notes |
| --- | --- | --- |
| NAT Gateway (1×) | ~$32/month | Fixed hourly cost + data processing. Single AZ (see COMPUTE.md - Network architecture). |
| VPC Interface Endpoints (7×, 2 AZs) | ~$102/month | $0.01/hr × 7 endpoints × 2 AZs × 730 hrs. |
| VPC Flow Logs | ~$3/month | CloudWatch ingestion. |
| DynamoDB (on-demand, idle) | ~$0/month | Pay-per-request; 7 core tables (Tasks, Events, Nudges, Approvals, UserConcurrency, Webhooks, Repo). Integration tables add more when enabled (Slack: installation, user-mapping; Linear: project-mapping, user-mapping, workspace-registry, webhook-dedup). No cost when idle. |
| S3 Trace Artifacts bucket (idle) | ~$0/month | 7-day lifecycle auto-expires objects; no cost when no traces are stored. |
| EventBridge reconciler rule | <$0.01/month | Invokes Lambda every 5 min (288/day). Rule itself is free; Lambda invocation is the cost (see below). |
| Stranded task reconciler Lambda (idle) | <$0.01/month | 288 invocations/day × 256 MB × ~100 ms avg (early exit when no stranded tasks). ~$0.005/month total (requests + duration). |
| CloudWatch Logs retention | ~$1–5/month | Depends on log volume. 90-day retention. |
| API Gateway (idle) | ~$0/month | Pay-per-request. |
| Total baseline | ~$140–150/month | Reconciler adds negligible cost; VPC networking remains dominant. |

Most platform components are fully serverless and incur zero cost when idle: DynamoDB (PAY_PER_REQUEST, 7 core tables plus integration tables when Slack/Linear are enabled), Lambda, API Gateway, S3 (trace artifacts auto-expire in 7 days), SQS (fanout DLQ), ECS Fargate (cluster is free, when enabled), AgentCore Runtime (per-session), Bedrock (per-token), and Cognito (free tier). The stranded task reconciler adds <$0.01/month even when idle (288 Lambda invocations/day, early-exit). The always-on cost floor (~$140–150/month) is dominated by VPC networking infrastructure (NAT Gateway + 7 interface endpoints across 2 AZs), which is required for private subnet connectivity to AWS services and GitHub. See the Deployment guide for the full scale-to-zero breakdown.
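
The baseline arithmetic can be reproduced with a short sketch. The Lambda prices used here ($0.20 per million requests and $0.0000166667 per GB-second) are assumed list prices that vary by region and architecture, and the function names are illustrative rather than part of the platform.

```python
HOURS_PER_MONTH = 730  # same convention as the interface endpoint row


def vpc_endpoint_cost(endpoints=7, azs=2, hourly_rate=0.01):
    """Monthly interface endpoint cost: rate x endpoints x AZs x hours."""
    return hourly_rate * endpoints * azs * HOURS_PER_MONTH


def reconciler_lambda_cost(invocations_per_day=288, memory_mb=256, avg_ms=100,
                           days=30,
                           price_per_million_requests=0.20,   # assumed list price
                           price_per_gb_second=0.0000166667):  # assumed list price
    """Monthly cost of the idle reconciler: request charge plus duration charge."""
    invocations = invocations_per_day * days
    request_cost = invocations / 1_000_000 * price_per_million_requests
    gb_seconds = invocations * (memory_mb / 1024) * (avg_ms / 1000)
    return request_cost + gb_seconds * price_per_gb_second


def baseline_range(nat_gateway=32.0, flow_logs=3.0, logs_low=1.0, logs_high=5.0):
    """Sum the itemized rows; CloudWatch Logs is the only row given as a range."""
    fixed = nat_gateway + vpc_endpoint_cost() + flow_logs + reconciler_lambda_cost()
    return fixed + logs_low, fixed + logs_high


if __name__ == "__main__":
    low, high = baseline_range()
    print(f"endpoints:  ${vpc_endpoint_cost():.2f}/month")
    print(f"reconciler: ${reconciler_lambda_cost():.4f}/month")
    print(f"baseline:   ${low:.0f}-{high:.0f}/month")
```

Summing the itemized rows this way gives roughly $138–142/month, slightly below the stated ~$140–150 total. The difference presumably leaves headroom for NAT Gateway data processing, which this sketch does not model.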

Each task incurs costs proportional to its duration, token consumption, and compute usage. The dominant cost driver is Bedrock model invocation (token cost), not infrastructure.

Cost breakdown per task (order of magnitude)

Assuming a typical task of 1–2 hours on Claude Sonnet, with ~50 turns at ~100K input tokens and ~20K output tokens per turn:

| Component | Estimated cost per task | Calculation basis |
| --- | --- | --- |
| Bedrock tokens (dominant) | $2–15 | Varies widely by model, task complexity, and turn count. Claude Sonnet: ~$3/M input tokens, ~$15/M output tokens. A 50-turn task with 100K input + 20K output per turn ≈ 5M input + 1M output ≈ $15 + $15 = $30 at list price. Prompt caching reduces this significantly (up to 90% for cache hits). Typical range: $2–15 after caching. |
| AgentCore Runtime compute | $0.10–0.50 | 2 vCPU / 8 GB for 1–2 hours. Pricing model is per-session based on vCPU-hours and GB-hours. |
| Lambda orchestrator | <$0.01 | ~10 invocations per task (admission, hydration, polling, finalization). Negligible. |
| Lambda fanout consumer | <$0.01 | Triggered per batch of task events (batch size 100, 5 s window). Typically 5–20 invocations per task at 256 MB. Negligible. |
| Lambda nudge / trace / events | <$0.01 | On-demand per user request. Negligible unless heavily polled. |
| DynamoDB reads/writes | <$0.01 | ~30–80 operations per task (task CRUD, events, nudges, counter updates). Negligible. |
| DynamoDB Streams (fanout) | <$0.01 | Stream reads are billed per GetRecords request (up to 1 MB each). Typical task: ~20–50 event records. Negligible. |
| S3 trace upload (if --trace) | <$0.01 | One PUT per task + storage (gzipped NDJSON, typically 50–500 KB, auto-expires in 7 days). |
| NAT Gateway data | <$0.01 | GitHub API traffic: clone + push. Small repos: <10 MB. |
| Custom step Lambdas | $0–0.05 | Only if configured. Per-invocation: ~$0.01 per step. |
| Total per task | $2–15 | Bedrock tokens dominate (>90% of per-task cost). New interactive features add <$0.01 per task. |
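
The token arithmetic in the Bedrock row can be checked with a small estimator. This is a simplified sketch, not the platform's billing logic: it assumes that caching discounts only input tokens, that a cache hit costs 10% of the input price (the "up to 90%" figure), and it ignores any cache-write surcharge. The function and parameter names are illustrative.

```python
def bedrock_task_cost(turns=50, input_per_turn=100_000, output_per_turn=20_000,
                      input_price_per_m=3.0, output_price_per_m=15.0,
                      cache_hit_ratio=0.0, cache_discount=0.9):
    """Estimate the token cost of one task in USD.

    cache_hit_ratio is the fraction of input tokens served from the prompt
    cache; cache_discount is the price reduction applied to those tokens.
    Both are modelling assumptions, not measured values.
    """
    input_cost = turns * input_per_turn / 1e6 * input_price_per_m
    output_cost = turns * output_per_turn / 1e6 * output_price_per_m
    # Only the cached share of the input tokens gets the discount.
    discounted_input = input_cost * (1 - cache_hit_ratio * cache_discount)
    return discounted_input + output_cost


if __name__ == "__main__":
    print(f"list price:        ${bedrock_task_cost():.2f}")
    print(f"fully cached:      ${bedrock_task_cost(cache_hit_ratio=1.0):.2f}")
    print(f"lighter task:      ${bedrock_task_cost(turns=20, output_per_turn=5_000, cache_hit_ratio=0.8):.2f}")
```

Under these assumptions the 50-turn example costs $30.00 at list price and about $16.50 with every input token cached, because the 1M output tokens alone cost $15. That suggests the $2–15 typical range reflects tasks lighter than the worked example (fewer turns or smaller outputs), such as the 20-turn case above at about $3.18.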

The screenshot pipeline (see Deploy preview screenshots guide) is opt-in per repo and deterministic: it uses no LLM and no agent runtime. It fires only when a connected deploy provider posts deployment_status: success.

| Component | Estimated cost per screenshot | Notes |
| --- | --- | --- |
| AgentCore Browser session | $0.005–0.015 | ~30–60 s of aws.browser.v1 for navigate + capture. Per-second billing. |
| Lambda processor | <$0.001 | 512 MB, ~10–20 s wall time per invocation. |
| S3 PutObject + storage | <$0.001 | One PNG (~200 KB–2 MB), 30-day TTL via lifecycle. |
| CloudFront request + bytes-out | <$0.001 | First-render fetch from GitHub markdown image proxy + a small number of viewer fetches. |
| Total per screenshot | ~$0.01 | Dominated by AgentCore Browser session time. |

Baseline overhead (CloudFront distribution + S3 bucket idle) is <$1/month and absorbed into the existing infrastructure baseline above. CloudFront has no per-distribution monthly fee; you pay only per-request and per-byte-out.

A high-volume team with ~500 preview deploys per month would add ~$5/month (500 × ~$0.01) to the per-task variable line, which is a rounding error compared to Bedrock token costs.

| Factor | Impact on cost | Mitigation |
| --- | --- | --- |
| Model choice | 5–10× between Haiku and Opus | Default to Claude Sonnet; allow per-repo override. |
| Turn count | Linear with turns | max_turns cap (default 100, configurable 1–500). |
| Cost budget | Hard stop at budget | max_budget_usd cap (configurable $0.01–$100). Agent stops when budget is reached regardless of remaining turns. |
| Task duration | Sub-linear (compute is cheap; tokens dominate) | AgentCore: 8-hour service limit; orchestrator: 9-hour executionTimeout. |
| Prompt caching | 50–90% token cost reduction | Enable by default; cache system prompts and repo context. |
| Concurrency | Linear with parallel tasks | Per-user and system-wide concurrency limits. |
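
The turn cap and cost budget interact in a simple way: whichever limit is hit first stops the task, and the budget applies regardless of remaining turns. The sketch below illustrates that rule together with the configurable ranges from the table. `TaskLimits` and `stop_reason` are hypothetical names for illustration, not the platform's actual types, and treating an unset budget as unlimited is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaskLimits:
    """Per-task caps, validated against the ranges in the table above."""
    max_turns: int = 100                     # configurable 1-500
    max_budget_usd: Optional[float] = None   # configurable $0.01-$100; None = no cap (assumed)

    def __post_init__(self):
        if not 1 <= self.max_turns <= 500:
            raise ValueError("max_turns must be between 1 and 500")
        if self.max_budget_usd is not None and not 0.01 <= self.max_budget_usd <= 100:
            raise ValueError("max_budget_usd must be between 0.01 and 100")


def stop_reason(limits: TaskLimits, turns_used: int, cost_usd: float) -> Optional[str]:
    """Return why the agent should stop, or None if it may continue."""
    # The budget is checked first: it is a hard stop even when turns remain.
    if limits.max_budget_usd is not None and cost_usd >= limits.max_budget_usd:
        return "budget_exhausted"
    if turns_used >= limits.max_turns:
        return "max_turns_reached"
    return None


if __name__ == "__main__":
    limits = TaskLimits(max_turns=100, max_budget_usd=5.0)
    print(stop_reason(limits, turns_used=40, cost_usd=5.02))
```

A check like this would run after every turn, using the running cost estimate for the task.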

| Scale | Tasks/month | Estimated monthly cost (infra + tasks) |
| --- | --- | --- |
| Low (1 developer) | 30–60 | $200–550 |
| Medium (small team) | 200–500 | $550–3,000 |
| High (org-wide) | 2,000–5,000 | $5,000–30,000 |

These estimates assume Claude Sonnet with prompt caching enabled and average task complexity.
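
The monthly figures follow the form baseline + tasks × per-task cost. As a rough cross-check, the sketch below back-solves the per-task cost implied by each tier, assuming the ~$140–150 baseline applies at both ends of each range. The table values are copied from this document; the function names are illustrative.

```python
BASELINE_USD = (140, 150)  # always-on cost floor, low and high

# tier -> ((tasks_low, tasks_high), (monthly_cost_low, monthly_cost_high))
SCALE_TABLE = {
    "low": ((30, 60), (200, 550)),
    "medium": ((200, 500), (550, 3_000)),
    "high": ((2_000, 5_000), (5_000, 30_000)),
}


def monthly_cost(tasks, per_task_usd, baseline_usd):
    """Projected monthly cost: fixed infrastructure plus variable task cost."""
    return baseline_usd + tasks * per_task_usd


def implied_per_task(tier):
    """Per-task cost that reproduces the tier's low and high monthly estimates."""
    (tasks_low, tasks_high), (cost_low, cost_high) = SCALE_TABLE[tier]
    low = (cost_low - BASELINE_USD[0]) / tasks_low
    high = (cost_high - BASELINE_USD[1]) / tasks_high
    return round(low, 2), round(high, 2)


if __name__ == "__main__":
    for tier in SCALE_TABLE:
        print(tier, implied_per_task(tier))
```

Back-solving this way yields roughly $2–7 per task in every tier, which sits in the lower half of the $2–15 per-task range and is consistent with the average-complexity assumption.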

For multi-user deployments, cost should be attributable to individual users and repositories:

  • Per-task: Token usage and compute duration are captured in task metadata (agent.cost_usd, agent.turns - see OBSERVABILITY.md). Note: agent.cost_usd is the Claude Agent SDK’s client-side estimate (a build-time price table), not authoritative billing — use it for guardrails, and AWS Cost Explorer / CUR 2.0 for the real bill (see COST_ATTRIBUTION.md).
  • Per-user: Aggregate task costs by user_id.
  • Per-repo: Aggregate task costs by repo.
  • Dashboard: Cost attribution dashboards should be built from the same task-level metrics.
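
The per-user and per-repo roll-ups above amount to a group-by over task records. A minimal sketch, assuming each record is a flat dict carrying `user_id`, `repo`, and the `cost_usd` estimate (the record shape and sample values are hypothetical):

```python
from collections import defaultdict


def aggregate_costs(tasks, key):
    """Sum the per-task cost estimate, grouped by the given field."""
    totals = defaultdict(float)
    for task in tasks:
        totals[task[key]] += task["cost_usd"]
    return dict(totals)


# Hypothetical task records; real ones would come from task metadata.
TASKS = [
    {"task_id": "t1", "user_id": "alice", "repo": "org/api", "cost_usd": 4.25, "turns": 38},
    {"task_id": "t2", "user_id": "alice", "repo": "org/web", "cost_usd": 2.5, "turns": 21},
    {"task_id": "t3", "user_id": "bob", "repo": "org/api", "cost_usd": 7.5, "turns": 64},
]

if __name__ == "__main__":
    print("by user:", aggregate_costs(TASKS, "user_id"))
    print("by repo:", aggregate_costs(TASKS, "repo"))
```

Because `cost_usd` is a client-side estimate, totals computed this way are suitable for guardrails and dashboards but will not match the AWS bill exactly.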

For AWS-native chargeback of Bedrock spend beyond the in-app cost_usd meter above (Cost Explorer / CUR 2.0 by user_id / repo, plus per-call invocation-log forensics), see the operator guide COST_ATTRIBUTION.md and the platform design BEDROCK_COST_ATTRIBUTION.md.

| Guardrail | Mechanism | Default |
| --- | --- | --- |
| Turn limit | max_turns per task | 100 |
| Cost budget | max_budget_usd per task | None (unlimited) |
| Session timeout | Orchestrator timeout | 9 hours |
| Concurrency limit | Per-user atomic counter | 3 concurrent tasks |
| System concurrency | System-wide counter | Account-level AgentCore quota |

  • Per-user monthly token budgets with alerts at 80% and hard stop at 100%.
  • Per-team monthly cost budgets.
  • Cost attribution dashboard in the control panel.
  • Automated model downgrade (e.g. Sonnet -> Haiku) when approaching budget limits.
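
One common way to implement a per-user atomic counter like the concurrency guardrail in the table above is a DynamoDB conditional update. The sketch below only builds the request parameters and makes no AWS call. The `user_id` key and the `active_count` attribute are hypothetical, and the platform's actual schema for the UserConcurrency table may differ.

```python
def build_admission_update(user_id, limit=3, table_name="UserConcurrency"):
    """Parameters for an atomic 'increment if below limit' update.

    DynamoDB evaluates the condition and applies the update as one atomic
    operation, so two concurrent admissions cannot both take the last slot.
    """
    return {
        "TableName": table_name,
        "Key": {"user_id": {"S": user_id}},            # hypothetical key schema
        "UpdateExpression": "ADD active_count :one",   # creates the attribute if absent
        "ConditionExpression": "attribute_not_exists(active_count) OR active_count < :limit",
        "ExpressionAttributeValues": {
            ":one": {"N": "1"},
            ":limit": {"N": str(limit)},
        },
    }


def build_release_update(user_id, table_name="UserConcurrency"):
    """Parameters for decrementing the counter when a task finishes."""
    return {
        "TableName": table_name,
        "Key": {"user_id": {"S": user_id}},
        "UpdateExpression": "ADD active_count :minus_one",
        "ConditionExpression": "active_count > :zero",  # never drop below zero
        "ExpressionAttributeValues": {
            ":minus_one": {"N": "-1"},
            ":zero": {"N": "0"},
        },
    }


if __name__ == "__main__":
    print(build_admission_update("alice"))
```

These dictionaries could be passed to `boto3.client("dynamodb").update_item(**params)`. When the user is already at the limit, the condition fails and the call raises a `ConditionalCheckFailedException`, which the admission step would treat as a rejected task.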