Security

ABCA agents execute code with repository access. This document describes how the platform contains that risk: isolated sessions, scoped credentials, input screening, policy enforcement, and memory integrity controls. The design aligns with AWS prescriptive guidance for agentic AI security.

  • Use this doc for: understanding the security boundaries, what can go wrong, and how the platform mitigates each threat.
  • Related docs: COMPUTE.md for runtime isolation details, MEMORY.md for memory threat analysis, REPO_ONBOARDING.md for per-repo security configuration, INPUT_GATEWAY.md for authentication flows.

Security by default. Isolated sandboxed environments, least-privilege credentials, and fine-grained access control are non-negotiable. The blast radius of any agent mistake is limited to one branch in one repository.

Each task runs in its own isolated session with dedicated compute, memory, and filesystem (a MicroVM). No storage or context is shared between sessions, which prevents data leakage between users and tasks and contains compromise to a single session.

  • Lifecycle - Sessions are created per task and destroyed when the task ends. Temporary resources are discarded on termination.
  • Identifiers - Session and task IDs partition all state. The runtime encapsulates conversation history, reasoning state, and retrieved knowledge per session.
  • Timeouts - Duration and idle timeouts prevent resource leaks and unbounded sessions.

The agent runs with full permissions inside the sandbox but cannot escape it. The security boundary is the isolated runtime (MicroVM), not in-agent permission prompts.

  • Worst case - A compromised agent can affect one branch in one repo. It can create or modify code and open a PR. It cannot touch other repos, other users’ tasks, or production.
  • Human review - PR review is the final gate before merge. The agent cannot merge its own PRs.
  • No shared state - Tasks do not share memory or storage. One compromised session cannot corrupt another.

Two authentication mechanisms protect the platform, matching the two input channels:

| Channel | Mechanism | Details |
| --- | --- | --- |
| CLI / REST API | Amazon Cognito JWT | Users authenticate and receive tokens. The input gateway verifies every request. |
| Webhooks | HMAC-SHA256 | Per-integration shared secrets stored in Secrets Manager. Secrets are shown once at creation and scheduled for deletion with a 7-day recovery window on revocation. |
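
As an illustration of the webhook row, the following is a minimal sketch of HMAC-SHA256 verification over the raw request body. The `sha256=<hex>` header format and the function names are assumptions made for the example; the actual wire format is defined in INPUT_GATEWAY.md, not here.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Return the hex HMAC-SHA256 digest of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_webhook(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Check a 'sha256=<hex>' header (assumed format) against the recomputed digest."""
    prefix = "sha256="
    if not signature_header.startswith(prefix):
        return False
    provided = signature_header[len(prefix):]
    expected = sign_payload(secret, body)
    # compare_digest runs in constant time, which avoids timing side channels.
    return hmac.compare_digest(expected.encode(), provided.encode())
```

The digest is computed over the exact bytes received, so verification has to happen before any JSON parsing or re-serialization.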

Authorization is user-scoped: any authenticated user can submit tasks, but users can only view and cancel their own tasks (user_id enforcement). Webhook management enforces ownership with 404 (not 403) to avoid leaking webhook existence.
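
The 404-instead-of-403 behavior can be sketched as a lookup that treats "does not exist" and "exists but belongs to someone else" identically. The `Webhook` type, the in-memory store, and the function name are hypothetical and only illustrate the idea.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Webhook:
    webhook_id: str
    owner_id: str


def get_webhook(
    store: dict[str, Webhook], webhook_id: str, caller_id: str
) -> tuple[int, Webhook | None]:
    """Return (status, webhook); missing and not-owned are indistinguishable."""
    hook = store.get(webhook_id)
    if hook is None or hook.owner_id != caller_id:
        # Same status for both cases, so a caller cannot probe for existence.
        return 404, None
    return 200, hook
```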

Agent credentials - GitHub access currently uses a PAT stored in Secrets Manager. The orchestrator reads the secret at hydration time and passes it to the agent runtime. The model never receives the token in its context. Planned: replace the shared PAT with a GitHub App via AgentCore Identity Token Vault, providing per-task, repo-scoped, short-lived tokens (see ROADMAP.md).

Input screening happens at two points in the pipeline, forming a defense-in-depth chain. Content that passes submission screening is screened again during hydration when external data (GitHub issues, PR comments) is added to the prompt.

Submission-time screening:

  • Input validation - Required fields, types, and size limits are enforced before any processing. Task descriptions are capped at 2,000 characters.
  • Bedrock Guardrails - A PROMPT_ATTACK content filter at HIGH strength screens task descriptions for prompt injection.
  • Fail-closed - If the Bedrock API is unavailable, submissions are rejected (HTTP 503). Unscreened content never reaches the agent.

Hydration-time screening:

  • PR tasks (pr_iteration, pr_review) - The assembled prompt (PR body, review comments, diff, task description) is screened through Bedrock Guardrails before the agent receives it.
  • new_task with issue content - The assembled prompt (issue body, comments, task description) is screened. When no issue content is present, hydration-time screening is skipped because the task description was already screened at submission.
  • Fail-closed - A Bedrock outage during hydration fails the task. A guardrail_blocked event is emitted when content is blocked.
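
The submission-time checks can be sketched as a single function that validates first, screens second, and rejects on any screening outage. The `screen` callable stands in for the Bedrock Guardrails call and `GuardrailUnavailable` is a hypothetical exception. Only the 2,000-character cap and the 503 on outage come from the text; the other status codes are assumptions.

```python
from typing import Callable

MAX_DESCRIPTION_CHARS = 2000  # cap stated in the text


class GuardrailUnavailable(Exception):
    """Hypothetical error raised when the screening API cannot be reached."""


def screen_submission(description: str, screen: Callable[[str], bool]) -> tuple[int, str]:
    """Return (http_status, reason). `screen` returns True when content is blocked."""
    if not isinstance(description, str) or not description.strip():
        return 400, "description is required"
    if len(description) > MAX_DESCRIPTION_CHARS:
        return 400, "description too long"
    try:
        blocked = screen(description)
    except GuardrailUnavailable:
        # Fail closed: an outage rejects the request instead of skipping the check.
        return 503, "screening unavailable"
    if blocked:
        return 400, "blocked by guardrail"  # status code is an assumption
    return 201, "accepted"  # status code is an assumption
```

The important property is that there is no code path on which an exception from `screen` leads to acceptance.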

The agent’s tools are allowlisted. An unrestricted tool surface increases the risk of confused deputy attacks and unintended data exfiltration. ABCA follows a tiered model:

| Tier | Scope | Tools |
| --- | --- | --- |
| Default (all repos) | Minimal, predictable | Bash (allowlisted subcommands), git (limited), verify (formatters, linters, tests), filesystem (within sandbox) |
| Extended (opt-in per repo) | Additional capabilities | MCP servers, plugins, code search, documentation lookup |

Per-repo tool profiles are stored in onboarding config and loaded during context hydration. AgentCore Gateway enforces which tools are reachable at the platform level (not a prompt-level suggestion). For tools not mediated by the Gateway (bash, filesystem), enforcement relies on sandbox permissions, network egress rules, and the bash allowlist.
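
A minimal sketch of how a per-repo tool profile might resolve to an effective allowlist under the tiered model. The tool identifiers and the `extended_tools` key are hypothetical; the real profile schema lives in the onboarding config.

```python
from __future__ import annotations

# Hypothetical identifiers for the two tiers described in the table.
DEFAULT_TOOLS = frozenset({"bash", "git", "verify", "filesystem"})
EXTENDED_TOOLS = frozenset({"mcp", "plugins", "code_search", "docs_lookup"})


def resolve_tools(repo_profile: dict) -> frozenset[str]:
    """Default tier plus any extended tools the repo has explicitly opted into."""
    requested = set(repo_profile.get("extended_tools", []))
    unknown = requested - EXTENDED_TOOLS
    if unknown:
        # Reject unknown names rather than silently widening the tool surface.
        raise ValueError(f"unknown extended tools: {sorted(unknown)}")
    return DEFAULT_TOOLS | requested


def is_tool_allowed(tool: str, repo_profile: dict) -> bool:
    return tool in resolve_tools(repo_profile)
```

Anything not named in either tier is denied by construction, which is the difference between an allowlist and a denylist.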

The blueprint framework (REPO_ONBOARDING.md) allows per-repo custom Lambda steps in the orchestrator pipeline. These are a trust boundary that requires specific attention.

Deployment control - Custom steps are defined in the Blueprint CDK construct and deployed via cdk deploy. Only principals with CDK deployment permissions can add or modify them. There is no runtime API for custom step CRUD.

The same deploy-only property extends to Blueprint.security.cedarPolicies. User-authored Cedar policies live in the CDK source, are typed as readonly string[] on the construct, and reach RepoTable only through a CloudFormation custom resource invoked at deploy time. Phase 3 (Cedar-driven HITL approval gates; see PHASE3_CEDAR_HITL.md) depends on this property: the engine treats Cedar policies loaded at task start as trusted content. If the blueprint model ever changes to accept user-uploaded policy text via an API path, Phase 3’s §12 trust model must be re-evaluated and extended with a per-blueprint policy count cap, a per-evaluation timeout, and a size cap.

Input filtering - The framework strips credential ARNs (github_token_secret_arn) and networking configuration (egress_allowlist) from the config before passing it to custom Lambda steps. If a custom step needs secrets, it must declare them explicitly and the operator must grant IAM permissions.
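
The filtering step can be sketched as a recursive copy that drops the two sensitive keys named above. Whether the framework also strips nested occurrences is an assumption of this sketch, as is the function name.

```python
# Key names come from the text; recursion into nested values is an assumption.
SENSITIVE_KEYS = frozenset({"github_token_secret_arn", "egress_allowlist"})


def filter_config(value):
    """Return a copy of the config with sensitive keys removed at every level."""
    if isinstance(value, dict):
        return {k: filter_config(v) for k, v in value.items() if k not in SENSITIVE_KEYS}
    if isinstance(value, list):
        return [filter_config(v) for v in value]
    return value
```

Returning a copy leaves the original config intact for framework-owned steps that still need the stripped fields.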

What a custom step can do:

  • Fail or delay the pipeline (up to its timeout)
  • Return misleading metadata that influences later steps

What a custom step cannot do:

  • Skip framework invariants (state transitions, events, cancellation, concurrency)
  • Access other tasks’ context
  • Modify the step sequence at runtime
  • Bypass admission control or concurrency limits

Cross-account - functionArn should be validated at CDK synth time to ensure it belongs to the same AWS account as the platform. Cross-account invocation requires explicit opt-in (allowCrossAccountSteps: true).
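
A minimal sketch of the account check, assuming the ARN is available as a plain string. The function names are hypothetical, and `allow_cross_account` mirrors the `allowCrossAccountSteps` opt-in.

```python
def arn_account_id(function_arn: str) -> str:
    """Extract the account ID from a Lambda function ARN."""
    parts = function_arn.split(":")
    # Layout: arn:partition:service:region:account-id:resource...
    if len(parts) < 6 or parts[0] != "arn" or parts[2] != "lambda":
        raise ValueError(f"not a Lambda ARN: {function_arn!r}")
    return parts[4]


def validate_step_arn(
    function_arn: str, deploy_account: str, allow_cross_account: bool = False
) -> None:
    """Raise if the step lives in another account and no opt-in was given."""
    account = arn_account_id(function_arn)
    if account != deploy_account and not allow_cross_account:
        raise ValueError(
            f"custom step is in account {account}, expected {deploy_account}"
        )
```

String parsing only works for concrete ARNs. An ARN that is still an unresolved CDK token at synth time would need a different check, which this sketch does not cover.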

The platform is self-hosted in the customer’s AWS account. No code or repo data is sent to third-party infrastructure by default. Multiple layers provide defense in depth:

| Layer | Mechanism | What it protects against |
| --- | --- | --- |
| Edge | AWS WAFv2 (common rules, known bad inputs, rate limit: 1,000 req/5 min/IP) | Web exploits, volumetric abuse |
| Network | DNS Firewall domain allowlist (GitHub, npm, PyPI, AWS services) | Agent reaching unauthorized domains |
| Network | Security group egress restricted to TCP 443 | Non-HTTPS traffic |
| Compute | MicroVM isolation per session | Cross-session compromise |
| Credentials | Secrets Manager with scoped IAM | Credential theft |
| Audit | Bedrock model invocation logging (90-day retention) | Prompt injection investigation, compliance |
| Deployment | CDK infrastructure as code | Consistent, auditable deployments |

DNS Firewall note: Currently in observation mode (non-allowlisted domains are logged as ALERT but not blocked). Per-repo egressAllowlist entries are aggregated into the platform-wide policy. DNS Firewall does not block direct IP connections, which is acceptable for the “confused agent” threat model but not for sophisticated adversaries. See COMPUTE.md for the enforcement rollout process.
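
The difference between observation and enforcement mode can be illustrated with a small model of the allowlist decision. This is a simplification for explanation only: it is not how DNS Firewall rule groups are configured, and the suffix-matching rule and function names are assumptions.

```python
from __future__ import annotations


def aggregate_allowlist(platform: set[str], per_repo: dict[str, list[str]]) -> set[str]:
    """Merge per-repo egress entries into one platform-wide allowlist."""
    merged = {d.lower() for d in platform}
    for domains in per_repo.values():
        merged.update(d.lower() for d in domains)
    return merged


def domain_allowed(domain: str, allowlist: set[str]) -> bool:
    """True if the domain equals, or is a subdomain of, an allowlisted entry."""
    name = domain.lower().rstrip(".")
    return any(name == entry or name.endswith("." + entry) for entry in allowlist)


def dns_action(domain: str, allowlist: set[str], enforce: bool = False) -> str:
    """Observation mode logs ALERT; enforcement mode would BLOCK."""
    if domain_allowed(domain, allowlist):
        return "ALLOW"
    return "BLOCK" if enforce else "ALERT"
```

Because entries are aggregated, a domain added for one repo becomes reachable from every repo's sessions in this model.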

The platform enforces policies at multiple points in the task lifecycle. Today, these are implemented inline across handlers, constructs, and agent code. A centralized Cedar-based policy framework is planned (see ROADMAP.md).

flowchart LR
    subgraph Submission
        A[Input validation] --> B[Repo onboarding gate]
        B --> C[Guardrail screening]
        C --> D[Idempotency check]
    end
    subgraph Orchestration
        E[Concurrency limit] --> F[Pre-flight checks]
        F --> G[Guardrail prompt screening]
        G --> H[Budget/quota resolution]
    end
    subgraph Execution
        I[Cedar tool-call policy] --> J[Output secret screening]
        J --> K[Turn/cost budget]
    end
    subgraph Finalization
        L[Build/lint verification]
    end
    Submission --> Orchestration --> Execution --> Finalization

| Phase | Policy | Location | Audit |
| --- | --- | --- | --- |
| Submission | Input validation | validation.ts, create-task-core.ts | HTTP error only |
| Submission | Repo onboarding gate | repo-config.ts | HTTP error only |
| Submission | Guardrail screening | create-task-core.ts | HTTP error only |
| Admission | Concurrency limit | orchestrator.ts | admission_rejected event |
| Pre-flight | GitHub access, PAT permissions, PR access | preflight.ts | preflight_failed event |
| Hydration | Guardrail prompt screening | context-hydration.ts | guardrail_blocked event |
| Hydration | Budget/quota resolution | orchestrator.ts | Persisted on task record |
| Execution | Tool-call policy (Cedar) | agent/src/hooks.py, agent/src/policy.py | POLICY_DECISION telemetry |
| Execution | Output secret screening | agent/src/output_scanner.py | OUTPUT_SCREENING telemetry |
| Execution | Turn/cost budget | Claude Agent SDK | Cost in task result |
| Finalization | Build/lint verification | agent/src/post_hooks.py | Task record and PR body |
| Infrastructure | DNS Firewall, WAF | CDK constructs | CloudWatch logs |

Audit gap: Submission-time rejections currently return HTTP errors without structured audit events. Planned: a unified PolicyDecisionEvent schema across all phases (see ROADMAP.md).

Once an agent session starts, two mechanisms enforce policy without requiring an external sidecar:

Tool-call interceptor (Guardian pattern). A Cedar-based policy engine (agent/src/policy.py) evaluates tool calls via the Claude Agent SDK’s hook system:

  • Pre-execution (PreToolUse hook) - Validates tool inputs before execution. pr_review agents cannot use Write/Edit. Writes to .git/* are blocked. Destructive bash commands are denied. Fail-closed: if Cedar is unavailable, all calls are denied. Per-repo custom Cedar policies are supported via Blueprint security.cedarPolicies.
  • Post-execution (PostToolUse hook) - Screens tool outputs for secrets (AWS keys, GitHub tokens, private keys, connection strings). Detected secrets are redacted before re-entering the agent context (steered enforcement, not blocking).
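
The post-execution redaction step can be sketched with a few regular expressions. The patterns below cover commonly documented token shapes and are assumptions for illustration; they are not the patterns used in agent/src/output_scanner.py, and connection strings are omitted.

```python
import re

# Illustrative patterns only; a real scanner would carry a broader, tested set.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bgh[pousr]_[A-Za-z0-9]{36,}\b"),
    "private_key": re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
        re.DOTALL,
    ),
}


def redact_secrets(text: str) -> tuple[str, list[str]]:
    """Replace detected secrets with placeholders; return (text, kinds found)."""
    found = []
    for name, pattern in SECRET_PATTERNS.items():
        text, count = pattern.subn(f"[REDACTED:{name}]", text)
        if count:
            found.append(name)
    return text, found
```

Redacting instead of blocking lets the tool call succeed while keeping the secret out of the agent's context, which matches the "steered enforcement" wording above.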

Behavioral circuit breaker. Monitors tool-call patterns within a session: call frequency, cumulative cost, repeated failures, and file mutation rate. When thresholds are exceeded (e.g. >50 calls/min, >$10 cost, >5 consecutive failures), the session is paused or terminated. Thresholds are configurable per-repo via Blueprint security props.
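
A minimal sketch of such a breaker, using the example thresholds from the paragraph above as defaults. The class, its method, and the returned labels are hypothetical, and file mutation rate is left out to keep the example short.

```python
from __future__ import annotations

from collections import deque
from dataclasses import dataclass, field


@dataclass
class CircuitBreaker:
    max_calls_per_min: int = 50
    max_cost_usd: float = 10.0
    max_consecutive_failures: int = 5
    _calls: deque = field(default_factory=deque)
    _cost: float = 0.0
    _failures: int = 0

    def record(self, now: float, cost_usd: float, ok: bool) -> str | None:
        """Record one tool call; return the tripped threshold name, or None."""
        self._calls.append(now)
        # Keep only timestamps inside the trailing 60-second window.
        while self._calls and now - self._calls[0] >= 60.0:
            self._calls.popleft()
        self._cost += cost_usd
        self._failures = 0 if ok else self._failures + 1
        if len(self._calls) > self.max_calls_per_min:
            return "call_rate"
        if self._cost > self.max_cost_usd:
            return "cost"
        if self._failures > self.max_consecutive_failures:
            return "consecutive_failures"
        return None
```

The caller decides whether a tripped threshold pauses or terminates the session; the breaker only reports which limit was exceeded.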

The platform’s memory system (MEMORY.md) faces threats from both intentional attacks and emergent corruption. OWASP classifies memory poisoning as ASI06 in the 2026 Top 10 for Agentic Applications, recognizing that persistent memory attacks are fundamentally different from single-session prompt injection: poisoned entries influence every subsequent interaction.

| Vector | Description | Entry point |
| --- | --- | --- |
| PR review comment injection | Malicious instructions disguised as review rules get stored as persistent memory | pr_iteration hydration |
| Query-based injection (MINJA) | Crafted task descriptions embed content the agent stores as legitimate memory | Task submission |
| GitHub issue injection | Adversarial issue content containing memory-poisoning payloads | new_task hydration |
| Experience grafting | Manipulated episodic memory induces behavioral drift | Post-task memory extraction |
| Poisoned RAG retrieval | Content engineered to rank highly for specific semantic queries | Memory retrieval |
| Self-corruption | Hallucination crystallization, error feedback loops, stale context accumulation | Agent’s own memory writes |

  1. Input moderation with trust scoring - Content sanitization and injection pattern detection before memory write. sanitizeExternalContent() strips HTML injection, prompt injection patterns, control characters, and bidi overrides. Content trust metadata (trusted, untrusted-external, memory) tags each source.
  2. Provenance tagging - Every memory entry carries source type, content hash (SHA-256), and schema version. Hashes serve as audit trail (not retrieval gates, since AgentCore’s extraction pipeline legitimately transforms content).
  3. Storage isolation - Per-repo namespace isolation, expiration limits, and size caps. For multi-tenant deployments, separate AgentCore Memory resources per organization (silo model).
  4. Guardrail screening - Assembled prompts are screened through Bedrock Guardrails before reaching the agent (fail-closed).
  5. Review feedback quorum - Only promote feedback to persistent rules if the same pattern appears from multiple trusted reviewers across multiple PRs. Single review comments never become permanent rules.
  6. Blast radius containment - Even if poisoned rules get through, the agent cannot modify CI/CD pipelines, change branch protection, access secrets beyond its scoped token, or push to protected branches.
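
Two of the layers above, provenance tagging and the review feedback quorum, can be sketched in a few lines. Field names, the schema version value, and the default quorum sizes are assumptions; the text only requires "multiple" reviewers and "multiple" PRs.

```python
import hashlib
from collections import defaultdict

SCHEMA_VERSION = 1  # hypothetical value


def provenance(content: str, source_type: str) -> dict:
    """Build the provenance tag attached to a memory entry."""
    return {
        "source_type": source_type,
        "content_sha256": hashlib.sha256(content.encode("utf-8")).hexdigest(),
        "schema_version": SCHEMA_VERSION,
    }


def rules_meeting_quorum(feedback, trusted_reviewers, min_reviewers=2, min_prs=2):
    """Return patterns seen from enough distinct trusted reviewers across enough PRs.

    `feedback` is an iterable of (pattern, reviewer, pr_number) tuples.
    """
    reviewers = defaultdict(set)
    prs = defaultdict(set)
    for pattern, reviewer, pr_number in feedback:
        if reviewer in trusted_reviewers:  # untrusted feedback never counts
            reviewers[pattern].add(reviewer)
            prs[pattern].add(pr_number)
    return {
        p for p in reviewers
        if len(reviewers[p]) >= min_reviewers and len(prs[p]) >= min_prs
    }
```

Counting distinct reviewers and distinct PRs separately means one person repeating a comment, or several comments on one PR, cannot promote a rule alone.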

Planned: Trust-scored retrieval with temporal decay, anomaly detection on write patterns, and write-ahead guardian validation (see ROADMAP.md).

  • Point-in-time recovery (PITR) on all DynamoDB tables (Tasks, TaskEvents, UserConcurrency, Webhooks). 35-day retention, per-second granularity.
  • On-demand backups before major deployments or schema migrations.

AgentCore Memory has no native backup mechanism. Mitigation:

  • Periodic S3 export - Scheduled Lambda exports memory records per namespace to a versioned S3 bucket (s3://bgagent-memory-backups/{date}/{namespace}.json).
  • Purge mechanism - Search by namespace and time range, delete via delete_memory_records. S3 exports provide pre-poisoning restore capability.
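
The export path and the purge selection can be sketched as pure functions, leaving out the actual S3 and AgentCore calls. The ISO date format for `{date}`, the record fields, and the function names are assumptions; only the bucket path pattern comes from the text.

```python
from datetime import date, datetime

BUCKET = "bgagent-memory-backups"  # from the path pattern in the text


def export_uri(day: date, namespace: str) -> str:
    """Build the export location; an ISO date for {date} is an assumption."""
    return f"s3://{BUCKET}/{day.isoformat()}/{namespace}.json"


def select_for_purge(records, namespace: str, start: datetime, end: datetime) -> list:
    """Return IDs of records in the namespace written within [start, end).

    Each record is a dict with hypothetical keys: id, namespace, created_at.
    """
    return [
        r["id"]
        for r in records
        if r["namespace"] == namespace and start <= r["created_at"] < end
    ]
```

The selected IDs would then be passed to delete_memory_records, and the most recent export from before `start` serves as the restore point.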

| Scenario | Procedure | RTO |
| --- | --- | --- |
| DynamoDB corruption | Restore from PITR to new table | Minutes to hours |
| Poisoned memory rule | Query namespace + content search, delete | Minutes |
| Bulk memory corruption | Restore from S3 export, re-import | Hours |

| Limitation | Risk | Mitigation |
| --- | --- | --- |
| Shared GitHub PAT | One token for all repos. No per-user repo scoping. | Planned: GitHub App + AgentCore Token Vault for per-task, repo-scoped tokens |
| Input-only Bedrock Guardrails | Model output during execution is not screened by Guardrails | PostToolUse hook screens tool outputs for secrets/PII via regex |
| No memory rollback | 365-day expiration is the only cleanup | S3 exports provide manual restore capability |
| No MFA | Cognito MFA disabled for CLI auth flow | Enable for production deployments |
| No customer-managed KMS | AWS-managed encryption keys | Add customer-managed KMS if required by compliance |
| CORS fully open | ALL_ORIGINS configured for CLI | Restrict origins for browser clients |
| DNS Firewall IP bypass | Direct IP connections bypass DNS filtering | Acceptable for confused-agent threat model. AWS Network Firewall for stronger enforcement. |
| No AgentCore Memory IAM isolation | All namespaces accessible if principal can access the agent’s memory | Pool model (application-layer scoping) for single-org; silo model (separate resources) for multi-tenant |