Repo onboarding

Before users can submit tasks for a repository, that repository must be onboarded to the platform. Onboarding registers the repo and produces a per-repo configuration that the orchestrator uses at task time: compute strategy, model, credentials, networking, and pipeline customizations. If a user submits a task for a non-onboarded repo, the API returns 422 REPO_NOT_ONBOARDED.

  • Use this doc for: the Blueprint construct interface, RepoConfig schema, override precedence, compute strategy interface, and pipeline customization model.
  • For practical usage: see Quick Start for onboarding your first repo and User Guide for per-repo overrides.
  • Related docs: ORCHESTRATOR.md for how the orchestrator consumes blueprint config, COMPUTE.md for compute backends, SECURITY.md for custom step trust boundaries.

Repositories vary in ways that affect how the agent works: different languages, build systems, toolchains, conventions, and security requirements. A Node.js monorepo needs different tooling than a Python microservice. The onboarding pipeline addresses this by producing a specific configuration per repo, covering:

  • Compute - Runtime image, compute backend, resource profile
  • Agent - Model, turn limits, cost budget, system prompt overrides
  • Security - Credentials, tool access tier, egress rules
  • Pipeline - Custom steps, step ordering, poll interval

The canonical onboarding path is CDK-based. Each repo is an instance of the Blueprint construct in the CDK stack, and the construct writes a RepoConfig record to DynamoDB. Deploying the stack therefore onboards or updates repos. There is no task-submitter REST API for repo CRUD; the orchestrator gate reads RepoTable at runtime.

This treats blueprints as infrastructure, not runtime config. Each repo’s blueprint defines AWS resources (compute, networking, credentials). CDK manages the lifecycle. The gate (rejecting tasks for non-onboarded repos) reads DynamoDB at runtime, keeping the runtime path simple.
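The gate decision can be sketched as a pure function over the result of the RepoTable lookup. This is a minimal illustration, not the platform's code: `RepoRecord`, `GateResult`, and `checkOnboarded` are hypothetical names, and treating a soft-deleted row the same as a missing row is an assumption. Only the `422 REPO_NOT_ONBOARDED` response and the `status` values come from this document.

```typescript
// Hypothetical gate check; the names here are illustrative only.
type RepoStatus = 'active' | 'removed';

interface RepoRecord {
  repo: string; // "owner/repo"
  status: RepoStatus;
}

type GateResult =
  | { ok: true }
  | { ok: false; httpStatus: 422; code: 'REPO_NOT_ONBOARDED' };

// `record` is whatever the RepoTable lookup returned (undefined when no row exists).
function checkOnboarded(record: RepoRecord | undefined): GateResult {
  // Assumption: a soft-deleted row (status 'removed') is rejected like a missing row.
  if (record === undefined || record.status !== 'active') {
    return { ok: false, httpStatus: 422, code: 'REPO_NOT_ONBOARDED' };
  }
  return { ok: true };
}
```

Keeping the check free of I/O means the same function could back both the API gate and the admission-control step, if the two are meant to agree.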

For operators with IAM access to the deployed stack, bgagent repo onboard and bgagent repo offboard write RepoTable directly (no Cognito login, no CDK redeploy). This path ships in #378 / PR #385 ahead of a future POST /v1/repos API.

| Aspect | CDK Blueprint | bgagent repo onboard / offboard |
| --- | --- | --- |
| Who | Deploy role at mise //cdk:deploy | Operator AWS credentials (operator-context) |
| What it writes | Full RepoConfig + supporting AWS resources | RepoTable row only (same schema) |
| Soft-delete | status=removed + 30-day TTL on stack removal | Same semantics on offboard |
| Custom runtime / token IAM | additionalRuntimeArns / additionalSecretArns in CDK | Stored in the row, but orchestrator IAM still requires a CDK deploy |
| Cedar, egress, pipeline steps | Supported via construct props | Not exposed; use CDK |
| Audit trail | CloudFormation change set + deploy logs | CLI stdout only today (see ADR-017) |

Use the CLI path for quick day-2 registration with platform defaults (runtime ARN, GitHub token secret). Use CDK when the repo needs durable infrastructure, custom IAM, Cedar policies, egress rules, or pipeline customization. The onboard command prints notes explaining which platform defaults apply and when a redeploy is still required.

See also: Using the CLI — operator commands and ADR-017.

interface BlueprintProps {
  repo: string; // "owner/repo"
  repoTable: dynamodb.ITable;
  compute?: {
    type?: 'agentcore' | 'ecs'; // default: 'agentcore'
    runtimeArn?: string;
    config?: Record<string, unknown>;
  };
  agent?: {
    modelId?: string;
    maxTurns?: number;
    maxBudgetUsd?: number; // $0.01-$100
    memoryTokenBudget?: number; // default: 2000
    systemPromptOverrides?: string;
  };
  security?: {
    capabilityTier?: 'standard' | 'elevated' | 'read-only';
    cedarPolicies?: string[]; // custom Cedar policies
    circuitBreaker?: {
      maxCallsPerMinute?: number; // default: 50
      maxCostUsd?: number; // default: 10
      maxConsecutiveFailures?: number; // default: 5
    };
  };
  credentials?: {
    githubTokenSecretArn?: string;
  };
  networking?: {
    egressAllowlist?: string[];
  };
  pipeline?: {
    pollIntervalMs?: number;
    customSteps?: CustomStepConfig[];
    stepSequence?: StepRef[];
  };
}

At deploy time, the construct creates a CDK custom resource that writes (PutItem) the RepoConfig record with status: 'active'. When removed from the stack, it soft-deletes (status: 'removed'). Redeploying with updated props overwrites the record.

The DynamoDB record read at runtime:

interface RepoConfig {
  repo: string; // PK
  status: 'active' | 'removed';
  onboarded_at: string; // ISO 8601
  updated_at: string;
  compute_type?: string;
  runtime_arn?: string;
  model_id?: string;
  max_turns?: number;
  max_budget_usd?: number;
  memory_token_budget?: number;
  system_prompt_overrides?: string;
  github_token_secret_arn?: string;
  egress_allowlist?: string[];
  poll_interval_ms?: number;
  custom_steps?: CustomStepConfig[];
  step_sequence?: StepRef[];
}
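The relationship between the nested camelCase props and the flat snake_case record can be sketched as a small mapping function. This is an illustration under stated assumptions: `BlueprintPropsSketch`, `RepoConfigSketch`, and `toRepoConfig` are hypothetical, the shapes are trimmed to a handful of fields, and setting `onboarded_at` and `updated_at` to the same timestamp models a first-time onboarding only.

```typescript
// Trimmed, illustrative versions of the props and record shapes shown above.
interface BlueprintPropsSketch {
  repo: string;
  compute?: { type?: 'agentcore' | 'ecs'; runtimeArn?: string };
  agent?: { modelId?: string; maxTurns?: number; maxBudgetUsd?: number };
  credentials?: { githubTokenSecretArn?: string };
  pipeline?: { pollIntervalMs?: number };
}

interface RepoConfigSketch {
  repo: string;
  status: 'active' | 'removed';
  onboarded_at: string;
  updated_at: string;
  compute_type?: string;
  runtime_arn?: string;
  model_id?: string;
  max_turns?: number;
  max_budget_usd?: number;
  github_token_secret_arn?: string;
  poll_interval_ms?: number;
}

// Hypothetical helper: flattens nested camelCase props into the snake_case record.
function toRepoConfig(props: BlueprintPropsSketch, now: Date): RepoConfigSketch {
  const timestamp = now.toISOString();
  const record: RepoConfigSketch = {
    repo: props.repo,
    status: 'active',
    onboarded_at: timestamp,
    updated_at: timestamp,
    compute_type: props.compute?.type,
    runtime_arn: props.compute?.runtimeArn,
    model_id: props.agent?.modelId,
    max_turns: props.agent?.maxTurns,
    max_budget_usd: props.agent?.maxBudgetUsd,
    github_token_secret_arn: props.credentials?.githubTokenSecretArn,
    poll_interval_ms: props.pipeline?.pollIntervalMs,
  };
  // JSON round-trip drops undefined fields, so the item only carries explicit overrides.
  return JSON.parse(JSON.stringify(record)) as RepoConfigSketch;
}
```

Leaving unset fields out of the record, rather than storing defaults, keeps the platform defaults resolvable at task time.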

Configuration values are resolved in three layers, from lowest to highest priority:

  1. Platform defaults (CDK stack props)
  2. Per-repo config (RepoConfig from Blueprint)
  3. Per-task overrides (API request fields, e.g. max_turns)

| Field | Default | Source |
| --- | --- | --- |
| compute_type | agentcore | Platform constant |
| runtime_arn | Stack-level env var | CDK stack props |
| model_id | Claude Sonnet 4 | CDK stack props |
| max_turns | 100 | Platform constant |
| max_budget_usd | None (unlimited) | - |
| memory_token_budget | 2000 | Platform constant |
| github_token_secret_arn | Stack-level secret | CDK stack props |
| poll_interval_ms | 30000 | Orchestrator constant |
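The three-layer precedence can be sketched as an object merge in which later layers win only for fields they actually set. The default values are copied from the table above; `AgentSettings`, `definedOnly`, and `resolveSettings` are hypothetical names, and the sketch covers only a subset of fields.

```typescript
// Sketch of the three-layer precedence; later layers win when they set a field.
interface AgentSettings {
  compute_type: string;
  max_turns: number;
  memory_token_budget: number;
  poll_interval_ms: number;
  max_budget_usd?: number; // unset means no budget cap
}

// Defaults copied from the table above.
const PLATFORM_DEFAULTS: AgentSettings = {
  compute_type: 'agentcore',
  max_turns: 100,
  memory_token_budget: 2000,
  poll_interval_ms: 30000,
};

// Drop undefined values so an unset field never masks a lower-priority value.
function definedOnly(layer: Partial<AgentSettings>): Partial<AgentSettings> {
  const source = layer as Record<string, unknown>;
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(source)) {
    if (source[key] !== undefined) out[key] = source[key];
  }
  return out as Partial<AgentSettings>;
}

function resolveSettings(
  repoConfig: Partial<AgentSettings>,
  taskOverrides: Partial<AgentSettings>,
): AgentSettings {
  return {
    ...PLATFORM_DEFAULTS,
    ...definedOnly(repoConfig),
    ...definedOnly(taskOverrides),
  };
}
```

For example, a repo that sets `max_turns: 60` and a task that requests `max_turns: 20` resolve to 20, while `poll_interval_ms` falls through to the platform value.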

The orchestrator reads RepoConfig at task time. Each pipeline step consumes specific fields:

| Step | Fields consumed |
| --- | --- |
| load-blueprint | compute_type, custom_steps, step_sequence |
| admission-control | status (defense-in-depth) |
| hydrate-context | github_token_secret_arn, system_prompt_overrides |
| pre-flight | github_token_secret_arn |
| start-session | compute_type, runtime_arn, model_id, max_turns, max_budget_usd |
| await-agent-completion | poll_interval_ms |
| Custom steps | custom_steps[].config |
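One way to express the per-step field lists is a lookup table plus a helper that projects a config down to the fields a step consumes. The field lists are copied from the table above; the `STEP_FIELDS` map and `fieldsForStep` helper are hypothetical and only illustrate the idea of handing each built-in step a narrowed view.

```typescript
// Field lists copied from the table above; the filtering helper itself is hypothetical.
const STEP_FIELDS: Record<string, string[]> = {
  'load-blueprint': ['compute_type', 'custom_steps', 'step_sequence'],
  'admission-control': ['status'],
  'hydrate-context': ['github_token_secret_arn', 'system_prompt_overrides'],
  'pre-flight': ['github_token_secret_arn'],
  'start-session': ['compute_type', 'runtime_arn', 'model_id', 'max_turns', 'max_budget_usd'],
  'await-agent-completion': ['poll_interval_ms'],
};

// Returns only the fields the named built-in step consumes; unknown steps get nothing.
function fieldsForStep(
  step: string,
  config: Record<string, unknown>,
): Record<string, unknown> {
  const known = Object.prototype.hasOwnProperty.call(STEP_FIELDS, step);
  const fields = known ? STEP_FIELDS[step] : [];
  const view: Record<string, unknown> = {};
  for (const field of fields) {
    if (config[field] !== undefined) view[field] = config[field];
  }
  return view;
}
```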

Blueprints customize the orchestrator pipeline through three progressively powerful layers. See ORCHESTRATOR.md for how the framework enforces invariants regardless of customization.

Implementation status: Only Layer 1 is shipped today. The Blueprint construct’s pipeline prop currently exposes a single override, pollIntervalMs (cdk/src/constructs/blueprint.ts); there is no customSteps/stepSequence support, no CustomStepConfig/StepRef wiring, and no INVALID_STEP_SEQUENCE validation in code. Layer 2 (Lambda-backed custom steps) and Layer 3 (custom step sequences) below describe a planned design — tracked as GitHub issues. The interfaces and validation rules in those subsections are forward-looking, not current behavior.

Layer 1 selects and configures built-in step implementations without writing code: set compute.type, agent.modelId, agent.maxTurns, and other Blueprint props. The only pipeline-stage override available today is pipeline.pollIntervalMs.

Layer 2: Lambda-backed custom steps (planned)

Inject custom logic at pre-agent or post-agent phases:

interface CustomStepConfig {
  name: string; // unique step ID
  functionArn: string; // Lambda ARN
  phase: 'pre-agent' | 'post-agent';
  timeoutSeconds?: number; // default: 120
  maxRetries?: number; // default: 2
  config?: Record<string, unknown>;
}

Layer 3 (custom step sequences, planned) overrides the default step order entirely:

interface StepRef {
  type: 'builtin' | 'custom';
  name: string;
}

When a stepSequence is provided, the framework will validate it at CDK synth time and at runtime, raising INVALID_STEP_SEQUENCE on misconfiguration. (Planned — see the status note above; not enforced in code today.)

Required steps:

| Step | Why |
| --- | --- |
| admission-control | Concurrency slot management. Must be first. |
| pre-flight | Fail-closed readiness checks. Must precede start-session. |
| start-session | Starts compute. Must precede await-agent-completion. |
| await-agent-completion | Detects when agent finishes. |
| finalize | Releases concurrency, emits events. Must be last. |

hydrate-context is not strictly required but omitting it emits a warning. Custom steps can be inserted between any adjacent built-in steps, but not before admission-control or after finalize.
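Because step-sequence validation is planned rather than shipped, the following is only a sketch of how the ordering rules above could be checked. `SequenceCheck` and `validateStepSequence` are hypothetical names and the error strings are invented for illustration; any entry in `errors` stands in for what the design calls INVALID_STEP_SEQUENCE. Duplicate steps are not handled.

```typescript
interface StepRef {
  type: 'builtin' | 'custom';
  name: string;
}

interface SequenceCheck {
  errors: string[]; // any entry would map to INVALID_STEP_SEQUENCE
  warnings: string[];
}

const REQUIRED_STEPS = [
  'admission-control',
  'pre-flight',
  'start-session',
  'await-agent-completion',
  'finalize',
];

function validateStepSequence(sequence: StepRef[]): SequenceCheck {
  const errors: string[] = [];
  const warnings: string[] = [];
  // Index of a built-in step in the sequence, or -1 when it is absent.
  const position = (name: string): number =>
    sequence.findIndex((step) => step.type === 'builtin' && step.name === name);

  if (position('hydrate-context') === -1) {
    warnings.push('hydrate-context omitted');
  }
  for (const name of REQUIRED_STEPS) {
    if (position(name) === -1) errors.push(`missing required step: ${name}`);
  }
  // Ordering checks only make sense once every required step is present.
  if (errors.length > 0) return { errors, warnings };

  // First/last checks also rule out custom steps before admission-control or after finalize.
  if (position('admission-control') !== 0) errors.push('admission-control must be first');
  if (position('finalize') !== sequence.length - 1) errors.push('finalize must be last');
  if (position('pre-flight') > position('start-session')) {
    errors.push('pre-flight must precede start-session');
  }
  if (position('start-session') > position('await-agent-completion')) {
    errors.push('start-session must precede await-agent-completion');
  }
  return { errors, warnings };
}
```

Running the same function at synth time and at runtime would keep the two validation points consistent, which is the behavior the planned design describes.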

Every step receives a StepInput and returns a StepOutput:

interface StepInput {
  taskId: string;
  repo: string;
  blueprintConfig: FilteredRepoConfig; // filtered per step
  previousStepResults: Record<string, StepOutput>; // last 5 steps
}

interface StepOutput {
  status: 'success' | 'failed' | 'skipped';
  metadata?: Record<string, unknown>; // max 10KB
  error?: string;
}

Config filtering: Custom Lambda steps receive a sanitized config with credential ARNs stripped. Steps that need secrets must declare them in config and the operator must grant IAM permissions.
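A minimal sketch of the credential stripping, assuming the only credential-bearing field is `github_token_secret_arn` (the one ARN-valued secret field in the RepoConfig schema shown earlier). The `CREDENTIAL_FIELDS` list and `sanitizeForCustomStep` helper are hypothetical.

```typescript
// Hypothetical filter: drops credential ARN fields before a custom step sees the config.
const CREDENTIAL_FIELDS = ['github_token_secret_arn'];

function sanitizeForCustomStep(
  config: Record<string, unknown>,
): Record<string, unknown> {
  const filtered: Record<string, unknown> = {};
  for (const key of Object.keys(config)) {
    if (CREDENTIAL_FIELDS.indexOf(key) === -1) filtered[key] = config[key];
  }
  return filtered;
}
```

The function returns a copy, so the orchestrator's own view of the config keeps the ARN for the built-in steps that need it.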

Retry policy: Infrastructure failures (timeout, throttle, 5xx) retry with exponential backoff (default: 2 retries, base 1s, max 10s). Explicit failures (status: 'failed') do not retry.
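The retry decision can be sketched with the stated defaults (2 retries, base 1s, max 10s). A doubling factor and the absence of jitter are assumptions, since the text only says "exponential"; `FailureKind`, `backoffDelayMs`, and `nextDelay` are hypothetical names.

```typescript
type FailureKind = 'timeout' | 'throttle' | 'server-error' | 'explicit-failure';

const MAX_RETRIES = 2;
const BASE_DELAY_MS = 1000;
const MAX_DELAY_MS = 10000;

// Only infrastructure failures are retried; a step that returns status 'failed' is final.
function isRetryable(kind: FailureKind): boolean {
  return kind !== 'explicit-failure';
}

// Delay before retry number `retry` (1-based): base * 2^(retry - 1), capped at the max.
// The factor of 2 is an assumption.
function backoffDelayMs(retry: number): number {
  return Math.min(BASE_DELAY_MS * Math.pow(2, retry - 1), MAX_DELAY_MS);
}

// Returns the delay before the next attempt, or null when the step should give up.
function nextDelay(kind: FailureKind, retriesSoFar: number): number | null {
  if (!isRetryable(kind) || retriesSoFar >= MAX_RETRIES) return null;
  return backoffDelayMs(retriesSoFar + 1);
}
```

With these defaults the 10s cap is never reached; it only matters if a step raises maxRetries.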

Checkpoint budget: metadata capped at 10KB per step. previousStepResults pruned to last 5 steps to stay within the 256KB durable execution checkpoint limit.
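The two limits can be sketched as a pruning helper. Interpreting 10KB as 10 × 1024 bytes of UTF-8 JSON, and rejecting oversized metadata rather than truncating it, are both assumptions; `metadataBytes` and `pruneResults` are hypothetical names.

```typescript
interface StepOutput {
  status: 'success' | 'failed' | 'skipped';
  metadata?: Record<string, unknown>;
  error?: string;
}

const MAX_METADATA_BYTES = 10 * 1024; // assumed reading of "10KB"
const MAX_PREVIOUS_STEPS = 5;

// Size of the metadata once serialized as UTF-8 JSON.
function metadataBytes(output: StepOutput): number {
  if (output.metadata === undefined) return 0;
  return new TextEncoder().encode(JSON.stringify(output.metadata)).length;
}

// `results` holds [stepName, output] pairs in execution order, oldest first.
function pruneResults(
  results: Array<[string, StepOutput]>,
): Record<string, StepOutput> {
  const pruned: Record<string, StepOutput> = {};
  for (const [name, output] of results.slice(-MAX_PREVIOUS_STEPS)) {
    if (metadataBytes(output) > MAX_METADATA_BYTES) {
      throw new Error(`metadata for step ${name} exceeds ${MAX_METADATA_BYTES} bytes`);
    }
    pruned[name] = output;
  }
  return pruned;
}
```

Five steps at 10KB each stay far below the 256KB checkpoint limit, leaving room for the rest of the execution state.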

The compute strategy abstracts how sessions are started and monitored, allowing the orchestrator to work with different backends without code changes:

interface ComputeStrategy {
  readonly type: string;
  startSession(input: {
    taskId: string;
    sessionId: string;
    payload: HydratedPayload;
    config: Record<string, unknown>;
  }): Promise<SessionHandle>;
  pollSession(handle: SessionHandle): Promise<SessionStatus>;
  stopSession(handle: SessionHandle): Promise<void>;
}

The agentcore strategy implements startSession via invoke_agent_runtime, pollSession via re-invocation with sticky routing, and stopSession via stop_runtime_session. Alternative strategies (e.g. ecs) implement the same interface. The backend is selected per repo via compute_type in the Blueprint.
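Per-repo backend selection can be sketched as a registry keyed by compute_type. The strategies below are in-memory stubs that make no backend calls, and the `SessionHandle`, `SessionStatus`, and `HydratedPayload` shapes are hypothetical because this section does not define them. Falling back to agentcore when compute_type is unset follows the defaults table.

```typescript
// Hypothetical shapes; this section does not define them.
interface SessionHandle { sessionId: string }
type SessionStatus = 'running' | 'completed' | 'failed';
type HydratedPayload = Record<string, unknown>;

interface StartInput {
  taskId: string;
  sessionId: string;
  payload: HydratedPayload;
  config: Record<string, unknown>;
}

interface ComputeStrategy {
  readonly type: string;
  startSession(input: StartInput): Promise<SessionHandle>;
  pollSession(handle: SessionHandle): Promise<SessionStatus>;
  stopSession(handle: SessionHandle): Promise<void>;
}

// In-memory stand-in; a real strategy would call its compute backend here.
function stubStrategy(type: string): ComputeStrategy {
  const stopped = new Set<string>();
  return {
    type,
    async startSession(input: StartInput): Promise<SessionHandle> {
      return { sessionId: input.sessionId };
    },
    async pollSession(handle: SessionHandle): Promise<SessionStatus> {
      return stopped.has(handle.sessionId) ? 'completed' : 'running';
    },
    async stopSession(handle: SessionHandle): Promise<void> {
      stopped.add(handle.sessionId);
    },
  };
}

const STRATEGIES = new Map<string, ComputeStrategy>([
  ['agentcore', stubStrategy('agentcore')],
  ['ecs', stubStrategy('ecs')],
]);

// Picks the backend for a repo; falls back to the default when compute_type is unset.
function selectStrategy(computeType?: string): ComputeStrategy {
  const strategy = STRATEGIES.get(computeType ?? 'agentcore');
  if (strategy === undefined) {
    throw new Error(`unknown compute_type: ${computeType}`);
  }
  return strategy;
}
```

Because callers only see the interface, adding a backend in this sketch means registering one more entry; the orchestration code that calls startSession, pollSession, and stopSession does not change.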

Configurations can become stale as repos evolve. The platform supports re-onboarding through multiple triggers:

| Trigger | Mechanism | When to use |
| --- | --- | --- |
| Manual | Update Blueprint props + cdk deploy | Known major changes (migration, restructure) |
| On major change | GitHub webhook detects significant changes in default branch | Automated, event-driven |
| Periodic | EventBridge scheduled re-analysis | Safety net for gradual drift |

What gets re-onboarded: Container image (rebuilt with updated deps), system prompt and rules (re-discovered from repo files), tool profile, and blueprint config (turn limits, model selection).

What is preserved: Long-term memory (repo knowledge, episodes, review rules) persists across re-onboarding. The memory consolidation strategy handles contradictions. Webhook integrations are also preserved.

The onboarding pipeline can produce two kinds of customization artifacts that help the agent work with a specific repo:

Static artifacts are committed to the repo by the team: CLAUDE.md, .claude/rules/, README, CI config. The pipeline discovers and references these.

Dynamic artifacts are generated by the pipeline when repo hygiene is weak: codebase summaries, dependency graphs, suggested rules from the repo layout. These compensate for missing documentation and are attached to the repo’s agent configuration.

For prompt writing guidelines, see the Prompt Guide.