
Philosophy — AIDLC Meets AgenticOps

This document synthesizes the design premise of oh-my-aidlcops (OMA). It explains why OMA combines an AgenticOps layer with the existing AIDLC framework, why this combination is inevitable, and what this integration actually automates.

Problem Statement — The Incomplete AIDLC Interval

AWS's official awslabs/aidlc-workflows structures the AI-driven development lifecycle into three phases:

  1. Inception — Requirements analysis, user stories, workflow planning
  2. Construction — Component design, code generation, test strategy
  3. Operations — Deployment, monitoring, incident response, cost management

Inception and Construction lend themselves to automation because planning and implementation are tasks agents already handle well. Operations, however, requires observation, judgment, and action in live environments, so most AIDLC implementations have left this phase to human operators.

As a result, the lifecycle is structurally incomplete. Feedback from operations (errors, latency, cost overruns, compliance violations) loops back through documentation and issue trackers to Construction with week-long delays, and information is lost in transit.

OMA's Premise

AIDLC becomes complete only when operations is automated by agents. Humans approve; agents execute.

This premise contains two claims:

  1. Operations = automatable: Modern observability stacks (Langfuse, Prometheus, CloudWatch) combined with AWS Hosted MCP give agents a data plane rich enough that operational judgment can be delegated to them.
  2. Approval ≠ execution: Humans retain approval authority at Tier-0 checkpoints, while agents own the execution of diagnosis, proposal, deployment, rollback, and tuning.
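
The split between approval and execution can be illustrated with a minimal sketch. Everything here is hypothetical: the `Proposal` shape, the `tier0` flag, and the example action names are assumptions made for illustration, not OMA's actual schema or checkpoint list.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Proposal:
    """An action proposed by an agent (hypothetical shape, not OMA's schema)."""
    action: str
    tier0: bool  # assumption: each proposal is pre-labeled as Tier-0 or not


def run_with_checkpoint(
    proposal: Proposal,
    execute: Callable[[Proposal], str],
    approve: Callable[[Proposal], bool],
) -> str:
    """Consult the human approver only at Tier-0 checkpoints; the agent executes."""
    if proposal.tier0 and not approve(proposal):
        return f"rejected: {proposal.action}"
    return execute(proposal)


def agent_execute(proposal: Proposal) -> str:
    # Stand-in for the agent's real work (deploy, tune, roll back, ...).
    return f"executed: {proposal.action}"


def deny_all(proposal: Proposal) -> bool:
    # Stand-in for a human reviewer who rejects every request.
    return False


print(run_with_checkpoint(Proposal("tune-prompt", tier0=False), agent_execute, deny_all))
print(run_with_checkpoint(Proposal("raise-budget-limit", tier0=True), agent_execute, deny_all))
```

In this sketch the non-Tier-0 action runs even though the reviewer would have rejected it, because the reviewer is never asked; only the Tier-0 action is blocked.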

AgenticOps Layer

Through the agenticops plugin, OMA runs five skills continuously in the Operations phase:

| Skill | Role | Key Input | Key Output |
| --- | --- | --- | --- |
| self-improving-loop | Trace-based skill and prompt improvement | Langfuse traces, failure patterns | PR to aidlc-construction |
| autopilot-deploy | Autonomous deployment of validated artifacts | CI success artifacts, policy gates | GitOps commits, rollout events |
| incident-response | Alarm → diagnosis → proposal → action | PagerDuty, CloudWatch alarms | RCA draft, auto-mitigation actions |
| continuous-eval | Sustained quality assessment | Ragas metrics, regression datasets | Quality report, rollback signals |
| cost-governance | Cost anomaly detection and control | AWS Cost Explorer, budget policy | Scale recommendations, approval requests |

Feedback Loop Structure

The core of this loop is the automated Operations → Construction reverse flow. In traditional AIDLC implementations, this arrow depended on human issue classification and backlog management. In OMA, self-improving-loop analyzes trace patterns and generates concrete skill and prompt fix PRs.
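
As a rough illustration of the Operations → Construction reverse flow, the sketch below groups failed traces by skill and failure pattern and emits one PR title for each pair that recurs. The `FailedTrace` fields, the `min_occurrences` threshold, and the PR title format are all assumptions for illustration; the source does not specify how self-improving-loop performs this analysis.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class FailedTrace:
    skill: str    # skill that produced the failing run (hypothetical field)
    pattern: str  # failure pattern label assigned during analysis (hypothetical)


def propose_fixes(traces: list[FailedTrace], min_occurrences: int = 3) -> list[str]:
    """Return one PR title per (skill, pattern) pair that recurs often enough."""
    counts = Counter((t.skill, t.pattern) for t in traces)
    return [
        f"fix({skill}): address recurring '{pattern}' failures ({n} traces)"
        for (skill, pattern), n in sorted(counts.items())
        if n >= min_occurrences
    ]


traces = [
    FailedTrace("code-gen", "missing-import"),
    FailedTrace("code-gen", "missing-import"),
    FailedTrace("code-gen", "missing-import"),
    FailedTrace("code-gen", "timeout"),  # seen once, below the threshold
]
for title in propose_fixes(traces):
    print(title)
```

A threshold of this kind keeps one-off failures from generating PRs; what the right value is depends on trace volume and is left open here.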

Reference Design — Self-Improving Agent Loop

OMA's feedback loop concept is based on the Self-Improving Agent Loop ADR in the engineering-playbook project. That ADR records the following design decisions:

  • Trace collection cadence and sampling strategy
  • Failure pattern taxonomy (Prompt / Skill / Tool / Infra)
  • Scope constraints for auto-improvement PRs (non-destructive, regression tests required)
  • Separation of human review gates and auto-merge policy
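
The taxonomy and scope constraints listed above could be encoded along the following lines. This is a sketch under stated assumptions: the four `FailureKind` values come from the list above, but the `ImprovementPR` fields and the use of file deletion as a proxy for "destructive" are illustrative guesses, not the ADR's definitions.

```python
from dataclasses import dataclass
from enum import Enum


class FailureKind(Enum):
    PROMPT = "prompt"
    SKILL = "skill"
    TOOL = "tool"
    INFRA = "infra"


@dataclass(frozen=True)
class ImprovementPR:
    kind: FailureKind
    deletes_files: bool         # assumed proxy for a destructive change
    has_regression_tests: bool


def scope_violations(pr: ImprovementPR) -> list[str]:
    """List the scope constraints an auto-improvement PR fails to meet."""
    violations = []
    if pr.deletes_files:
        violations.append("destructive change")
    if not pr.has_regression_tests:
        violations.append("missing regression tests")
    return violations


pr = ImprovementPR(FailureKind.PROMPT, deletes_files=False, has_regression_tests=False)
print(scope_violations(pr))
```

The sketch only reports violations. Whether a clean PR is auto-merged or sent to human review is a separate policy decision, which the ADR treats separately.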

See the links below for detailed decision rationale and alternative comparisons.

  • Self-Improving Agent Loop (design) (community resource)
  • ADR: Self-Improving Loop (decision) (community resource)

AgenticOps and Traditional DevOps Relationship

AgenticOps does not replace DevOps, SRE, or MLOps. It shares their observability stack and deployment pipeline and differs only in who drives execution: agents rather than human-triggered pipelines.

| Aspect | Traditional DevOps/SRE | OMA AgenticOps |
| --- | --- | --- |
| Deployment trigger | Human merge → pipeline runs | Agent confirms policy gates, autonomous deploy |
| Incident response | PagerDuty → on-call engineer | Alarm → incident-response skill → human approval then action |
| Quality gates | CI tests pass | CI + Ragas + ongoing regression sampling |
| Cost control | Monthly review | Real-time anomaly detection and auto-scaling recommendations |
| Improvement loop | Retrospective meeting | Traces → auto-improvement PR |
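
To make the "CI + Ragas + ongoing regression sampling" quality gate concrete, here is a minimal sketch of a combined gate check. The `GateInputs` fields and both thresholds are assumptions chosen for illustration; the source does not say which Ragas metric or cut-off values OMA uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateInputs:
    ci_passed: bool
    ragas_faithfulness: float    # a Ragas score in the range 0.0 to 1.0
    regression_pass_rate: float  # share of sampled regression cases passing


def failed_gates(
    inputs: GateInputs,
    min_faithfulness: float = 0.8,           # assumed threshold
    min_regression_pass_rate: float = 0.95,  # assumed threshold
) -> list[str]:
    """Return the names of the gates that block an autonomous deploy."""
    failures = []
    if not inputs.ci_passed:
        failures.append("ci")
    if inputs.ragas_faithfulness < min_faithfulness:
        failures.append("ragas")
    if inputs.regression_pass_rate < min_regression_pass_rate:
        failures.append("regression")
    return failures


candidate = GateInputs(ci_passed=True, ragas_faithfulness=0.72, regression_pass_rate=0.97)
blocked_by = failed_gates(candidate)
print("deploy" if not blocked_by else f"hold: {blocked_by}")
```

Here CI passes but the candidate is held because the quality score falls below the assumed threshold, which is the case a CI-only gate would miss.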

Design Principles

OMA adheres to these principles in implementation choices (source: CLAUDE.md <operating_principles>):

  1. The AIDLC three-phase cycle is the basic unit of work: phase gates prevent phases from being skipped.
  2. Operations default to automation: manual intervention is the exception.
  3. Specialized work is delegated to the appropriate plugin: no single agent does everything.
  4. engineering-playbook is the single source of truth for knowledge: skills hold only summaries and links.
  5. AWS Hosted MCP is the default runtime data plane: no custom MCP servers are built until a clear gap is identified.

Expected Impact

Teams adopting OMA can expect the following quantitative changes (early targets):

| Metric | Legacy | Goal | Measurement |
| --- | --- | --- | --- |
| Issue → improvement deployment lead time | Week scale | Day scale | GitHub Issue open → PR merge |
| Mean incident response time | 30–60 minutes | Under 10 minutes | Alarm triggered → mitigation complete |
| Regression detection rate | CI tests only | CI + Ragas + regression samples | 24-hour post-deployment quality report |
| Manual ops work ratio | 40%+ | 10% or less | Manual effort outside checkpoints |

Numbers vary by environment. Continuous measurement is performed via the agenticops/continuous-eval skill.
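
Two of these metrics reduce to simple timestamp arithmetic. The sketch below shows one way to compute them, assuming the timestamps are already available as `datetime` pairs; the function names and sample values are illustrative and not part of the continuous-eval skill.

```python
from datetime import datetime, timedelta
from statistics import mean


def lead_time_days(issue_opened: datetime, pr_merged: datetime) -> float:
    """Issue open to PR merge, in days."""
    return (pr_merged - issue_opened) / timedelta(days=1)


def mean_response_minutes(incidents: list[tuple[datetime, datetime]]) -> float:
    """Mean of (mitigation complete - alarm triggered), in minutes."""
    return mean((done - fired).total_seconds() / 60 for fired, done in incidents)


# Made-up sample data: (alarm triggered, mitigation complete).
incidents = [
    (datetime(2024, 1, 5, 12, 0), datetime(2024, 1, 5, 12, 8)),
    (datetime(2024, 1, 6, 15, 30), datetime(2024, 1, 6, 15, 36)),
]
print(mean_response_minutes(incidents))  # 7.0
print(lead_time_days(datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 3, 9, 0)))  # 2.0
```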

Philosophical Foundation — AIDLC as an "Approval System"

A final premise is governance. As agent autonomy increases, the unit of governance shifts from the "execution unit" to the "approval point." OMA defines Tier-0 checkpoints as these approval points and delegates all work between checkpoints to agents. This means:

  • Audit logs are condensed to one record per checkpoint rather than one per execution stage.
  • Human focus shifts from "who executed what" to "under what policy was this approved."
  • Governance of non-deterministic agent execution requires explicit, version-controlled checkpoint policies.
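
One way to picture a checkpoint-level audit record tied to a version-controlled policy is sketched below. The `CheckpointPolicy` fields, the digest scheme, and the record layout are all hypothetical; they illustrate the idea of recording "under what policy was this approved" and do not describe OMA's actual audit format.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CheckpointPolicy:
    name: str
    version: str
    required_approvers: int


def policy_digest(policy: CheckpointPolicy) -> str:
    """Stable short hash of the policy content, so a record pins the exact policy."""
    payload = json.dumps(asdict(policy), sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


def audit_entry(policy: CheckpointPolicy, approver: str, decision: str) -> dict:
    """One audit record per checkpoint decision, not per execution step."""
    return {
        "checkpoint": policy.name,
        "policy_version": policy.version,
        "policy_digest": policy_digest(policy),
        "approver": approver,
        "decision": decision,
    }


policy = CheckpointPolicy(name="prod-deploy", version="1.2.0", required_approvers=1)
print(audit_entry(policy, approver="alice", decision="approved"))
```

Because the digest is derived from the policy content, any change to the policy yields a different digest, so an auditor can tell which policy text a given approval was made under.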

Reference Materials

Official Documentation

Reference ADR and Design Documents

  • Self-Improving Agent Loop Design (community resource) — Traces → improvement loop design
  • ADR: Self-Improving Loop (community resource) — Decision rationale
  • Agentic AI Platform Architecture (community resource) — Overall platform structure

OMA Internal Documentation