Enhanced SWE IDEs - Code Architecture Review
Use Nova 2 reasoning models to perform system-wide architecture analysis, covering performance implications, security considerations, and dependency impact assessment.
System Prompt Template
You are an expert software architect who specializes in {technology stack} and system-wide impact analysis.
Evaluate architectural decisions considering performance, security, maintainability, and long-term implications across complex systems.
User Prompt Template
## System Architecture ##
{system description and codebase}
## Architecture Challenge ##
{complex architectural decision or migration scenario}
Ask clarifying questions if needed.
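The two templates above can be filled programmatically. The following is a minimal sketch, not part of the original template: the function name `build_prompts` and the placeholder names `technology_stack`, `system_description`, and `challenge` are assumptions chosen for illustration.

```python
# Hypothetical helper: the constant and parameter names are illustrative assumptions.
SYSTEM_TEMPLATE = (
    "You are an expert software architect who specializes in "
    "{technology_stack} and system-wide impact analysis.\n"
    "Evaluate architectural decisions considering performance, security, "
    "maintainability, and long-term implications across complex systems."
)

USER_TEMPLATE = (
    "## System Architecture ##\n"
    "{system_description}\n"
    "## Architecture Challenge ##\n"
    "{challenge}\n"
    "Ask clarifying questions if needed."
)


def build_prompts(technology_stack, system_description, challenge):
    """Return (system_prompt, user_prompt) with the placeholders filled in."""
    system_prompt = SYSTEM_TEMPLATE.format(technology_stack=technology_stack)
    user_prompt = USER_TEMPLATE.format(
        system_description=system_description.strip(),
        challenge=challenge.strip(),
    )
    return system_prompt, user_prompt


if __name__ == "__main__":
    system_prompt, user_prompt = build_prompts(
        "large-scale distributed systems",
        "Monolithic e-commerce platform, 500K LOC",
        "Should we migrate to microservices?",
    )
    print(system_prompt)
    print(user_prompt)
```

Keeping the section markers (`## System Architecture ##`, `## Architecture Challenge ##`) in the template rather than in the caller's input means every request reaches the model with the same structure.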
Example
Amazon Nova 2 Lite System Prompt
You are an expert software architect who specializes in large-scale distributed systems and microservices architecture.
Evaluate architectural decisions considering performance implications, security vulnerabilities, maintainability challenges, scalability constraints, and long-term technical debt across complex enterprise systems with millions of users.
**Important formatting requirements:**
- Use proper markdown tables with | separators and header rows
- Format all tables as: | Column 1 | Column 2 | with |---|---| separator
- Use bullet points and numbered lists for clarity
- Include mermaid diagrams when helpful for visualization
Amazon Nova 2 Lite User Prompt
## System Architecture ##
**Current Monolithic E-commerce Platform (500K LOC)**
```python
# Core monolith structure
ecommerce_platform/
├── user_management/ # 50K LOC - User auth, profiles, preferences
│ ├── authentication.py # OAuth2, JWT, session management
│ ├── user_profiles.py # Profile CRUD, preferences, settings
│ └── permissions.py # RBAC, admin controls
├── product_catalog/ # 80K LOC - Product data, search, recommendations
│ ├── product_models.py # Complex product hierarchy, variants
│ ├── search_engine.py # Elasticsearch integration, faceted search
│ ├── recommendations.py # ML-based product recommendations
│ └── inventory.py # Real-time inventory tracking
├── order_processing/ # 120K LOC - Cart, checkout, fulfillment
│ ├── shopping_cart.py # Session-based cart, persistence
│ ├── checkout.py # Multi-step checkout, validation
│ ├── payment_processing.py # Multiple payment gateways, PCI compliance
│ ├── order_fulfillment.py # Warehouse integration, shipping
│ └── returns.py # Return processing, refunds
├── customer_service/ # 60K LOC - Support, chat, tickets
│ ├── support_tickets.py # Ticket management, SLA tracking
│ ├── live_chat.py # Real-time chat, agent routing
│ └── knowledge_base.py # FAQ, documentation system
├── analytics/ # 90K LOC - Reporting, BI, ML pipelines
│ ├── event_tracking.py # User behavior tracking, analytics
│ ├── reporting.py # Business intelligence, dashboards
│ └── ml_pipelines.py # Recommendation training, A/B testing
└── shared/ # 100K LOC - Common utilities, integrations
├── database.py # ORM, connection pooling, migrations
├── caching.py # Redis, memcached, cache strategies
├── messaging.py # RabbitMQ, event publishing
└── external_apis.py # Payment, shipping, tax integrations
```
**Current Performance Metrics**
- Response time: 200ms (p50), 800ms (p95), 2.5s (p99)
- Throughput: 5,000 requests/minute peak
- Database: PostgreSQL, 2TB data, 500 concurrent connections
- Infrastructure: 12 application servers, 3 database replicas
- Deployment: 45-minute deployments, 2-week release cycles
- Incidents: 15 production issues/month, 99.2% uptime
**Scaling Challenges**
- Database bottlenecks during flash sales (10x traffic spikes)
- Memory usage: 8GB per app server, frequent GC pauses
- Code coupling: Changes in one module often break others
- Testing: 6-hour test suite, flaky integration tests
- Team coordination: 25 developers, merge conflicts, blocking dependencies
**Business Requirements**
- Scale to 50,000 requests/minute (10x current)
- Support international expansion (5 new countries)
- Reduce deployment time to <10 minutes
- Improve developer velocity (faster feature delivery)
- Achieve 99.9% uptime SLA
- Support real-time features (live inventory, chat)
**Technology Constraints**
- Must maintain PCI DSS compliance
- Legacy integrations with 15 external systems
- Existing PostgreSQL data (complex migration)
- Team expertise: Strong Python, limited microservices experience
- Budget: $2M for infrastructure changes over 18 months
## Architecture Challenge ##
Evaluate whether TechFlow should migrate from their monolithic architecture to microservices, and if so, design the optimal migration strategy.
Provide architectural analysis covering:
1. **System-Wide Impact Assessment**: Analyze how microservices migration would affect each component. Consider data consistency requirements, transaction boundaries, and inter-service communication patterns. Identify which modules are good candidates for extraction and which should remain coupled.
2. **Performance Implications Analysis**: Model the performance impact of network latency, service-to-service calls, and distributed transactions. Compare current monolith performance with projected microservices performance under various load scenarios. Consider caching strategies, data locality, and query optimization.
3. **Security Architecture Review**: Evaluate security implications of distributed architecture. Analyze authentication/authorization across services, network security, data encryption in transit/at rest, and compliance requirements (PCI DSS). Identify new attack vectors and mitigation strategies.
4. **Dependency and Data Flow Analysis**: Map current tight coupling and shared database dependencies. Design service boundaries that minimize cross-service transactions. Analyze data consistency requirements and propose eventual consistency vs strong consistency trade-offs.
5. **Migration Strategy & Risk Assessment**: Design a phased migration approach that minimizes business disruption. Identify the optimal order for service extraction, considering dependencies and risk levels. Propose strangler fig pattern implementation with specific timelines and rollback strategies.
6. **Operational Complexity Evaluation**: Assess the operational overhead of managing 15-20 microservices vs 1 monolith. Consider monitoring, logging, debugging, deployment pipelines, and team structure changes. Evaluate DevOps tooling requirements and learning curve.
7. **Alternative Architecture Recommendations**: If microservices aren't optimal, propose alternative approaches (modular monolith, service-oriented architecture, domain-driven design within monolith). Compare trade-offs and provide decision framework.
Consider the team's current expertise, budget constraints, and business timeline. Provide specific recommendations for technology choices, infrastructure requirements, and organizational changes needed for success.
Ask clarifying questions if needed.
Amazon Nova 2 Lite Sample Response
Architectural Evaluation for TechFlow Monolith-to-Microservices Migration
1. System-Wide Impact Assessment
Service Candidate Analysis
| Component | Extraction Candidate? | Rationale | Data Consistency Needs | Transaction Boundaries |
|---|---|---|---|---|
| User Management | Yes | Independent domain; low cross-component dependencies. | Strong (user auth) | Local to service; eventual for preferences |
| Product Catalog | Yes | Statable; read-heavy; can use eventual consistency for recommendations. | Eventual (recommendations) | Strong for inventory; eventual for search |
| Order Processing | Partial | High complexity; payment processing must remain tightly coupled with checkout. | Strong (payments) | Split into Checkout Service (strong) and Fulfillment Service (eventual) |
| Customer Service | Yes | Isolated I/O-bound workflows; async communication with tickets. | Eventual | Per-ticket boundaries |
| Analytics | Yes | Batch-processing; can run independently with event streams. | Eventual | Event-driven pipelines |
| Shared Utilities | No | Core infrastructure; extraction would increase latency. Keep as shared library. | N/A | N/A |
Key Insights
- Good Candidates: User Management, Product Catalog, Customer Service, Analytics (independent domains, clear boundaries).
- Remains Coupled:
- Order Processing (payment validation requires strong ACID transactions).
- Shared Utilities (database, caching, messaging) – refactor into platform team libraries.
- Data Flow Impact:
- Strong Consistency Required: User auth, payment processing.
- Eventual Consistency Acceptable: Recommendations, inventory updates, analytics.
2. Performance Implications Analysis
Current vs. Projected Performance
| Metric | Monolith (Current) | Microservices (Projected) | Change | Risk Mitigation |
|---|---|---|---|---|
| P50 Response Time | 200 ms | 250–300 ms | +25–50% | Edge caching, CDN, async APIs |
| P95 Response Time | 800 ms | 900–1.2s | +12.5–50% | Service mesh (Istio), circuit breakers |
| P99 Response Time | 2.5 s | 2–3.5s | -20% to +40% | Bulkheading, retries with backoff |
| Throughput | 5K req/min | 50K req/min | 10x | Kubernetes HPA, auto-scaling groups |
| Database Bottlenecks | High (flash sales) | Reduced via sharding | Resolved | Database per service + read replicas |
| Memory Usage | 8GB/server | 2–4GB/service | Reduced | Container resource limits |
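The Change column for the latency rows is the relative difference between the projected range and the current value. A minimal sketch of that arithmetic follows, using the latency figures from the table converted to milliseconds; the helper names `percent_change` and `change_range` are illustrative assumptions.

```python
def percent_change(baseline, projected):
    """Relative change from baseline to projected, in percent."""
    return (projected - baseline) / baseline * 100.0


def change_range(baseline, low, high):
    """Percent change at both ends of a projected range."""
    return percent_change(baseline, low), percent_change(baseline, high)


# (current ms, projected low ms, projected high ms), taken from the table rows
LATENCY_ROWS = {
    "P50": (200, 250, 300),
    "P95": (800, 900, 1200),
    "P99": (2500, 2000, 3500),
}

if __name__ == "__main__":
    for name, (base, low, high) in LATENCY_ROWS.items():
        lo, hi = change_range(base, low, high)
        print(f"{name}: {lo:+.1f}% to {hi:+.1f}%")
```

Recomputing the column this way is a quick consistency check on model-generated tables before they are reused in a design document.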
Performance Optimization Strategies
- Network Latency:
- Use gRPC for internal services (lower overhead vs REST).
- Service Mesh (e.g., Istio) for observability and retries.
- Data Locality:
- Embedded caches (Redis) per service for hot data (e.g., inventory).
- CQRS for product catalog (separates reads/writes).
- Query Optimization:
- Materialized views for analytics; event sourcing for order state.
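The "retries with backoff" mitigation listed in the table can be sketched in a few lines. This is an illustrative example only: the function name `call_with_backoff`, its defaults, and the choice of full jitter are assumptions, and in practice a service mesh or client library would normally provide this behavior.

```python
import random
import time


def call_with_backoff(func, max_attempts=4, base_delay=0.1, max_delay=2.0,
                      sleep=time.sleep):
    """Call func(), retrying on any exception with exponential backoff.

    The wait before retry n is drawn uniformly from [0, min(max_delay,
    base_delay * 2**(n-1))] ("full jitter") so that many clients do not
    retry in lockstep. The last failure is re-raised unchanged.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, ceiling))


if __name__ == "__main__":
    calls = {"count": 0}

    def flaky_inventory_lookup():
        # Hypothetical downstream call that fails twice, then succeeds.
        calls["count"] += 1
        if calls["count"] < 3:
            raise ConnectionError("inventory service unavailable")
        return {"sku": "demo", "in_stock": True}

    print(call_with_backoff(flaky_inventory_lookup))
```

The `sleep` parameter is injectable so that the retry logic can be unit-tested without real delays.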
3. Security Architecture Review
New Attack Vectors & Mitigations
| Vulnerability | Risk Level | Mitigation Strategy | Compliance Impact (PCI DSS) |
|---|---|---|---|
| Service-to-Service Auth | High | mTLS (mutual TLS) via service mesh | Required for inter-service calls |
| Data Leakage | Medium | Secret management (Vault); encryption at rest per service | Encryption of PCI data in transit/storage |
| API Governance | High | API Gateway (Kong/Apigee) with rate limiting, JWT validation | Centralized payment tokenization |
| Network Exfiltration | Critical | VPC peering; service mesh network policies | Segmented PCI data networks |
| Insecure Deserialization | Medium | Input validation per service; OWASP patches | Validate payment payloads |
Key Recommendations
- Centralized Identity: Deploy Keycloak/OIDC for cross-service auth.
- PCI Compliance:
- Isolate payment processing in dedicated VLAN.
- Use tokenization (e.g., Stripe) to avoid storing raw card data.
- Zero Trust: Strict RBAC per service; audit logs for all PCI-related flows.
4. Dependency and Data Flow Analysis
Current Tight Coupling Map
graph TD
A[User Management] --> B[Order Processing]
C[Product Catalog] --> B
D[Analytics] --> C
B --> E[Payment Gateways]
B --> F[Warehouse Systems]
Proposed Service Boundaries
| Service | External Dependencies | Data Ownership | Consistency Strategy |
|---|---|---|---|
| User Service | None (self-contained) | Users, Roles, Preferences | Strong (local DB) |
| Catalog Service | External search (Elasticsearch) | Products, Inventory | Eventual (inventory updates via events) |
| Order Service | Payment Gateways, Warehouse | Orders, Payments | Strong (local DB) |
| Customer Service | None | Tickets, Chat History | Eventual (updates via events) |
| Analytics Service | Event Streams | Reporting DB (Data Warehouse) | Eventual |
Data Flow Optimization
- Event-Driven Architecture:
- Use Kafka for async communication (e.g., inventory updates → catalog service).
- Saga Pattern for distributed transactions (e.g., order placement → payment → fulfillment).
- Data Duplication:
- Replicate read-heavy data (e.g., product prices) into catalog service for fast access.
- Avoid Shared DB: Each service owns its DB schema; use change data capture (CDC) for sync.
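To make the saga pattern concrete, here is a minimal in-process sketch of an orchestrated saga for order placement. It is an assumption-laden illustration, not TechFlow's implementation: the step functions are hypothetical stand-ins for service calls, and a production saga would persist its state and exchange events through a broker such as Kafka.

```python
class SagaFailed(Exception):
    """Raised after all completed steps have been compensated."""


def run_saga(steps, context):
    """Run (action, compensation) pairs in order.

    If an action raises, the compensations of the steps that already
    completed are executed in reverse order, then SagaFailed is raised.
    """
    completed = []
    for action, compensation in steps:
        try:
            action(context)
        except Exception as exc:
            for undo in reversed(completed):
                undo(context)
            raise SagaFailed(str(exc)) from exc
        completed.append(compensation)
    return context


# Hypothetical steps; each would be a call to a separate service.
def reserve_inventory(ctx):
    ctx["log"].append("inventory reserved")


def release_inventory(ctx):
    ctx["log"].append("inventory released")


def charge_payment(ctx):
    if ctx.get("card_declined"):
        raise RuntimeError("payment declined")
    ctx["log"].append("payment charged")


def refund_payment(ctx):
    ctx["log"].append("payment refunded")


ORDER_SAGA = [
    (reserve_inventory, release_inventory),
    (charge_payment, refund_payment),
]

if __name__ == "__main__":
    ctx = {"log": [], "card_declined": True}
    try:
        run_saga(ORDER_SAGA, ctx)
    except SagaFailed as err:
        print("saga failed:", err)
    print(ctx["log"])
```

Each step commits locally and is undone by an explicit compensation, which is what replaces the single ACID transaction the monolith relies on today.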
5. Migration Strategy & Risk Assessment
Phased Migration Plan (Strangler Fig Pattern)
gantt
title Migration Timeline (18 Months)
dateFormat YYYY-MM-DD
section Phase 1: Foundation (Months 1-3)
Infrastructure Setup :a1, 2023-01-01, 3m
Platform Team Training :a2, after a1, 3m
section Phase 2: Low-Risk Services (Months 4-6)
User Service Extraction :2023-04-01, 3m
Catalog Service Extraction :2023-05-01, 3m
section Phase 3: Medium-Risk Services (Months 7-9)
Customer Service Extraction :2023-07-01, 3m
Analytics Service Extraction :2023-08-01, 3m
section Phase 4: High-Risk Services (Months 10-12)
Order Service Refactor :2023-10-01, 4m
Payment Subservice Extraction :2023-11-01, 3m
section Phase 5: Optimization & Scaling (Months 13-18)
Data Replication & CDC :2024-01-01, 3m
Performance Tuning :2024-03-01, 3m
Risk Mitigation
| Risk | Mitigation Strategy | Rollback Plan |
|---|---|---|
| Data Inconsistency | Dual-write with eventual consistency; Saga pattern | Revert to monolith DB snapshot |
| Service Interruption | Blue-green deployments; canary releases | Switch traffic back to monolith |
| Performance Degradation | Load testing pre-merge; performance budgets | Scale monolith temporarily |
| Team Skill Gaps | Dedicated training sprint; hire SRE experts | Pause migration; reinforce training |
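The strangler fig approach and the "switch traffic back to monolith" rollback can be illustrated with a small routing facade. This is a sketch under stated assumptions: the class `StranglerRouter`, the service names, and the hash-based percentage rollout are hypothetical, and a real deployment would usually implement this in an API gateway or service mesh rather than in application code.

```python
import hashlib

MONOLITH = "monolith"


class StranglerRouter:
    """Route requests to an extracted service or fall back to the monolith."""

    def __init__(self):
        self._routes = {}  # path prefix -> (service name, rollout percent)

    def migrate(self, prefix, service, percent):
        """Send `percent` of users hitting `prefix` to `service`."""
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self._routes[prefix] = (service, percent)

    def rollback(self, prefix):
        """Send all traffic for `prefix` back to the monolith."""
        self._routes.pop(prefix, None)

    def route(self, path, user_id):
        for prefix, (service, percent) in self._routes.items():
            if path.startswith(prefix):
                # Stable per-user bucket in [0, 100) so a user is not
                # bounced between old and new implementations.
                digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
                bucket = int(digest, 16) % 100
                return service if bucket < percent else MONOLITH
        return MONOLITH


if __name__ == "__main__":
    router = StranglerRouter()
    router.migrate("/users", "user-service", 100)
    print(router.route("/users/42", "alice"))   # extracted path
    print(router.route("/orders/7", "alice"))   # still served by the monolith
    router.rollback("/users")
    print(router.route("/users/42", "alice"))   # rolled back
```

Because rollback is a routing change rather than a redeploy, it can be exercised as part of each canary release.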
6. Operational Complexity Evaluation
Operational Overhead Comparison
| Area | Monolith | Microservices (15-20 services) | Mitigation Tools |
|---|---|---|---|
| Monitoring | App-level metrics (Prometheus) | Distributed tracing (Jaeger); service mesh metrics | Grafana dashboards, alerting rules |
| Logging | Centralized (ELK) | Log aggregation (Fluentd → Loki) | Standardized JSON logs |
| Debugging | Direct stack traces | Distributed tracing required | Service mesh, correlation IDs |
| Deployment | Single pipeline (45 min) | CI/CD per service (<5 min) | ArgoCD/GitOps, shared pipeline library |
| Team Structure | 25 devs (all in one team) | Platform team (SRE) + Product teams (5-6 devs/service) | Conway’s Law alignment |
| Infrastructure | 12 servers | Kubernetes cluster (50+ nodes) | Managed K8s (EKS/GKE) |
DevOps Tooling Requirements
- Infrastructure as Code: Terraform for cloud resources.
- Service Mesh: Istio for traffic management, security, observability.
- Observability Stack: Prometheus + Grafana + Jaeger + Loki.
- CI/CD: Jenkins/GitLab CI with service-specific pipelines.
7. Alternative Architecture Recommendations
Decision Framework: Microservices vs. Modular Monolith
| Criterion | Microservices | Modular Monolith |
|---|---|---|
| Scalability | ✅ Horizontal scaling per service | ❌ Limited vertical scaling |
| Team Autonomy | ✅ Teams own services (fast releases) | ❌ Merge conflicts, blocking dependencies |
| Deployment Speed | ✅ <10 min per service | ❌ 45 min full redeploy |
| Uptime SLA | ✅ Isolation improves resilience | ❌ Single point of failure |
| Operational Overhead | ❌ High (monitoring, networking) | ✅ Low (single deployment) |
| Learning Curve | ❌ Steep (SRE skills required) | ✅ Current team expertise (Python) |
| Budget Fit | ✅ $2M over 18 months covers K8s + tools | ✅ Lower initial cost |
Recommendation: Hybrid Approach
- Migrate to Modular Monolith First (Months 1-6):
- Break monolith into well-defined modules (packages) with clear interfaces.
- Implement domain-driven design (DDD) boundaries.
- Add internal API gateway for HTTP/gRPC calls between modules.
- Then Extract Services Incrementally (Months 7-18):
- Target independent domains (User, Catalog, Analytics).
- Retain shared utilities (database, messaging) as platform libraries.
- Use strangler fig to replace monolith modules with services.
Why Hybrid?
- Lower Risk: Teams retain familiarity while adopting microservices incrementally.
- Cost-Effective: Avoids full K8s overhaul upfront; uses existing infrastructure.
- Compliance: Easier PCI segmentation by isolating payment service early.
Final Recommendations
- Adopt Hybrid Strategy:
- Phase 1 (0-6 months): Modularize monolith; implement DDD; add internal API gateway.
- Phase 2 (7-18 months): Extract 4-5 services (User, Catalog, Customer, Analytics) using strangler fig.
- Technology Stack:
- Containerization: Docker + Kubernetes (managed).
- Service Mesh: Istio for security and observability.
- Event Streaming: Kafka for async communication.
- Observability: Prometheus, Grafana, Jaeger, Loki.
- Organizational Changes:
- Form Platform Team (4-5 SREs) to manage K8s, networking, monitoring.
- Split dev team into product squads (4-6 devs/service).
- Invest in microservices training (3-month bootcamp).
- Budget Allocation:
- $800K: Infrastructure (K8s clusters, monitoring tools).
- $500K: Training and hiring (SREs, DevOps).
- $400K: Tooling (service mesh, CI/CD pipelines).
- $300K: Buffer for unexpected issues.
- Success Metrics:
- Reduce deployment time to <10 minutes within 9 months.
- Achieve 99.9% uptime by Q4 of Year 1.
- Cut production incidents to <5/month by Month 12.
Verdict: Migrate to microservices but via a hybrid, phased approach to mitigate risk while leveraging current team strengths. Prioritize independent domains first and retain strong governance over shared platform components.
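The uptime figures in this example (99.2% today, 99.9% targeted) are easier to reason about as a downtime budget. A short worked calculation follows; the 30-day month and the function name `allowed_downtime_minutes` are assumptions made for illustration.

```python
def allowed_downtime_minutes(uptime_percent, days=30):
    """Minutes of downtime permitted over `days` at the given uptime level."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - uptime_percent / 100)


if __name__ == "__main__":
    for uptime in (99.2, 99.9):
        minutes = allowed_downtime_minutes(uptime)
        print(f"{uptime}% uptime allows about {minutes:.1f} min of downtime per 30 days")
```

Under these assumptions the budget shrinks from roughly 346 minutes to roughly 43 minutes per month, an eightfold reduction.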
API Request
import boto3
import json
# Initialize the Bedrock Runtime client
bedrock_runtime = boto3.client('bedrock-runtime', region_name='us-west-2')
# Define the system prompt
system_prompt = """You are an expert software architect who specializes in large-scale distributed systems and microservices architecture.
Evaluate architectural decisions considering performance implications, security vulnerabilities, maintainability challenges, scalability constraints, and long-term technical debt across complex enterprise systems with millions of users.
**Important formatting requirements:**
- Use proper markdown tables with | separators and header rows
- Format all tables as: | Column 1 | Column 2 | with |---|---| separator
- Use bullet points and numbered lists for clarity
- Include mermaid diagrams when helpful for visualization"""
# Define the user prompt
user_prompt = """## System Architecture ##
**Current Monolithic E-commerce Platform (500K LOC)**
```python
# Core monolith structure
ecommerce_platform/
├── user_management/ # 50K LOC - User auth, profiles, preferences
│ ├── authentication.py # OAuth2, JWT, session management
│ ├── user_profiles.py # Profile CRUD, preferences, settings
│ └── permissions.py # RBAC, admin controls
├── product_catalog/ # 80K LOC - Product data, search, recommendations
│ ├── product_models.py # Complex product hierarchy, variants
│ ├── search_engine.py # Elasticsearch integration, faceted search
│ ├── recommendations.py # ML-based product recommendations
│ └── inventory.py # Real-time inventory tracking
├── order_processing/ # 120K LOC - Cart, checkout, fulfillment
│ ├── shopping_cart.py # Session-based cart, persistence
│ ├── checkout.py # Multi-step checkout, validation
│ ├── payment_processing.py # Multiple payment gateways, PCI compliance
│ ├── order_fulfillment.py # Warehouse integration, shipping
│ └── returns.py # Return processing, refunds
├── customer_service/ # 60K LOC - Support, chat, tickets
│ ├── support_tickets.py # Ticket management, SLA tracking
│ ├── live_chat.py # Real-time chat, agent routing
│ └── knowledge_base.py # FAQ, documentation system
├── analytics/ # 90K LOC - Reporting, BI, ML pipelines
│ ├── event_tracking.py # User behavior tracking, analytics
│ ├── reporting.py # Business intelligence, dashboards
│ └── ml_pipelines.py # Recommendation training, A/B testing
└── shared/ # 100K LOC - Common utilities, integrations
├── database.py # ORM, connection pooling, migrations
├── caching.py # Redis, memcached, cache strategies
├── messaging.py # RabbitMQ, event publishing
└── external_apis.py # Payment, shipping, tax integrations
```
**Current Performance Metrics**
- Response time: 200ms (p50), 800ms (p95), 2.5s (p99)
- Throughput: 5,000 requests/minute peak
- Database: PostgreSQL, 2TB data, 500 concurrent connections
- Infrastructure: 12 application servers, 3 database replicas
- Deployment: 45-minute deployments, 2-week release cycles
- Incidents: 15 production issues/month, 99.2% uptime
**Scaling Challenges**
- Database bottlenecks during flash sales (10x traffic spikes)
- Memory usage: 8GB per app server, frequent GC pauses
- Code coupling: Changes in one module often break others
- Testing: 6-hour test suite, flaky integration tests
- Team coordination: 25 developers, merge conflicts, blocking dependencies
**Business Requirements**
- Scale to 50,000 requests/minute (10x current)
- Support international expansion (5 new countries)
- Reduce deployment time to <10 minutes
- Improve developer velocity (faster feature delivery)
- Achieve 99.9% uptime SLA
- Support real-time features (live inventory, chat)
**Technology Constraints**
- Must maintain PCI DSS compliance
- Legacy integrations with 15 external systems
- Existing PostgreSQL data (complex migration)
- Team expertise: Strong Python, limited microservices experience
- Budget: $2M for infrastructure changes over 18 months
## Architecture Challenge ##
Evaluate whether TechFlow should migrate from their monolithic architecture to microservices, and if so, design the optimal migration strategy.
Provide architectural analysis covering:
1. **System-Wide Impact Assessment**: Analyze how microservices migration would affect each component. Consider data consistency requirements, transaction boundaries, and inter-service communication patterns. Identify which modules are good candidates for extraction and which should remain coupled.
2. **Performance Implications Analysis**: Model the performance impact of network latency, service-to-service calls, and distributed transactions. Compare current monolith performance with projected microservices performance under various load scenarios. Consider caching strategies, data locality, and query optimization.
3. **Security Architecture Review**: Evaluate security implications of distributed architecture. Analyze authentication/authorization across services, network security, data encryption in transit/at rest, and compliance requirements (PCI DSS). Identify new attack vectors and mitigation strategies.
4. **Dependency and Data Flow Analysis**: Map current tight coupling and shared database dependencies. Design service boundaries that minimize cross-service transactions. Analyze data consistency requirements and propose eventual consistency vs strong consistency trade-offs.
5. **Migration Strategy & Risk Assessment**: Design a phased migration approach that minimizes business disruption. Identify the optimal order for service extraction, considering dependencies and risk levels. Propose strangler fig pattern implementation with specific timelines and rollback strategies.
6. **Operational Complexity Evaluation**: Assess the operational overhead of managing 15-20 microservices vs 1 monolith. Consider monitoring, logging, debugging, deployment pipelines, and team structure changes. Evaluate DevOps tooling requirements and learning curve.
7. **Alternative Architecture Recommendations**: If microservices aren't optimal, propose alternative approaches (modular monolith, service-oriented architecture, domain-driven design within monolith). Compare trade-offs and provide decision framework.
Consider the team's current expertise, budget constraints, and business timeline. Provide specific recommendations for technology choices, infrastructure requirements, and organizational changes needed for success.
Ask clarifying questions if needed."""
# Prepare the request
request_body = {
"system": [
{
"text": system_prompt
}
],
"messages": [
{
"role": "user",
"content": [
{
"text": user_prompt
}
]
}
],
"toolConfig": {
"tools": [
{
"systemTool": {
"name": "nova_grounding"
}
},
{
"systemTool": {
"name": "nova_code_interpreter"
}
}
]
},
"additionalModelRequestFields": {
"reasoningConfig": {
"type": "enabled",
"maxReasoningEffort": "low"
}
},
"inferenceConfig": {
"temperature": 0.3,
"topP": 0.9,
"maxTokens": 10000
}
}
# Make the API call
response = bedrock_runtime.converse(
modelId="amazon.nova-2-lite-v1:0",
**request_body
)
# Print the response
print(json.dumps(response, indent=2, default=str))
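Printing the whole response is useful for debugging, but most callers only need the generated text. The sketch below pulls the text blocks out of a Converse-style response dictionary. The helper name `extract_text` is an assumption, and the sample dictionary is made-up data that only mimics the response shape (`output.message.content` as a list of blocks, with `usage` token counts).

```python
def extract_text(response):
    """Join the text blocks of a Converse response, skipping other block types
    (for example reasoning or tool-use blocks)."""
    blocks = response.get("output", {}).get("message", {}).get("content", [])
    return "\n".join(block["text"] for block in blocks if "text" in block)


if __name__ == "__main__":
    # Made-up response used only to show the expected shape.
    sample_response = {
        "output": {
            "message": {
                "role": "assistant",
                "content": [
                    {"reasoningContent": {"reasoningText": {"text": "..."}}},
                    {"text": "1. System-Wide Impact Assessment"},
                    {"text": "2. Performance Implications Analysis"},
                ],
            }
        },
        "stopReason": "end_turn",
        "usage": {"inputTokens": 10, "outputTokens": 20, "totalTokens": 30},
    }
    print(extract_text(sample_response))
    print("tokens used:", sample_response["usage"]["totalTokens"])
```

Filtering on the `text` key rather than indexing `content[0]` avoids a `KeyError` when the first block is not a text block.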
aws bedrock-runtime converse \
--model-id amazon.nova-2-lite-v1:0 \
--system "[{\"text\":\"You are an expert software architect who specializes in large-scale distributed systems and microservices architecture.\nEvaluate architectural decisions considering performance implications, security vulnerabilities, maintainability challenges, scalability constraints, and long-term technical debt across complex enterprise systems with millions of users.\"}]" \
--messages "[{\"role\":\"user\",\"content\":[{\"text\":\"## System Architecture ##\n**Current Monolithic E-commerce Platform (500K LOC)**\n- Response time: 200ms (p50), 800ms (p95), 2.5s (p99)\n- Throughput: 5,000 requests/minute peak\n- Database: PostgreSQL, 2TB data, 500 concurrent connections\n- Team: 25 developers, merge conflicts, blocking dependencies\n\n**Business Requirements**\n- Scale to 50,000 requests/minute (10x current)\n- Support international expansion (5 new countries)\n- Reduce deployment time to <10 minutes\n- Achieve 99.9% uptime SLA\n\n## Architecture Challenge ##\nEvaluate whether TechFlow should migrate from monolithic architecture to microservices, and design optimal migration strategy.\n\nProvide analysis covering system-wide impact, performance implications, security architecture, dependency analysis, migration strategy, operational complexity, and alternative recommendations.\n\nAsk clarifying questions if needed.\"}]}]" \
--additional-model-request-fields "{\"reasoningConfig\":{\"type\":\"enabled\",\"maxReasoningEffort\":\"high\"}}" \
--inference-config "{\"temperature\":0.3,\"topP\":0.9,\"maxTokens\":10000}" \
--region us-west-2
{
"system": [
{
"text": "You are an expert software architect who specializes in large-scale distributed systems and microservices architecture.\nEvaluate architectural decisions considering performance implications, security vulnerabilities, maintainability challenges, scalability constraints, and long-term technical debt across complex enterprise systems with millions of users.\n\n**Important formatting requirements:**\n- Use proper markdown tables with | separators and header rows\n- Format all tables as: | Column 1 | Column 2 | with |---|---| separator\n- Use bullet points and numbered lists for clarity\n- Include mermaid diagrams when helpful for visualization"
}
],
"messages": [
{
"role": "user",
"content": [
{
"text": "## System Architecture ##\n\n**Current Monolithic E-commerce Platform (500K LOC)**\n\n```python\n# Core monolith structure\necommerce_platform/\n\u251c\u2500\u2500 user_management/ # 50K LOC - User auth, profiles, preferences\n\u2502 \u251c\u2500\u2500 authentication.py # OAuth2, JWT, session management\n\u2502 \u251c\u2500\u2500 user_profiles.py # Profile CRUD, preferences, settings\n\u2502 \u2514\u2500\u2500 permissions.py # RBAC, admin controls\n\u251c\u2500\u2500 product_catalog/ # 80K LOC - Product data, search, recommendations\n\u2502 \u251c\u2500\u2500 product_models.py # Complex product hierarchy, variants\n\u2502 \u251c\u2500\u2500 search_engine.py # Elasticsearch integration, faceted search\n\u2502 \u251c\u2500\u2500 recommendations.py # ML-based product recommendations\n\u2502 \u2514\u2500\u2500 inventory.py # Real-time inventory tracking\n\u251c\u2500\u2500 order_processing/ # 120K LOC - Cart, checkout, fulfillment\n\u2502 \u251c\u2500\u2500 shopping_cart.py # Session-based cart, persistence\n\u2502 \u251c\u2500\u2500 checkout.py # Multi-step checkout, validation\n\u2502 \u251c\u2500\u2500 payment_processing.py # Multiple payment gateways, PCI compliance\n\u2502 \u251c\u2500\u2500 order_fulfillment.py # Warehouse integration, shipping\n\u2502 \u2514\u2500\u2500 returns.py # Return processing, refunds\n\u251c\u2500\u2500 customer_service/ # 60K LOC - Support, chat, tickets\n\u2502 \u251c\u2500\u2500 support_tickets.py # Ticket management, SLA tracking\n\u2502 \u251c\u2500\u2500 live_chat.py # Real-time chat, agent routing\n\u2502 \u2514\u2500\u2500 knowledge_base.py # FAQ, documentation system\n\u251c\u2500\u2500 analytics/ # 90K LOC - Reporting, BI, ML pipelines\n\u2502 \u251c\u2500\u2500 event_tracking.py # User behavior tracking, analytics\n\u2502 \u251c\u2500\u2500 reporting.py # Business intelligence, dashboards\n\u2502 \u2514\u2500\u2500 ml_pipelines.py # Recommendation training, A/B testing\n\u2514\u2500\u2500 shared/ # 100K LOC - Common utilities, integrations\n \u251c\u2500\u2500 database.py # ORM, connection pooling, migrations\n \u251c\u2500\u2500 caching.py # Redis, memcached, cache strategies\n \u251c\u2500\u2500 messaging.py # RabbitMQ, event publishing\n \u2514\u2500\u2500 external_apis.py # Payment, shipping, tax integrations\n ```\n\n**Current Performance Metrics**\n- Response time: 200ms (p50), 800ms (p95), 2.5s (p99)\n- Throughput: 5,000 requests/minute peak\n- Database: PostgreSQL, 2TB data, 500 concurrent connections\n- Infrastructure: 12 application servers, 3 database replicas\n- Deployment: 45-minute deployments, 2-week release cycles\n- Incidents: 15 production issues/month, 99.2% uptime\n\n**Scaling Challenges**\n- Database bottlenecks during flash sales (10x traffic spikes)\n- Memory usage: 8GB per app server, frequent GC pauses\n- Code coupling: Changes in one module often break others\n- Testing: 6-hour test suite, flaky integration tests\n- Team coordination: 25 developers, merge conflicts, blocking dependencies\n\n**Business Requirements**\n- Scale to 50,000 requests/minute (10x current)\n- Support international expansion (5 new countries)\n- Reduce deployment time to <10 minutes\n- Improve developer velocity (faster feature delivery)\n- Achieve 99.9% uptime SLA\n- Support real-time features (live inventory, chat)\n\n**Technology Constraints**\n- Must maintain PCI DSS compliance\n- Legacy integrations with 15 external systems\n- Existing PostgreSQL data (complex migration)\n- Team expertise: Strong Python, limited microservices experience\n- Budget: $2M for infrastructure changes over 18 months\n\n## Architecture Challenge ##\n\nEvaluate whether TechFlow should migrate from their monolithic architecture to microservices, and if so, design the optimal migration strategy.\n\nProvide architectural analysis covering:\n\n1. **System-Wide Impact Assessment**: Analyze how microservices migration would affect each component. Consider data consistency requirements, transaction boundaries, and inter-service communication patterns. Identify which modules are good candidates for extraction and which should remain coupled.\n\n2. **Performance Implications Analysis**: Model the performance impact of network latency, service-to-service calls, and distributed transactions. Compare current monolith performance with projected microservices performance under various load scenarios. Consider caching strategies, data locality, and query optimization.\n\n3. **Security Architecture Review**: Evaluate security implications of distributed architecture. Analyze authentication/authorization across services, network security, data encryption in transit/at rest, and compliance requirements (PCI DSS). Identify new attack vectors and mitigation strategies.\n\n4. **Dependency and Data Flow Analysis**: Map current tight coupling and shared database dependencies. Design service boundaries that minimize cross-service transactions. Analyze data consistency requirements and propose eventual consistency vs strong consistency trade-offs.\n\n5. **Migration Strategy & Risk Assessment**: Design a phased migration approach that minimizes business disruption. Identify the optimal order for service extraction, considering dependencies and risk levels. Propose strangler fig pattern implementation with specific timelines and rollback strategies.\n\n6. **Operational Complexity Evaluation**: Assess the operational overhead of managing 15-20 microservices vs 1 monolith. Consider monitoring, logging, debugging, deployment pipelines, and team structure changes. Evaluate DevOps tooling requirements and learning curve.\n\n7. **Alternative Architecture Recommendations**: If microservices aren't optimal, propose alternative approaches (modular monolith, service-oriented architecture, domain-driven design within monolith). Compare trade-offs and provide decision framework.\n\nConsider the team's current expertise, budget constraints, and business timeline. Provide specific recommendations for technology choices, infrastructure requirements, and organizational changes needed for success.\n\nAsk clarifying questions if needed."
}
]
}
],
"additionalModelRequestFields": {
"reasoningConfig": {
"type": "enabled",
"maxReasoningEffort": "low"
}
}
}