Troubleshooting
Table of Contents
- Overview
- CloudFormation Deployment Failures
- Service Quotas
- Runtime Monitoring via CloudWatch
- Common Issues
  - Deployment Fails on Nested Stack
  - Meeting Stuck In Progress
  - No Transcription Appearing
  - Meeting Assistant Not Responding
  - VP Fails to Join Meeting
  - VP Stuck at MANUAL_ACTION_REQUIRED
  - VP Stuck in INITIALIZING
  - VP Stuck in WAITING_FOR_CAPACITY
  - Zoom Join Blocked
  - Voice Assistant Connection Issues
  - MCP Server Installation Fails
- Cost Assessment
- Related Documentation
Overview
This guide covers monitoring and troubleshooting for LMA deployments, including common deployment failures, runtime issues, and cost estimation.
CloudFormation Deployment Failures
When a CloudFormation deployment fails:
- Check the Events tab on the failed stack in the CloudFormation console.
- Always navigate into failed nested stacks to find the root cause. The parent stack error is often generic; the nested stack Events tab contains the specific failure reason.
- Common causes include:
  - Service quotas exceeded: Elastic IPs, NAT gateways, or other resource limits
  - Insufficient IAM permissions: The deploying user or role lacks required permissions
  - Bedrock model access not granted: You must explicitly enable model access in the Bedrock console before deployment
For additional guidance, see Troubleshooting CloudFormation in the AWS documentation.
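The advice to descend into failed nested stacks can be sketched in code. The following is a minimal illustration, not part of LMA: root_cause_events is a hypothetical helper, and fetch_events is an assumed callable that returns the StackEvents list for a stack name or ID (for example, a thin wrapper around the boto3 CloudFormation describe_stack_events call). It skips "cancelled" noise and follows failed AWS::CloudFormation::Stack resources until it reaches the events that carry the specific reason.

```python
# Status reasons that only echo a failure elsewhere and rarely name the root cause.
_NOISE = ("Resource creation cancelled", "Resource update cancelled")


def root_cause_events(stack_name, fetch_events, _seen=None):
    """Return failed events for a stack, descending into failed nested stacks.

    fetch_events(stack) must return a list of CloudFormation stack event dicts.
    """
    seen = _seen if _seen is not None else set()
    if stack_name in seen:
        return []
    seen.add(stack_name)

    causes = []
    for event in fetch_events(stack_name):
        status = event.get("ResourceStatus", "")
        reason = event.get("ResourceStatusReason", "")
        if not status.endswith("_FAILED") or reason.startswith(_NOISE):
            continue
        nested_id = event.get("PhysicalResourceId", "")
        is_nested = event.get("ResourceType") == "AWS::CloudFormation::Stack"
        # A stack also reports events about itself; only follow true child stacks.
        if is_nested and nested_id and nested_id != event.get("StackId"):
            nested = root_cause_events(nested_id, fetch_events, seen)
            if nested:
                causes.extend(nested)
                continue
        causes.append(event)
    return causes


# Possible wiring with boto3 (assumes credentials and a deployed stack):
#   import boto3
#   cfn = boto3.client("cloudformation")
#   fetch = lambda name: cfn.describe_stack_events(StackName=name)["StackEvents"]
#   for e in root_cause_events("my-lma-stack", fetch):
#       print(e["LogicalResourceId"], e.get("ResourceStatusReason"))
```

Note that describe_stack_events is paginated; the wiring shown in the comment reads only the first page, which holds the most recent events.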
Service Quotas
Be aware of the following service quota considerations:
- Amazon Transcribe: Default limit of 25 concurrent transcription streams. This directly limits the number of concurrent meetings. Request a quota increase through the AWS Service Quotas console if needed.
- Fargate tasks: Check your account limits for concurrent Fargate tasks.
- NAT gateways and Elastic IPs: Verify you have available capacity in the target region.
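A quick headroom check against the Transcribe stream limit can be sketched as follows. This is an illustrative example only: find_quota and stream_headroom are hypothetical helpers, the quota list is assumed to have the shape returned by the boto3 Service Quotas list_service_quotas call (dicts with QuotaName and Value), and the name fragment used to locate the quota is an assumption you should confirm in the Service Quotas console.

```python
def find_quota(quotas, name_fragment):
    """Return the first quota whose QuotaName contains name_fragment (case-insensitive)."""
    fragment = name_fragment.lower()
    for quota in quotas:
        if fragment in quota.get("QuotaName", "").lower():
            return quota
    return None


def stream_headroom(quota_value, active_streams):
    """Return how many more concurrent streams fit under the quota (never negative)."""
    return max(0, int(quota_value) - active_streams)


# Possible wiring with boto3 (assumes credentials; results are paginated):
#   import boto3
#   sq = boto3.client("service-quotas")
#   quotas = sq.list_service_quotas(ServiceCode="transcribe")["Quotas"]
#   quota = find_quota(quotas, "concurrent")  # name fragment is an assumption
#   if quota:
#       print(stream_headroom(quota["Value"], active_streams=20))
```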
Runtime Monitoring via CloudWatch
Use the following paths to access logs for each LMA component:
- WebSocket Fargate task: ECS console > Clusters > LMA-WEBSOCKETTRANSCRIBERSTACK-xxxx-TranscribingCluster > Tasks > Logs > View in CloudWatch
- Call Event Processor Lambda: Lambda console > AISTACK-CallEventProcessor > Monitor > View logs in CloudWatch
- AppSync API: AppSync console > CallAnalytics-LMA > Monitoring > View logs in CloudWatch
- Step Functions: For VP scheduling issues, check the Step Functions execution history in the Step Functions console
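When reviewing a Step Functions execution history, the failure events are usually the ones worth reading first. The sketch below assumes a list of history events in the shape returned by the boto3 Step Functions get_execution_history call, where each event has a type such as TaskFailed and a matching details key such as taskFailedEventDetails. The helper name failed_steps is hypothetical and not part of LMA.

```python
def failed_steps(events):
    """Extract error and cause from failure-type events in an execution history."""
    failures = []
    for event in events:
        event_type = event.get("type", "")
        if not event_type.endswith(("Failed", "TimedOut", "Aborted")):
            continue
        # Details live under a key derived from the type, e.g.
        # "TaskFailed" -> "taskFailedEventDetails".
        details_key = event_type[0].lower() + event_type[1:] + "EventDetails"
        details = event.get(details_key, {})
        failures.append(
            {
                "id": event.get("id"),
                "type": event_type,
                "error": details.get("error"),
                "cause": details.get("cause"),
            }
        )
    return failures


# Possible wiring with boto3 (assumes credentials and a known execution ARN):
#   import boto3
#   sfn = boto3.client("stepfunctions")
#   history = sfn.get_execution_history(executionArn=arn, reverseOrder=True)
#   for failure in failed_steps(history["events"]):
#       print(failure)
```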
Common Issues
Deployment Fails on Nested Stack
Navigate to the specific failed nested stack and check its Events tab for the root cause. The parent stack typically shows a generic “nested stack failed” error that is not actionable on its own.
Meeting Stuck In Progress
The Virtual Participant ECS task may have crashed. This issue was addressed in v0.3.0 with automatic cleanup on uncaught errors. If a meeting remains stuck, you can manually end it by updating the meeting record in DynamoDB.
No Transcription Appearing
- Check the WebSocket Fargate task logs for errors.
- Verify that audio is being streamed from the client.
- Check Amazon Transcribe service limits to ensure you have not exceeded the concurrent stream quota.
Meeting Assistant Not Responding
- Check the Call Event Processor Lambda logs for errors.
- Verify that Bedrock model access has been granted in the Bedrock console for the configured model.
- Review the Strands agent logs for agent-specific errors.
VP Fails to Join Meeting
- Check Step Functions execution logs for scheduling or state machine errors.
- Check the ECS task logs for the specific VP task.
- Verify that the meeting URL and credentials are correct.
- Check for platform-specific issues (Zoom, Teams, Chime, etc.).
- Check the VP detail page for an errorMessage. The VP writes a human-readable failure reason to the DynamoDB record (e.g. “Meeting join failed: …”, “Zoom login failed: invalid credentials”, “ECS RunTask soft-failure: agent not connected”) and surfaces it on the detail page’s troubleshooting card.
VP Stuck at MANUAL_ACTION_REQUIRED
This status means the VP hit a CAPTCHA, 2FA prompt, SSO redirect, an unknown consent dialog, or another Zoom verification step that needs human input. To resolve:
- Open the meeting detail page for the affected VP (the Flashbar alert at the top of the LMA UI links directly to it).
- Open the Live Virtual Participant View panel and toggle View Only off so the noVNC viewer accepts your input.
- Complete the challenge as described in the on-screen banner — type the OTP, solve the CAPTCHA, sign in via SSO, click the consent button, etc.
- The VP detects the resolved state and continues automatically. The default timeout is 3 minutes; if no human response arrives, the VP fails the meeting cleanly with errorMessage set to “Manual action timed out”.
To get notified about MANUAL_ACTION events when you’re not watching the LMA tab, grant the browser notification permission in the Web UI — the VP UI fires a desktop notification + audio chime when the status flips.
VP Stuck in INITIALIZING
Most often caused by an ECS soft-failure: RunTask returns HTTP 200 but with a non-empty failures array (typically “Container instance … agent connection lost” or “Insufficient memory available”). The state machine catches this explicitly in v0.3.4+ and writes FAILED to the VP record with the failure reason. If you’re on an older version, check the ECS console > Tasks > Stopped for the failure reason on the task that should have launched, and consider upgrading.
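The soft-failure pattern is easy to miss because the API call itself succeeds. The following sketch shows one way to detect it from a RunTask response dict (the shape returned by the boto3 ECS run_task call, with tasks and failures lists). It is an illustration of the check, not the code the LMA state machine uses, and run_task_soft_failures is a hypothetical name.

```python
def run_task_soft_failures(response):
    """Return human-readable reasons when RunTask 'succeeded' without launching a task."""
    reasons = []
    for failure in response.get("failures", []):
        reason = failure.get("reason", "unknown reason")
        detail = failure.get("detail")
        reasons.append(f"{reason}: {detail}" if detail else reason)
    # No failures listed, but no task either: still treat it as a failed launch.
    if not reasons and not response.get("tasks"):
        reasons.append("RunTask returned no tasks and no failures")
    return reasons


# Possible wiring with boto3 (cluster and task definition names are placeholders):
#   import boto3
#   ecs = boto3.client("ecs")
#   response = ecs.run_task(cluster="my-cluster", taskDefinition="my-task")
#   problems = run_task_soft_failures(response)
#   if problems:
#       raise RuntimeError("; ".join(problems))
```

An empty return value means at least one task was launched and no failures were reported.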
If the task did launch but the VP is stuck in INITIALIZING without progressing to BOOTING, ECS Container Insights → Memory Utilized for the VP task definition family will show whether the host is starved (memory ~95%+ usually means a too-small instance type — see the Virtual Participant docs for sizing guidance).
A different symptom, a VP that stays at INITIALIZING indefinitely with no error, was caused by a bug in the DynamoDB AttributeValue shape used by the MarkVPFailed cleanup state; it is fixed in v0.3.4 and later. If you see this on a v0.3.3-or-earlier deployment, the AWS Step Functions execution log will show the corresponding ECS task stopped with RESOURCE:MEMORY or similar; manually mark the VP FAILED in DynamoDB and upgrade.
VP Stuck in WAITING_FOR_CAPACITY
This means the cluster is full and the capacity-provider auto-scaler is launching a new EC2 host. Expect 60-90 seconds before the status transitions to BOOTING. If it stays at WAITING_FOR_CAPACITY longer:
- Check the ASG console for LMA-*-VP-ASG. If DesiredCapacity is already at VPMaxInstances, raise the parameter or wait for an active VP to finish.
- Check ASG Activity history for launch failures (e.g. “Insufficient capacity in availability zone”). Switch to a different instance type via VPInstanceType if the current one is unavailable in your region.
- Check the EC2 console for new instances in pending state; userdata bootstrap (yum security update, ECR docker pull) takes ~30 seconds, then the ECS agent registers.
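The first two checks above can be approximated in code. This sketch assumes an Auto Scaling group dict and an activities list in the shapes returned by the boto3 Auto Scaling describe_auto_scaling_groups and describe_scaling_activities calls. It also assumes that the VPMaxInstances parameter maps to the group's MaxSize, which the source does not state explicitly. The helper name diagnose_capacity is hypothetical.

```python
def diagnose_capacity(group, activities):
    """Return findings explaining why an ASG may not be adding a host."""
    findings = []
    # Assumption: VPMaxInstances is applied as the group's MaxSize.
    if group["DesiredCapacity"] >= group["MaxSize"]:
        findings.append(
            f"Group is at its maximum size ({group['MaxSize']}); "
            "raise VPMaxInstances or wait for an active VP to finish."
        )
    for activity in activities:
        if activity.get("StatusCode") == "Failed":
            message = activity.get("StatusMessage", "no status message")
            findings.append(f"Launch failure: {message}")
    return findings


# Possible wiring with boto3 (group name is a placeholder):
#   import boto3
#   asg = boto3.client("autoscaling")
#   group = asg.describe_auto_scaling_groups(
#       AutoScalingGroupNames=["my-vp-asg"])["AutoScalingGroups"][0]
#   acts = asg.describe_scaling_activities(
#       AutoScalingGroupName="my-vp-asg")["Activities"]
#   for finding in diagnose_capacity(group, acts):
#       print(finding)
```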
Zoom Join Blocked
Symptom: the VP fails to join with the Zoom dialog “We detected you may be a bot. Automated bots aren’t allowed to join this meeting or webinar…”.
Resolutions, in order of effectiveness:
- Add Zoom credentials to LMA (per-user, opt-in) — open the Create Virtual Participant modal and use the Zoom account card. A signed-in session joins far more reliably. See Zoom Sign-in & Join Reliability.
- Sign in to Zoom on your laptop with the same account at least once before relying on LMA. A brand-new account whose only activity is joining from AWS IP ranges is more likely to hit this block.
- Remove and re-save the credentials if you previously saved them and now hit blocks — this also wipes the user’s persisted Chromium profile in S3, clearing any stale cookies.
The residual factor is AWS egress IP reputation, which cannot be fully controlled from inside the container. If many users in your org see joins blocked even when signed in, route VP egress through a NAT or residential-proxy provider.
Voice Assistant Connection Issues
Check the VP task logs for WebSocket or Bedrock session errors. Voice assistant sessions auto-refresh, but may occasionally require a manual restart of the meeting. Persistent issues may indicate network connectivity problems or service disruptions.
MCP Server Installation Fails
- Check the CodeBuild logs for build and installation errors.
- Verify that the MCP server package is compatible with the LMA environment.
- Note that there is a maximum of 5 MCP servers per account.
Cost Assessment
LMA costs depend on usage patterns and configuration. The following are approximate estimates:
- Base infrastructure: ~$10/month (Fargate WebSocket server at 0.25 vCPU + VPC networking)
- VP EC2 instances: ~$30/month per warm t3.medium instance (default; capacity-provider scales 1→VPMaxInstances on demand). Bump VPInstanceType to t3.large (~$60/month) for 3-VPs-per-host density. Set VPMinInstances=0 to scale down to zero when idle and pay only when a VP is requested (~60-90s cold-start added to first VP).
- Per-meeting usage: ~$0.17 per 5-minute call (varies based on options selected)
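The approximate figures above can be combined into a rough monthly estimate. The sketch below uses the default t3.medium host price and assumes that per-meeting cost scales linearly with meeting length from the 5-minute figure, which is a simplification rather than something the source states. The function name and parameters are hypothetical.

```python
BASE_MONTHLY = 10.00      # base infrastructure, approximate
VP_HOST_MONTHLY = 30.00   # per warm t3.medium VP host, approximate
PER_5_MIN_CALL = 0.17     # per-meeting usage for a 5-minute call, approximate


def estimate_monthly_cost(warm_hosts, meetings, avg_minutes):
    """Rough monthly cost in USD; assumes usage cost scales linearly with minutes."""
    usage = meetings * (avg_minutes / 5) * PER_5_MIN_CALL
    return round(BASE_MONTHLY + warm_hosts * VP_HOST_MONTHLY + usage, 2)


# One warm host and twenty 60-minute meetings per month:
print(estimate_monthly_cost(warm_hosts=1, meetings=20, avg_minutes=60))  # 80.8
```

Treat the result as an order-of-magnitude figure; actual per-meeting cost varies with the options selected.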
Key AWS service pricing pages for detailed cost estimation:
- Amazon Transcribe Pricing
- Amazon Bedrock Pricing
- Amazon Translate Pricing
- AWS AppSync Pricing
- Amazon DynamoDB Pricing
- AWS Lambda Pricing
- Amazon S3 Pricing
- Amazon Cognito Pricing
Use AWS Cost Explorer or Bill Details in the AWS Billing console for actual spend tracking.