
AWS Nova Sonic 2 Voice Assistant Setup Guide


This guide walks you through setting up the AWS Nova Sonic 2 voice assistant with Strands agent integration in LMA. AWS Nova Sonic 2 provides real-time voice interaction during meetings with access to meeting history, document search, web search, and custom MCP integrations through the Strands agent tool.

Prerequisites:

  • LMA deployed (version 0.2.26 and above)
  • AWS account with Bedrock access
  • AWS Nova Sonic 2 model access enabled in your region

Key Features:

  • Native AWS Integration: No external API keys or third-party services required
  • Bidirectional Streaming: Real-time audio processing with low latency
  • Built-in Tool Use: Native support for tool calling and function execution
  • Cost Effective: Pay only for what you use with AWS pricing
  • Secure: All data stays within your AWS environment
  • Async Tool Processing: Tools execute in background without blocking conversation
  • Pre-Tool Acknowledgment: Announces tool execution to set user expectations
  • Customizable Config: Edit system prompts and voice ID via DynamoDB

Step 1: Enable AWS Nova Sonic 2 Model Access

First, verify availability:

  1. Go to the AWS Bedrock console
  2. Navigate to Model access in the left sidebar
  3. Verify AWS Nova Sonic 2 is available in your region
  4. If not available, request access or use a supported region

Then, enable access:

  1. Click Manage model access
  2. Find AWS Nova Sonic 2 in the list
  3. Check the box to enable access
  4. Click Save changes
  5. Wait for access to be granted (usually immediate)
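If you prefer to script this check, the sketch below confirms that a Nova Sonic model appears in your region's Bedrock model list. The `nova_sonic_available` helper is illustrative; in practice you would fetch the real model IDs with boto3, as shown in the comment.

```python
# Hypothetical helper to check whether Nova Sonic is available in a region.
# Fetch real IDs with boto3 (requires AWS credentials), e.g.:
#   ids = [m["modelId"] for m in
#          boto3.client("bedrock").list_foundation_models()["modelSummaries"]]

def nova_sonic_available(model_ids: list[str]) -> bool:
    """True if any model ID looks like a Nova Sonic model,
    e.g. amazon.nova-sonic-v1:0 (the modelId used later in this guide)."""
    return any("nova-sonic" in mid for mid in model_ids)

print(nova_sonic_available(["amazon.nova-sonic-v1:0",
                            "amazon.titan-text-lite-v1"]))  # True
```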

Step 2: Deploy LMA with AWS Nova Sonic 2 Configuration


When deploying or updating your LMA stack, set these parameters:

| Parameter | Value | Description |
| --- | --- | --- |
| VoiceAssistantProvider | amazon_nova_sonic | Enable AWS Nova Sonic 2 voice assistant |
| VoiceAssistantActivationMode | wake_phrase or always_active | Choose activation mode |

That’s it! AWS Nova Sonic 2 requires no API keys or external configuration.

Wake Phrase Mode (Recommended)

  • Agent activates when user says “Hey Alex”
  • Saves costs by only connecting when needed
  • Automatically disconnects after configurable timeout (default: 30 seconds)
  • Set: VoiceAssistantActivationMode=wake_phrase

Always Active Mode

  • Agent is always listening
  • Responds immediately without wake phrase
  • Higher costs (continuous connection)
  • Set: VoiceAssistantActivationMode=always_active

| Parameter | Default | Description |
| --- | --- | --- |
| VoiceAssistantActivationDuration | 30 | Seconds to stay active after wake phrase |
| VoiceAssistantWakePhrases | hey alex | Comma-separated wake phrases |
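The comma-separated wake phrase list can be sketched as simple substring matching against the live transcript. The function and parameter names below are illustrative, not LMA's actual implementation.

```python
# Hypothetical sketch of wake-phrase matching for the
# VoiceAssistantWakePhrases parameter described above.

def parse_wake_phrases(param: str) -> list[str]:
    """Split the comma-separated parameter into normalized phrases."""
    return [p.strip().lower() for p in param.split(",") if p.strip()]

def contains_wake_phrase(utterance: str, phrases: list[str]) -> bool:
    """Return True if any configured wake phrase appears in the utterance."""
    text = utterance.lower()
    return any(phrase in text for phrase in phrases)

phrases = parse_wake_phrases("hey alex,ok alex,hi alex")
print(contains_wake_phrase("Hey Alex, are you there?", phrases))  # True
```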
Step 3: Verify Deployment

First, check the CloudFormation outputs:

  1. Go to AWS CloudFormation console
  2. Find your LMA stack
  3. Go to the Outputs tab
  4. Verify these outputs exist:
    • VoiceAssistantProvider: Should show amazon_nova_sonic
    • StrandsLambdaArn: Should show the Lambda ARN
    • DefaultNovaSonicConfig: Link to view default configuration
    • CustomNovaSonicConfig: Link to edit custom configuration
Then test the voice assistant:

  1. Start or join a meeting with the virtual participant
  2. Say: “Hey Alex, are you there?”
  3. The agent should respond with voice
  4. Try: “Hey Alex, what were the action items from our last meeting?”
  5. The agent should:
    • Acknowledge: “Let me search for that information. This may take a moment.”
    • Call the strands_agent tool in the background
    • Return results from the Strands Lambda
    • Speak the response naturally

Step 4: Customize Voice Assistant (Optional)


After stack deployment, you’ll find two console links in the CloudFormation outputs:

  1. DefaultNovaSonicConfig - View the default configuration (read-only, do not edit)
  2. CustomNovaSonicConfig - Edit your custom configuration (preserved during stack updates)

Click the CustomNovaSonicConfig link to open the DynamoDB console.

You can customize these settings in the CustomNovaSonicConfig item:

| Attribute | Type | Description | Example |
| --- | --- | --- | --- |
| systemPrompt | String | Custom system prompt for the voice assistant | “You are Jamie, a friendly meeting assistant” |
| promptMode | String | How to apply custom prompt: base, inject, or replace | inject |
| voiceId | String | Nova Sonic voice ID (see available voices below) | tiffany |
| modelId | String | Bedrock model ID | amazon.nova-sonic-v1:0 |
| endpointingSensitivity | String | Turn-taking sensitivity: HIGH, MEDIUM, or LOW | MEDIUM |
| groupMeetingMode | Boolean | Enable passive mode for group meetings (default: false) | true |

Base Mode (Simple Replacement)

  • Uses your custom prompt as-is
  • Falls back to default if no custom prompt provided
  • Use when: You want complete control over the prompt

Inject Mode (Append to Default)

  • Appends your custom instructions to the default prompt
  • Preserves default behavior + adds your customizations
  • Use when: You want to add extra instructions without losing defaults

Replace Mode (Complete Override)

  • Completely replaces the default prompt with yours
  • Ignores default prompt entirely
  • Use when: You need fundamentally different assistant behavior

The endpointingSensitivity parameter controls how quickly Nova Sonic detects the end of a user’s turn and begins responding. This affects both response latency and the likelihood of interrupting users who are still speaking.

Available Values:

| Sensitivity | Pause Duration | Best For |
| --- | --- | --- |
| HIGH | 1.5 seconds | Quick Q&A, command-and-control, time-sensitive interactions |
| MEDIUM (default) | 1.75 seconds | General conversations, customer service, multi-turn discussions |
| LOW | 2.0 seconds | Thoughtful conversations, elderly or speech-impaired users, complex problem-solving |

How It Works:

  • Nova Sonic waits for the specified pause duration after detecting the end of speech before responding
  • Higher sensitivity = faster responses but more risk of interrupting users who pause while thinking
  • Lower sensitivity = more patient waiting but slightly slower responses

When to Adjust:

  • Use HIGH for fast-paced interactions where users expect immediate responses
  • Use MEDIUM (default) for balanced, natural conversations
  • Use LOW when users need more time to formulate thoughts or have speech patterns with longer pauses
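The pause durations above can be sketched as a simple turn-end check: a user's turn is considered finished once measured silence exceeds the configured threshold. The mapping values come from this guide; the function itself is illustrative.

```python
# Sketch of turn-end detection using the pause durations from the
# endpointingSensitivity table above. Not Nova Sonic's actual algorithm.

PAUSE_SECONDS = {"HIGH": 1.5, "MEDIUM": 1.75, "LOW": 2.0}

def turn_has_ended(silence_seconds: float, sensitivity: str = "MEDIUM") -> bool:
    """True once silence meets or exceeds the configured pause duration."""
    return silence_seconds >= PAUSE_SECONDS[sensitivity]

# HIGH responds after a 1.6 s pause; LOW keeps waiting.
print(turn_has_ended(1.6, "HIGH"), turn_has_ended(1.6, "LOW"))  # True False
```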

4.5 Group Meeting Mode (Passive Listening)


The groupMeetingMode parameter enables Nova to listen passively in group meetings and only respond when directly addressed. This is ideal for multi-participant meetings where you want the assistant available but not interrupting conversations between other participants.

How It Works:

  • Nova starts muted (audio output disabled)
  • Listens to all conversation silently
  • Only responds when someone mentions “Alex” in their speech
  • Automatically calls unmute tool before speaking
  • Auto-mutes after finishing response

Configuration:

    {
      "groupMeetingMode": true,
      "endpointingSensitivity": "LOW"
    }

Benefits:

  • Non-intrusive - Won’t interrupt conversations between participants
  • Always available - Listening and ready when needed
  • Natural interaction - Just say “Alex” to get attention
  • Barge-in support - Can interrupt Nova mid-sentence if needed
  • No feedback loops - Separate audio routing prevents echo

When to Use:

  • Multi-participant meetings (3+ people)
  • Team discussions where assistant is optional
  • Meetings where you want assistant available but not active
  • Scenarios where interruptions would be disruptive

Comparison with Wake Phrase Mode:

| Feature | Group Meeting Mode | Wake Phrase Mode |
| --- | --- | --- |
| Session | Always connected | Connects on wake phrase |
| Listening | Continuous | Only when activated |
| Response | When “Alex” mentioned | After wake phrase + timeout |
| Cost | Higher (always connected) | Lower (connects on demand) |
| Use Case | Group meetings | 1-on-1 or cost-sensitive |
| Barge-in | Supported | Not applicable |
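The mute/unmute behavior described above can be sketched as a tiny state machine: start muted, unmute only when the assistant's name is mentioned, and re-mute after responding. Class and method names are illustrative, not LMA internals.

```python
# Hypothetical state machine for group meeting (passive) mode.

class PassiveAssistant:
    def __init__(self, name: str = "alex"):
        self.name = name
        self.muted = True  # Nova starts with audio output disabled

    def on_transcript(self, utterance: str) -> bool:
        """Return True if the assistant should unmute and respond."""
        if self.name in utterance.lower():
            self.muted = False  # i.e. call the unmute tool before speaking
            return True
        return False  # keep listening silently

    def on_response_finished(self) -> None:
        self.muted = True  # auto-mute after finishing the response
```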

Available Voices

Amazon Nova Sonic supports 16 different voices:

Polyglot Voices (Multiple Languages):

  • tiffany (default) - Female, warm and professional
  • matthew - Male, clear and authoritative

English Voices:

  • amy - British English, professional
  • olivia - American English, friendly
  • kiara - Indian English, clear

Other Languages:

  • arjun - Hindi
  • ambre - French (France)
  • florian - French (Canada)
  • beatrice - Italian
  • lorenzo - Italian
  • tina - German
  • lennart - German
  • lupe - Spanish (Spain)
  • carlos - Spanish (Latin America)
  • carolina - Portuguese (Brazil)
  • leo - Portuguese (Portugal)

Example 1: Professional Assistant (Fast Response)

    {
      "NovaSonicConfigId": "CustomNovaSonicConfig",
      "systemPrompt": "You are a professional executive assistant. Provide concise, actionable responses. Always confirm understanding before taking action.",
      "promptMode": "replace",
      "voiceId": "matthew",
      "endpointingSensitivity": "HIGH"
    }

Example 2: Friendly Team Assistant

    {
      "NovaSonicConfigId": "CustomNovaSonicConfig",
      "systemPrompt": "Always end responses with 'Anything else I can help with?' to encourage engagement.",
      "promptMode": "inject",
      "voiceId": "olivia",
      "endpointingSensitivity": "MEDIUM"
    }

Example 3: Technical Support (Patient Listening)

    {
      "NovaSonicConfigId": "CustomNovaSonicConfig",
      "systemPrompt": "You are a technical support specialist for AWS services. Provide detailed, accurate information. Use technical terminology when appropriate.",
      "promptMode": "replace",
      "voiceId": "tiffany",
      "endpointingSensitivity": "LOW"
    }

Example 4: Group Meeting Assistant (Passive Mode)

    {
      "NovaSonicConfigId": "CustomNovaSonicConfig",
      "systemPrompt": "You are a helpful meeting assistant. Provide concise, relevant information when asked.",
      "promptMode": "base",
      "voiceId": "tiffany",
      "endpointingSensitivity": "LOW",
      "groupMeetingMode": true
    }

Example 5: Accessibility-Focused (Maximum Patience)

    {
      "NovaSonicConfigId": "CustomNovaSonicConfig",
      "systemPrompt": "You are a patient, supportive assistant. Speak slowly and clearly. Wait for users to finish their thoughts completely.",
      "promptMode": "replace",
      "voiceId": "amy",
      "endpointingSensitivity": "LOW"
    }
  1. Edit the CustomNovaSonicConfig item in DynamoDB
  2. Save your changes
  3. No redeployment needed - changes take effect immediately
  4. Start a new virtual participant to test
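Conceptually, the custom item is layered over the read-only defaults, which is why unset attributes keep their default values and edits take effect without redeployment. The merge helper below is an assumption for illustration; attribute names match the configuration table above.

```python
# Illustrative merge of CustomNovaSonicConfig over the defaults.
# DEFAULT_CONFIG mirrors defaults named in this guide; the helper is hypothetical.

DEFAULT_CONFIG = {
    "voiceId": "tiffany",
    "modelId": "amazon.nova-sonic-v1:0",
    "endpointingSensitivity": "MEDIUM",
    "groupMeetingMode": False,
}

def effective_config(custom_item: dict) -> dict:
    """Custom attributes override defaults; unset attributes fall back."""
    merged = dict(DEFAULT_CONFIG)
    merged.update({k: v for k, v in custom_item.items() if v is not None})
    return merged

cfg = effective_config({"voiceId": "matthew", "endpointingSensitivity": "HIGH"})
print(cfg["voiceId"], cfg["modelId"])  # matthew amazon.nova-sonic-v1:0
```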

View logs in CloudWatch:

  1. Go to AWS CloudWatch console
  2. Navigate to Log Groups
  3. Find the log group for your virtual participant
  4. Look for these log messages:
    ✓ AWS Nova Sonic 2 voice assistant enabled
    ✓ Strands agent tool configured
    🎤 Wake phrase detected: "hey alex"
    🔧 Tool call received: strands_agent
    ⏳ Pre-tool acknowledgment: "Let me search for that information..."
    🔄 Async tool execution started
    ✓ Tool result ready, streaming response
To monitor usage metrics:

  1. Go to AWS CloudWatch console
  2. Navigate to Metrics → Bedrock
  3. View Nova Sonic 2 usage metrics:
    • Invocations
    • Audio duration
    • Tool calls
    • Errors

Issue: Agent Not Responding to Wake Phrase


Cause: Wake phrase detection not working

Solution:

  • Speak clearly: “Hey Alex” (pause) “your question”
  • Check microphone is working in the meeting
  • Try: “Hey Alex, are you there?” as a simple test
  • Verify VoiceAssistantActivationMode=wake_phrase is set
  • Check CloudWatch logs for wake phrase detection

Cause: AWS Nova Sonic 2 model access not enabled

Solution:

  • Go to Bedrock console → Model access
  • Enable AWS Nova Sonic 2 model
  • Wait for access to be granted
  • Verify your region supports Nova Sonic 2

Cause: Strands agent taking too long to respond

Solution:

  • Check Strands Lambda execution time in CloudWatch
  • Verify Lambda has sufficient memory and timeout
  • Check if knowledge base queries are slow
  • Review async tool processing logs

Cause: Audio configuration issue

Solution:

  • Check virtual participant audio setup
  • Verify PulseAudio is running
  • Review paplay logs in CloudWatch
  • Check microphone unmute status

Cause: Session management issue (fixed in v0.2.27)

Solution:

  • Upgrade to LMA v0.2.27 or later
  • Session now stays open during tool execution
  • Async processing prevents blocking
End-to-end flow:

    User Speech
      ↓
    Wake Phrase Detection ("Hey Alex")
      ↓
    AWS Nova Sonic 2 Activation
      ↓
    User Query → Nova Sonic 2
      ↓
    Nova Decides to Use strands_agent Tool
      ↓
    Pre-Tool Acknowledgment: "Let me search for that information..."
      ↓
    Async Tool Execution (Non-Blocking)
      ↓
    Invoke Strands Lambda
      ↓
    Lambda Response → Nova Sonic 2
      ↓
    Nova Processes Result
      ↓
    Voice Response to User
  • AWS Nova Sonic 2: Handles voice conversation, tool decisions, and bidirectional streaming
  • LMA Backend: Manages tool execution and Lambda invocation
  • Strands Lambda: Provides access to meeting history, documents, web search, MCP integrations
  • Virtual Participant: Captures audio and plays responses in meeting
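The non-blocking tool flow above can be sketched with asyncio: the acknowledgment goes out immediately, the (stand-in) Strands tool runs in the background, and the result is spoken once ready. All names here are illustrative, not LMA's implementation.

```python
# Minimal asyncio sketch of pre-tool acknowledgment + async tool execution.
import asyncio

async def strands_agent(query: str) -> str:
    await asyncio.sleep(0.1)  # stand-in for the Strands Lambda invocation
    return f"Results for: {query}"

async def handle_tool_call(query: str, spoken: list[str]) -> None:
    # The acknowledgment is spoken before the tool finishes.
    spoken.append("Let me search for that information. This may take a moment.")
    task = asyncio.create_task(strands_agent(query))  # runs in the background
    spoken.append(await task)  # stream the response once the tool returns

spoken: list[str] = []
asyncio.run(handle_tool_call("action items from our last meeting", spoken))
print(spoken[0])
```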

Once configured, the voice assistant can:

Query Meeting History:

  • Example: “What were the action items from our last meeting?”
  • Example: “What did we discuss in yesterday’s meeting?”
  • Example: “Find meetings with John from last week”

Search Documents:

  • Example: “Search our documents for project requirements”
  • Example: “What does our policy say about remote work?”
  • Example: “Find the latest product roadmap”

Search the Web:

  • Example: “What’s the latest news about AI voice agents?”
  • Example: “Search for AWS Nova Sonic 2 documentation”
  • Example: “What are the best practices for MCP integrations?”

Query Salesforce (via MCP):

  • Example: “Look up the Amazon account in Salesforce”
  • Example: “What’s the contract status for Acme Corp?”
  • Example: “Show me recent opportunities”

Check Slack (via MCP):

  • Example: “Check Slack for recent messages in the team channel”
  • Example: “What did Bob say about the demo?”
  • Example: “Any urgent messages in Slack?”

Use Other MCP Integrations:

  • Example: “Check Jira for open tickets”
  • Example: “Look up GitHub issues”
  • Example: “Query our internal database”
Security:

  • All processing happens within AWS
  • No third-party API keys to manage
  • No data leaves your AWS environment
  • IAM-based access control
  • AWS handles authentication automatically
  • Bedrock sessions are encrypted
  • Connections close automatically after inactivity
  • No persistent storage of conversation data
  • Lambda execution role has minimal required permissions
  • Strands Lambda access controlled by IAM
  • Virtual participant isolated per meeting
  • No cross-meeting data access

Cost Optimization:

Wake phrase mode:

  • Recommended for most use cases
  • Only connects when activated
  • Automatically disconnects after timeout
  • Saves ~90% of connection costs

Always active mode:

  • Use only when immediate response is critical
  • Continuous Bedrock connection
  • Higher costs
  • Consider for high-priority meetings only

Async tool processing:

  • Tools execute in background (v0.2.27+)
  • Nova stays responsive during tool execution
  • Better user experience
  • No additional cost

Monitoring costs:

  • Pay only for audio duration
  • Monitor usage in CloudWatch metrics
  • Set budget alerts in AWS Budgets
  • Consider wake phrase mode for cost savings

Set multiple wake phrases:

    VoiceAssistantWakePhrases="hey alex,ok alex,hi alex,hello alex"

Set longer activation for complex queries:

    VoiceAssistantActivationDuration=60  # 60 seconds
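The activation window behaves like a simple timer: each wake phrase extends the active period by the configured duration, after which the session disconnects. The class below is an illustrative model, not LMA code.

```python
# Hypothetical model of the VoiceAssistantActivationDuration window.

class ActivationWindow:
    def __init__(self, duration_seconds: int = 30):
        self.duration = duration_seconds
        self.active_until = 0.0

    def on_wake_phrase(self, now: float) -> None:
        """Each wake phrase (re)starts the activation window."""
        self.active_until = now + self.duration

    def is_active(self, now: float) -> bool:
        """The session stays connected only while the window is open."""
        return now < self.active_until

w = ActivationWindow(duration_seconds=60)
w.on_wake_phrase(now=0.0)
print(w.is_active(30.0), w.is_active(61.0))  # True False
```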
Recent improvements (v0.2.27):

Session stability:

  • Session stays open during tool use and audio playback
  • No more premature disconnections
  • Smoother conversation flow

Async tool processing:

  • Tools execute in background without blocking
  • Nova remains responsive during tool execution
  • Better user experience for complex queries

Pre-tool acknowledgment:

  • Nova announces: “Let me search for that information. This may take a moment.”
  • Sets proper user expectations
  • Confirmation-based prompting strategy

Comparison: AWS Nova Sonic 2 vs ElevenLabs

| Feature | AWS Nova Sonic 2 | ElevenLabs |
| --- | --- | --- |
| Setup Complexity | Simple (2 parameters) | Moderate (API key, agent config) |
| External Dependencies | None | ElevenLabs account required |
| API Keys | Not required | Required |
| Data Location | Stays in AWS | Sent to ElevenLabs |
| Tool Use | Native Bedrock tool use | Client-side tool calling |
| Async Processing | Yes (v0.2.27+) | Depends on configuration |
| Cost Model | AWS Bedrock pricing | ElevenLabs pricing |
| Voice Quality | High quality | Very high quality |
| Customization | System prompts & 16 voice IDs via DynamoDB | Extensive via API |

For issues specific to:

  • LMA Voice Assistant: Check LMA documentation and CloudWatch logs
  • AWS Nova Sonic 2: Check AWS Bedrock documentation
  • Strands Agent: Check Strands Lambda logs in CloudWatch
  • Tool Configuration: Check LMA GitHub repository

What You Get:

  • Real-time voice assistant in meetings
  • Access to meeting history and documents
  • Web search and MCP integrations
  • Natural conversation with AI
  • Automatic tool invocation
  • Async tool processing (v0.2.27+)

What You Need:

  • AWS account with Bedrock access
  • AWS Nova Sonic 2 model enabled
  • 2 CloudFormation parameters (amazon_nova_sonic + activation mode)
  • No external API keys

Key Features:

  • Wake phrase activation (“Hey Alex”)
  • Native AWS Bedrock tool use
  • Automatic Lambda invocation
  • Secure (all in AWS)
  • Cost-optimized with wake phrase mode
  • Pre-tool acknowledgment
  • Async tool execution
  • Customizable prompts and voices via DynamoDB (no code changes needed)

Customization:

  • 3 prompt modes (base, inject, replace)
  • 16 voice IDs to choose from
  • 3 turn-taking sensitivity levels (HIGH, MEDIUM, LOW)
  • Group meeting mode for passive listening
  • Barge-in support (interrupt Nova mid-sentence)
  • Edit via DynamoDB console
  • Changes apply immediately (no redeployment)

That’s it! Your meetings now have an AI voice assistant powered by AWS Nova Sonic 2 with access to your organization’s knowledge and systems, fully customizable to match your needs, from active 1-on-1 conversations to passive group meeting support.