AWS Nova Sonic 2 Voice Assistant Setup Guide

This guide walks you through setting up the AWS Nova Sonic 2 voice assistant with Strands agent integration in LMA. AWS Nova Sonic 2 provides real-time voice interaction during meetings with access to meeting history, document search, web search, and custom MCP integrations through the Strands agent tool.

Prerequisites

  • LMA deployed (version 0.2.26 or later)
  • AWS account with Bedrock access
  • AWS Nova Sonic 2 model access enabled in your region

Features

  • Native AWS Integration: No external API keys or third-party services required
  • Bidirectional Streaming: Real-time audio processing with low latency
  • Built-in Tool Use: Native support for tool calling and function execution
  • Cost Effective: Pay only for what you use with AWS pricing
  • Secure: All data stays within your AWS environment
  • Async Tool Processing: Tools execute in background without blocking conversation
  • Pre-Tool Acknowledgment: Announces tool execution to set user expectations
  • Customizable Config: Edit system prompts and voice ID via DynamoDB

Step 1: Enable AWS Nova Sonic 2 Model Access

Check model availability:

  1. Go to the AWS Bedrock console
  2. Navigate to Model access in the left sidebar
  3. Verify AWS Nova Sonic 2 is available in your region
  4. If not available, request access or use a supported region

Enable model access:

  1. Click Manage model access
  2. Find AWS Nova Sonic 2 in the list
  3. Check the box to enable access
  4. Click Save changes
  5. Wait for access to be granted (usually immediate)

Step 2: Deploy LMA with AWS Nova Sonic 2 Configuration


When deploying or updating your LMA stack, set these parameters:

| Parameter | Value | Description |
| --- | --- | --- |
| VoiceAssistantProvider | amazon_nova_sonic | Enable AWS Nova Sonic 2 voice assistant |
| VoiceAssistantActivationMode | wake_phrase or always_active | Choose activation mode |

That’s it! AWS Nova Sonic 2 requires no API keys or external configuration.
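
The two parameters above can be assembled programmatically before a stack update. The sketch below builds them in the ParameterKey/ParameterValue list shape that the CloudFormation UpdateStack API accepts. The function name build_voice_assistant_parameters is hypothetical, and the call that actually submits the list (for example through an AWS SDK) is left out.

```python
from typing import Dict, List

# Activation modes named in this guide.
ALLOWED_ACTIVATION_MODES = {"wake_phrase", "always_active"}


def build_voice_assistant_parameters(
    activation_mode: str = "wake_phrase",
) -> List[Dict[str, str]]:
    """Return the two stack parameters as ParameterKey/ParameterValue pairs."""
    if activation_mode not in ALLOWED_ACTIVATION_MODES:
        raise ValueError(f"unsupported activation mode: {activation_mode!r}")
    return [
        {"ParameterKey": "VoiceAssistantProvider",
         "ParameterValue": "amazon_nova_sonic"},
        {"ParameterKey": "VoiceAssistantActivationMode",
         "ParameterValue": activation_mode},
    ]


if __name__ == "__main__":
    for parameter in build_voice_assistant_parameters("always_active"):
        print(parameter)
```

When updating an existing stack, the remaining LMA parameters would still need to be supplied or marked to reuse their previous values.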

Wake Phrase Mode (Recommended)

  • Agent activates when user says “Hey Alex”
  • Saves costs by only connecting when needed
  • Automatically disconnects after configurable timeout (default: 30 seconds)
  • Set: VoiceAssistantActivationMode=wake_phrase

Always Active Mode

  • Agent is always listening
  • Responds immediately without wake phrase
  • Higher costs (continuous connection)
  • Set: VoiceAssistantActivationMode=always_active

| Parameter | Default | Description |
| --- | --- | --- |
| VoiceAssistantActivationDuration | 30 | Seconds to stay active after wake phrase |
| VoiceAssistantWakePhrases | hey alex | Comma-separated wake phrases |
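
To make the two wake phrase settings concrete, here is a minimal sketch of how a comma-separated phrase list and an activation window could be interpreted. It is an illustration only: the guide does not show LMA's actual detection code, and the helper names (parse_wake_phrases, find_wake_phrase, is_active) are hypothetical.

```python
import re
from typing import List, Optional


def parse_wake_phrases(raw: str) -> List[str]:
    """Split a comma-separated VoiceAssistantWakePhrases value into phrases."""
    return [part.strip().lower() for part in raw.split(",") if part.strip()]


def _normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^\w\s]", " ", text.lower()).split())


def find_wake_phrase(transcript: str, phrases: List[str]) -> Optional[str]:
    """Return the first configured phrase found as whole words, else None."""
    haystack = f" {_normalize(transcript)} "
    for phrase in phrases:
        if f" {_normalize(phrase)} " in haystack:
            return phrase
    return None


def is_active(now: float, last_wake_time: Optional[float],
              duration_seconds: float = 30.0) -> bool:
    """True while the activation window after the last wake phrase is open."""
    return last_wake_time is not None and now - last_wake_time <= duration_seconds


if __name__ == "__main__":
    phrases = parse_wake_phrases("hey alex, ok alex")
    print(find_wake_phrase("Hey, Alex! Are you there?", phrases))
```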

Step 3: Verify Deployment

Check the stack outputs:

  1. Go to AWS CloudFormation console
  2. Find your LMA stack
  3. Go to the Outputs tab
  4. Verify these outputs exist:
    • VoiceAssistantProvider: Should show amazon_nova_sonic
    • StrandsLambdaArn: Should show the Lambda ARN
    • DefaultNovaSonicConfig: Link to view default configuration
    • CustomNovaSonicConfig: Link to edit custom configuration
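
The same output check can be scripted. The sketch below assumes a stack description shaped like one entry of the Stacks list returned by CloudFormation DescribeStacks (an Outputs list of OutputKey/OutputValue pairs). Fetching that description is omitted, and the helper names are hypothetical.

```python
from typing import Any, Dict, List

# Output keys this guide says should exist after deployment.
REQUIRED_OUTPUTS = (
    "VoiceAssistantProvider",
    "StrandsLambdaArn",
    "DefaultNovaSonicConfig",
    "CustomNovaSonicConfig",
)


def missing_outputs(stack: Dict[str, Any]) -> List[str]:
    """Return the required output keys that are absent from the stack."""
    present = {output["OutputKey"] for output in stack.get("Outputs", [])}
    return [key for key in REQUIRED_OUTPUTS if key not in present]


def provider_is_nova_sonic(stack: Dict[str, Any]) -> bool:
    """True if the VoiceAssistantProvider output is amazon_nova_sonic."""
    for output in stack.get("Outputs", []):
        if output["OutputKey"] == "VoiceAssistantProvider":
            return output.get("OutputValue") == "amazon_nova_sonic"
    return False


if __name__ == "__main__":
    sample = {"Outputs": [
        {"OutputKey": "VoiceAssistantProvider", "OutputValue": "amazon_nova_sonic"},
    ]}
    print(missing_outputs(sample), provider_is_nova_sonic(sample))
```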

Test the voice assistant:

  1. Start or join a meeting with the virtual participant
  2. Say: “Hey Alex, are you there?”
  3. The agent should respond with voice
  4. Try: “Hey Alex, what were the action items from our last meeting?”
  5. The agent should:
    • Acknowledge: “Let me search for that information. This may take a moment.”
    • Call the strands_agent tool in the background
    • Return results from the Strands Lambda
    • Speak the response naturally

Step 4: Customize Voice Assistant (Optional)


After stack deployment, you’ll find two console links in the CloudFormation outputs:

  1. DefaultNovaSonicConfig - View the default configuration (read-only, do not edit)
  2. CustomNovaSonicConfig - Edit your custom configuration (preserved during stack updates)

Click the CustomNovaSonicConfig link to open the DynamoDB console.

You can customize these settings in the CustomNovaSonicConfig item:

| Attribute | Type | Description | Example |
| --- | --- | --- | --- |
| systemPrompt | String | Custom system prompt for the voice assistant | “You are Jamie, a friendly meeting assistant” |
| promptMode | String | How to apply custom prompt: base, inject, or replace | inject |
| voiceId | String | Nova Sonic voice ID (see available voices below) | tiffany |
| modelId | String | Bedrock model ID | amazon.nova-sonic-v1:0 |
| endpointingSensitivity | String | Turn-taking sensitivity: HIGH, MEDIUM, or LOW | MEDIUM |
| meetingMode | String | Meeting behavior mode: normal, group, or translator (default: normal) | group |
| translatorLanguageA | String | First language for Translator Mode (default: English) | English |
| translatorLanguageB | String | Second language for Translator Mode (default: Spanish) | Spanish |

Base Mode (Simple Replacement)

  • Uses your custom prompt as-is
  • Falls back to default if no custom prompt provided
  • Use when: You want complete control over the prompt

Inject Mode (Append to Default)

  • Appends your custom instructions to the default prompt
  • Preserves default behavior + adds your customizations
  • Use when: You want to add extra instructions without losing defaults

Replace Mode (Complete Override)

  • Completely replaces the default prompt with yours
  • Ignores default prompt entirely
  • Use when: You need fundamentally different assistant behavior
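
One way to read the three prompt modes is as a small function from (default prompt, custom prompt, mode) to the prompt actually sent. This is a sketch of that reading, not LMA's implementation: the function name is hypothetical, the blank-line separator used for inject is an assumption, and rejecting replace without a custom prompt is also an assumption, since the guide does not say what happens in that case.

```python
def resolve_system_prompt(default_prompt: str, custom_prompt: str = "",
                          prompt_mode: str = "base") -> str:
    """Combine default and custom prompts according to promptMode."""
    custom = custom_prompt.strip()
    if prompt_mode == "base":
        # Custom prompt as-is, falling back to the default when none is set.
        return custom or default_prompt
    if prompt_mode == "inject":
        # Default prompt first, custom instructions appended after it.
        return f"{default_prompt}\n\n{custom}" if custom else default_prompt
    if prompt_mode == "replace":
        # Default prompt is ignored entirely (assumption: custom is required).
        if not custom:
            raise ValueError("replace mode needs a custom systemPrompt")
        return custom
    raise ValueError(f"unknown promptMode: {prompt_mode!r}")


if __name__ == "__main__":
    print(resolve_system_prompt("You are Alex.", "Keep answers short.", "inject"))
```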

The endpointingSensitivity parameter controls how quickly Nova Sonic detects the end of a user’s turn and begins responding. This affects both response latency and the likelihood of interrupting users who are still speaking.

Available Values:

| Sensitivity | Pause Duration | Best For |
| --- | --- | --- |
| HIGH | 1.5 seconds | Quick Q&A, command-and-control, time-sensitive interactions |
| MEDIUM (default) | 1.75 seconds | General conversations, customer service, multi-turn discussions |
| LOW | 2.0 seconds | Thoughtful conversations, elderly or speech-impaired users, complex problem-solving |

How It Works:

  • Nova Sonic waits for the specified pause duration after detecting the end of speech before responding
  • Higher sensitivity = faster responses but more risk of interrupting users who pause while thinking
  • Lower sensitivity = more patient waiting but slightly slower responses

When to Adjust:

  • Use HIGH for fast-paced interactions where users expect immediate responses
  • Use MEDIUM (default) for balanced, natural conversations
  • Use LOW when users need more time to formulate thoughts or have speech patterns with longer pauses
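
The trade-off can be modeled with the pause durations from the table above. The real end-of-turn detection happens inside Nova Sonic, so the sketch below only illustrates how the same silence is treated differently at each setting. The function names are hypothetical.

```python
# Pause durations in seconds, taken from the sensitivity table.
PAUSE_SECONDS = {"HIGH": 1.5, "MEDIUM": 1.75, "LOW": 2.0}


def pause_duration(sensitivity: str = "MEDIUM") -> float:
    """Look up the pause duration for an endpointingSensitivity value."""
    try:
        return PAUSE_SECONDS[sensitivity.upper()]
    except KeyError:
        raise ValueError(f"unknown endpointingSensitivity: {sensitivity!r}")


def turn_has_ended(silence_seconds: float, sensitivity: str = "MEDIUM") -> bool:
    """True once the observed silence reaches the configured pause duration."""
    return silence_seconds >= pause_duration(sensitivity)


if __name__ == "__main__":
    # A 1.6 second thinking pause ends the turn at HIGH but not at LOW.
    for level in ("HIGH", "MEDIUM", "LOW"):
        print(level, turn_has_ended(1.6, level))
```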

The meetingMode field controls how the voice assistant behaves in a meeting. There are three modes:

| Mode | Value | Description |
| --- | --- | --- |
| Normal | normal (default) | Standard assistant: responds to wake phrase or always-active |
| Group Meeting | group | Passive listening; only speaks when “Alex” is mentioned |
| Translator | translator | Real-time bidirectional interpreter between two languages |

Note: Group and translator modes take effect only when VoiceAssistantActivationMode is set to always_active. The Meeting Mode selector in the Nova Sonic Configuration UI is disabled when the deployment uses wake_phrase mode.
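
The dependency between activation mode and meeting mode can be expressed as a small guard. The sketch assumes that a group or translator setting falls back to normal behavior under wake_phrase. The guide only says those modes do not take effect, so the fallback value and the function name are assumptions.

```python
MEETING_MODES = {"normal", "group", "translator"}


def effective_meeting_mode(activation_mode: str,
                           meeting_mode: str = "normal") -> str:
    """Return the meeting mode that would actually apply for a deployment."""
    if meeting_mode not in MEETING_MODES:
        raise ValueError(f"unknown meetingMode: {meeting_mode!r}")
    if meeting_mode != "normal" and activation_mode != "always_active":
        # Assumption: without always_active the assistant behaves as normal.
        return "normal"
    return meeting_mode


if __name__ == "__main__":
    print(effective_meeting_mode("wake_phrase", "group"))
    print(effective_meeting_mode("always_active", "group"))
```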


Group Meeting Mode

Group Meeting Mode enables Nova to listen passively and only respond when directly addressed. This is ideal for multi-participant meetings where you want the assistant available but not interrupting conversations.

How It Works:

  • Nova starts muted (audio output disabled)
  • Listens to all conversation silently
  • Only responds when someone mentions “Alex” in their speech
  • Automatically unmutes before speaking, re-mutes after
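
The mute behavior described above amounts to a small state machine: muted by default, unmuted only for the duration of a response to an utterance that mentions the assistant's name. The class below is a hypothetical sketch of that logic. Whole-word, case-insensitive matching is an assumption about how a mention is recognized.

```python
import re


class GroupModeGate:
    """Tracks mute state for a passive, name-triggered assistant."""

    def __init__(self, name: str = "Alex") -> None:
        self.muted = True  # Group mode starts with audio output disabled.
        self._mention = re.compile(rf"\b{re.escape(name)}\b", re.IGNORECASE)

    def should_respond(self, utterance: str) -> bool:
        """True when the utterance mentions the assistant by name."""
        return bool(self._mention.search(utterance))

    def begin_response(self) -> None:
        """Unmute just before the assistant speaks."""
        self.muted = False

    def end_response(self) -> None:
        """Re-mute once the assistant has finished speaking."""
        self.muted = True


if __name__ == "__main__":
    gate = GroupModeGate()
    for line in ("Let's review the budget.", "Alex, what did we decide last week?"):
        if gate.should_respond(line):
            gate.begin_response()
            print("responding to:", line)
            gate.end_response()
```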

Configuration:

{
  "meetingMode": "group",
  "endpointingSensitivity": "LOW"
}

Benefits:

  • Non-intrusive — Won’t interrupt conversations between participants
  • Always available — Listening and ready when needed
  • Natural interaction — Just say “Alex” to get attention
  • Barge-in support — Can interrupt Nova mid-sentence if needed

When to Use:

  • Multi-participant meetings (3+ people)
  • Team discussions where assistant is optional
  • Scenarios where interruptions would be disruptive

Translator Mode (Real-Time Bidirectional Interpretation)


Translator Mode turns the voice assistant into a live interpreter. It listens to both parties in a meeting and speaks each utterance back in the other language — no human interpreter required.

How It Works:

  • Nova listens continuously (always unmuted, no wake-phrase gate)
  • Detects the language of each utterance
  • Translates and speaks the response in the other language
  • Tools are disabled — the assistant focuses entirely on interpretation
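
The direction of translation follows from the detected language and the two configured languages. In the sketch below, language detection itself is assumed to have happened already (in practice the model handles it). The function name is hypothetical, and returning None for a third language is an assumption.

```python
from typing import Optional


def target_language(detected: str, language_a: str = "English",
                    language_b: str = "Spanish") -> Optional[str]:
    """Return the language to speak back in, or None if neither side matches."""
    spoken = detected.strip().casefold()
    if spoken == language_a.casefold():
        return language_b
    if spoken == language_b.casefold():
        return language_a
    return None  # Assumption: utterances in other languages are not translated.


if __name__ == "__main__":
    print(target_language("english"))
    print(target_language("Spanish"))
```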

Configuration:

{
  "meetingMode": "translator",
  "translatorLanguageA": "English",
  "translatorLanguageB": "Spanish"
}

Supported Languages: Any language supported by Amazon Nova Sonic polyglot voices, including English, Spanish, French, German, Italian, Portuguese, Hindi, Japanese, Korean, Mandarin Chinese, Arabic, and more.

Benefits:

  • No human interpreter needed — AI handles both directions live
  • Natural voices — Nova Sonic polyglot voices sound human
  • Live transcript — Every utterance and translation is captured
  • AI summary — Post-meeting summary in your preferred language

When to Use:

  • Meetings with participants who speak different languages
  • International partner calls, customer support, cross-border negotiations
  • Any scenario where a live interpreter would otherwise be required

Comparison of all modes:

| Feature | Normal | Group Meeting | Translator |
| --- | --- | --- | --- |
| Activation | Wake phrase / always-active | Always-active required | Always-active required |
| Mute behavior | Per activation mode | Starts muted, unmutes on “Alex” | Always unmuted |
| Tools | Full Strands tools | Full Strands tools | Disabled |
| Use case | 1-on-1 assistant | Multi-person meetings | Multilingual meetings |

Amazon Nova Sonic supports 16 different voices:

Polyglot Voices (Multiple Languages):

  • tiffany (default) - Female, warm and professional
  • matthew - Male, clear and authoritative

English Voices:

  • amy - British English, professional
  • olivia - American English, friendly
  • kiara - Indian English, clear

Other Languages:

  • arjun - Hindi
  • ambre - French (France)
  • florian - French (Canada)
  • beatrice - Italian
  • lorenzo - Italian
  • tina - German
  • lennart - German
  • lupe - Spanish (Spain)
  • carlos - Spanish (Latin America)
  • carolina - Portuguese (Brazil)
  • leo - Portuguese (Portugal)
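
For tooling that needs to choose or check a voiceId, the list above can be captured as a lookup table. The data is transcribed from this guide. The helper names and the (language, variant) layout are illustrative choices.

```python
from typing import Dict, List, Tuple

# voice id -> (language, variant), transcribed from the voice list above.
VOICES: Dict[str, Tuple[str, str]] = {
    "tiffany": ("Polyglot", ""),
    "matthew": ("Polyglot", ""),
    "amy": ("English", "British"),
    "olivia": ("English", "American"),
    "kiara": ("English", "Indian"),
    "arjun": ("Hindi", ""),
    "ambre": ("French", "France"),
    "florian": ("French", "Canada"),
    "beatrice": ("Italian", ""),
    "lorenzo": ("Italian", ""),
    "tina": ("German", ""),
    "lennart": ("German", ""),
    "lupe": ("Spanish", "Spain"),
    "carlos": ("Spanish", "Latin America"),
    "carolina": ("Portuguese", "Brazil"),
    "leo": ("Portuguese", "Portugal"),
}


def voices_for(language: str) -> List[str]:
    """Return voice ids for a language, sorted alphabetically."""
    wanted = language.strip().casefold()
    return sorted(v for v, (lang, _) in VOICES.items() if lang.casefold() == wanted)


def is_polyglot(voice_id: str) -> bool:
    """True for the voices listed as polyglot."""
    return VOICES.get(voice_id.lower(), ("", ""))[0] == "Polyglot"


if __name__ == "__main__":
    print(voices_for("French"), is_polyglot("tiffany"))
```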

Example 1: Professional Assistant (Fast Response)

{
  "NovaSonicConfigId": "CustomNovaSonicConfig",
  "systemPrompt": "You are a professional executive assistant. Provide concise, actionable responses. Always confirm understanding before taking action.",
  "promptMode": "replace",
  "voiceId": "matthew",
  "endpointingSensitivity": "HIGH"
}

Example 2: Friendly Team Assistant

{
  "NovaSonicConfigId": "CustomNovaSonicConfig",
  "systemPrompt": "Always end responses with 'Anything else I can help with?' to encourage engagement.",
  "promptMode": "inject",
  "voiceId": "olivia",
  "endpointingSensitivity": "MEDIUM"
}

Example 3: Technical Support (Patient Listening)

{
  "NovaSonicConfigId": "CustomNovaSonicConfig",
  "systemPrompt": "You are a technical support specialist for AWS services. Provide detailed, accurate information. Use technical terminology when appropriate.",
  "promptMode": "replace",
  "voiceId": "tiffany",
  "endpointingSensitivity": "LOW"
}

Example 4: Group Meeting Assistant (Passive Mode)

{
  "NovaSonicConfigId": "CustomNovaSonicConfig",
  "systemPrompt": "You are a helpful meeting assistant. Provide concise, relevant information when asked.",
  "promptMode": "base",
  "voiceId": "tiffany",
  "endpointingSensitivity": "LOW",
  "meetingMode": "group"
}

Example 5: Live Translator (English ↔ Spanish)

{
  "NovaSonicConfigId": "CustomNovaSonicConfig",
  "meetingMode": "translator",
  "translatorLanguageA": "English",
  "translatorLanguageB": "Spanish",
  "voiceId": "tiffany"
}

Example 6: Accessibility-Focused (Maximum Patience)

{
  "NovaSonicConfigId": "CustomNovaSonicConfig",
  "systemPrompt": "You are a patient, supportive assistant. Speak slowly and clearly. Wait for users to finish their thoughts completely.",
  "promptMode": "replace",
  "voiceId": "amy",
  "endpointingSensitivity": "LOW"
}

To apply changes:

  1. Edit the CustomNovaSonicConfig item in DynamoDB
  2. Save your changes
  3. No redeployment needed - changes take effect immediately
  4. Start a new virtual participant to test
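
Because edits take effect immediately, it can help to check a configuration before saving it. The sketch below validates a config against the values documented in this guide and converts it to DynamoDB's low-level attribute-value form ({"S": ...} for strings). The table name is not given in this guide, so writing the item is left out. The helper names are hypothetical.

```python
from typing import Any, Dict, List

ALLOWED_VALUES = {
    "promptMode": {"base", "inject", "replace"},
    "endpointingSensitivity": {"HIGH", "MEDIUM", "LOW"},
    "meetingMode": {"normal", "group", "translator"},
    "voiceId": {
        "tiffany", "matthew", "amy", "olivia", "kiara", "arjun", "ambre",
        "florian", "beatrice", "lorenzo", "tina", "lennart", "lupe",
        "carlos", "carolina", "leo",
    },
}
KNOWN_ATTRIBUTES = {
    "NovaSonicConfigId", "systemPrompt", "promptMode", "voiceId", "modelId",
    "endpointingSensitivity", "meetingMode",
    "translatorLanguageA", "translatorLanguageB",
}


def validate_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems (empty when none are found)."""
    errors = []
    for name, value in config.items():
        if name not in KNOWN_ATTRIBUTES:
            errors.append(f"unknown attribute: {name}")
        elif not isinstance(value, str):
            errors.append(f"{name} must be a string")
        elif name in ALLOWED_VALUES and value not in ALLOWED_VALUES[name]:
            errors.append(f"{name} has unsupported value: {value}")
    return errors


def to_dynamodb_item(config: Dict[str, str]) -> Dict[str, Dict[str, str]]:
    """Encode string attributes in DynamoDB attribute-value format."""
    return {name: {"S": value} for name, value in config.items()}


if __name__ == "__main__":
    config = {"NovaSonicConfigId": "CustomNovaSonicConfig",
              "meetingMode": "group", "endpointingSensitivity": "LOW"}
    print(validate_config(config) or to_dynamodb_item(config))
```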

View logs in CloudWatch:

  1. Go to AWS CloudWatch console
  2. Navigate to Log Groups
  3. Find the log group for your virtual participant
  4. Look for these log messages:
    ✓ AWS Nova Sonic 2 voice assistant enabled
    ✓ Strands agent tool configured
    🎤 Wake phrase detected: "hey alex"
    🔧 Tool call received: strands_agent
    ⏳ Pre-tool acknowledgment: "Let me search for that information..."
    🔄 Async tool execution started
    ✓ Tool result ready, streaming response
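
When scanning exported log lines, checking for these messages in order shows how far a request got. The marker strings below are taken from the sample messages above. Whether the real log text matches them exactly is an assumption, and the helper name is hypothetical.

```python
from typing import Iterable, List

# Substrings of the log messages listed above, in the order they should appear.
EXPECTED_MARKERS = (
    "AWS Nova Sonic 2 voice assistant enabled",
    "Strands agent tool configured",
    "Wake phrase detected",
    "Tool call received: strands_agent",
    "Pre-tool acknowledgment",
    "Async tool execution started",
    "Tool result ready",
)


def missing_markers(log_lines: Iterable[str]) -> List[str]:
    """Return the expected markers that never appear in the given lines."""
    lines = list(log_lines)
    return [marker for marker in EXPECTED_MARKERS
            if not any(marker in line for line in lines)]


if __name__ == "__main__":
    sample = [
        "AWS Nova Sonic 2 voice assistant enabled",
        "Strands agent tool configured",
        'Wake phrase detected: "hey alex"',
    ]
    # The first missing marker points at the stage where processing stopped.
    print(missing_markers(sample))
```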

View metrics in CloudWatch:

  1. Go to AWS CloudWatch console
  2. Navigate to Metrics → Bedrock
  3. View Nova Sonic 2 usage metrics:
    • Invocations
    • Audio duration
    • Tool calls
    • Errors

Issue: Agent Not Responding to Wake Phrase


Cause: Wake phrase detection not working

Solution:

  • Speak clearly: “Hey Alex” (pause) “your question”
  • Check microphone is working in the meeting
  • Try: “Hey Alex, are you there?” as a simple test
  • Verify VoiceAssistantActivationMode=wake_phrase is set
  • Check CloudWatch logs for wake phrase detection

Cause: AWS Nova Sonic 2 model access not enabled

Solution:

  • Go to Bedrock console → Model access
  • Enable AWS Nova Sonic 2 model
  • Wait for access to be granted
  • Verify your region supports Nova Sonic 2

Cause: Strands agent taking too long to respond

Solution:

  • Check Strands Lambda execution time in CloudWatch
  • Verify Lambda has sufficient memory and timeout
  • Check if knowledge base queries are slow
  • Review async tool processing logs

Cause: Audio configuration issue

Solution:

  • Check virtual participant audio setup
  • Verify PulseAudio is running
  • Review paplay logs in CloudWatch
  • Check microphone unmute status

Cause: Session management issue (fixed in v0.2.27)

Solution:

  • Upgrade to LMA v0.2.27 or later
  • Session now stays open during tool execution
  • Async processing prevents blocking

Request flow:

  1. User Speech
  2. Wake Phrase Detection ("Hey Alex")
  3. AWS Nova Sonic 2 Activation
  4. User Query → Nova Sonic 2
  5. Nova Decides to Use strands_agent Tool
  6. Pre-Tool Acknowledgment: "Let me search for that information..."
  7. Async Tool Execution (Non-Blocking)
  8. Invoke Strands Lambda
  9. Lambda Response → Nova Sonic 2
  10. Nova Processes Result
  11. Voice Response to User

Components:

  • AWS Nova Sonic 2: Handles voice conversation, tool decisions, and bidirectional streaming
  • LMA Backend: Manages tool execution and Lambda invocation
  • Strands Lambda: Provides access to meeting history, documents, web search, MCP integrations
  • Virtual Participant: Captures audio and plays responses in meeting
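
The acknowledgment and non-blocking tool steps of this flow can be illustrated with asyncio. This is a generic sketch of the pattern, not LMA's code: handle_tool_call, the stand-in fake_strands_agent, and the speak callback are all hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, List

ACKNOWLEDGMENT = "Let me search for that information. This may take a moment."


async def handle_tool_call(query: str,
                           tool: Callable[[str], Awaitable[str]],
                           speak: Callable[[str], None]) -> "asyncio.Task[str]":
    """Announce the tool call, then start it in the background."""
    speak(ACKNOWLEDGMENT)
    # create_task schedules the tool without waiting for its result.
    return asyncio.create_task(tool(query))


async def demo() -> List[str]:
    spoken: List[str] = []

    async def fake_strands_agent(query: str) -> str:
        await asyncio.sleep(0.01)  # Stand-in for the Lambda round trip.
        return f"Results for: {query}"

    task = await handle_tool_call("action items", fake_strands_agent, spoken.append)
    spoken.append("Still listening...")  # The conversation is not blocked.
    spoken.append(await task)            # Speak the result once it is ready.
    return spoken


if __name__ == "__main__":
    for line in asyncio.run(demo()):
        print(line)
```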

Once configured, the voice assistant can:

Query meeting history:

  • Example: “What were the action items from our last meeting?”
  • Example: “What did we discuss in yesterday’s meeting?”
  • Example: “Find meetings with John from last week”

Search documents:

  • Example: “Search our documents for project requirements”
  • Example: “What does our policy say about remote work?”
  • Example: “Find the latest product roadmap”

Search the web:

  • Example: “What’s the latest news about AI voice agents?”
  • Example: “Search for AWS Nova Sonic 2 documentation”
  • Example: “What are the best practices for MCP integrations?”

Use MCP integrations:

  • Example: “Look up the Amazon account in Salesforce”
  • Example: “What’s the contract status for Acme Corp?”
  • Example: “Show me recent opportunities”
  • Example: “Check Slack for recent messages in the team channel”
  • Example: “What did Bob say about the demo?”
  • Example: “Any urgent messages in Slack?”
  • Example: “Check Jira for open tickets”
  • Example: “Look up GitHub issues”
  • Example: “Query our internal database”

Data privacy:

  • All processing happens within AWS
  • No third-party API keys to manage
  • No data leaves your AWS environment
  • IAM-based access control

Authentication and sessions:

  • AWS handles authentication automatically
  • Bedrock sessions are encrypted
  • Connections close automatically after inactivity
  • No persistent storage of conversation data

Access control:

  • Lambda execution role has minimal required permissions
  • Strands Lambda access controlled by IAM
  • Virtual participant isolated per meeting
  • No cross-meeting data access

Wake phrase mode:

  • Recommended for most use cases
  • Only connects when activated
  • Automatically disconnects after timeout
  • Saves ~90% of connection costs

Always active mode:

  • Use only when immediate response is critical
  • Continuous Bedrock connection
  • Higher costs
  • Consider for high-priority meetings only

Async tool processing:

  • Tools execute in background (v0.2.27+)
  • Nova stays responsive during tool execution
  • Better user experience
  • No additional cost

Cost monitoring:

  • Pay only for audio duration
  • Monitor usage in CloudWatch metrics
  • Set budget alerts in AWS Budgets
  • Consider wake phrase mode for cost savings
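
The connection-time saving from wake phrase mode depends on how often the assistant is activated. The arithmetic below is a simplified model with hypothetical numbers: it assumes each activation keeps the connection open for exactly the activation duration, and it ignores overlapping activations and per-request pricing details.

```python
def connection_savings(meeting_seconds: float, activations: int,
                       activation_duration: float = 30.0) -> float:
    """Fraction of connection time avoided compared with always-active mode."""
    if meeting_seconds <= 0:
        raise ValueError("meeting_seconds must be positive")
    # Connected time cannot exceed the length of the meeting itself.
    connected = min(meeting_seconds, activations * activation_duration)
    return 1 - connected / meeting_seconds


if __name__ == "__main__":
    # Hypothetical: a 60-minute meeting with 12 activations of 30 seconds each.
    print(f"{connection_savings(3600, 12):.0%} of connection time avoided")
```

Under these assumptions, 12 activations in an hour leave the connection open for 6 of 60 minutes, which is in line with the roughly 90% figure quoted above. More frequent activations reduce the saving.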

Set multiple wake phrases:

VoiceAssistantWakePhrases="hey alex,ok alex,hi alex,hello alex"

Set longer activation for complex queries:

VoiceAssistantActivationDuration=60 # 60 seconds

Session management:

  • Session stays open during tool use and audio playback
  • No more premature disconnections
  • Smoother conversation flow

Async tool processing:

  • Tools execute in background without blocking
  • Nova remains responsive during tool execution
  • Better user experience for complex queries

Pre-tool acknowledgment:

  • Nova announces: “Let me search for that information. This may take a moment.”
  • Sets proper user expectations
  • Confirmation-based prompting strategy

Comparison: AWS Nova Sonic 2 vs ElevenLabs


| Feature | AWS Nova Sonic 2 | ElevenLabs |
| --- | --- | --- |
| Setup Complexity | Simple (2 parameters) | Moderate (API key, agent config) |
| External Dependencies | None | ElevenLabs account required |
| API Keys | Not required | Required |
| Data Location | Stays in AWS | Sent to ElevenLabs |
| Tool Use | Native Bedrock tool use | Client-side tool calling |
| Async Processing | Yes (v0.2.27+) | Depends on configuration |
| Cost Model | AWS Bedrock pricing | ElevenLabs pricing |
| Voice Quality | High quality | Very high quality |
| Customization | System prompts & 16 voice IDs via DynamoDB | Extensive via API |

For issues specific to:

  • LMA Voice Assistant: Check LMA documentation and CloudWatch logs
  • AWS Nova Sonic 2: Check AWS Bedrock documentation
  • Strands Agent: Check Strands Lambda logs in CloudWatch
  • Tool Configuration: Check LMA GitHub repository

What You Get:

  • Real-time voice assistant in meetings
  • Access to meeting history and documents
  • Web search and MCP integrations
  • Natural conversation with AI
  • Automatic tool invocation
  • Async tool processing (v0.2.27+)

What You Need:

  • AWS account with Bedrock access
  • AWS Nova Sonic 2 model enabled
  • 2 CloudFormation parameters (amazon_nova_sonic + activation mode)
  • No external API keys

Key Features:

  • Wake phrase activation (“Hey Alex”)
  • Native AWS Bedrock tool use
  • Automatic Lambda invocation
  • Secure (all in AWS)
  • Cost-optimized with wake phrase mode
  • Pre-tool acknowledgment
  • Async tool execution
  • Customizable prompts and voices via DynamoDB (no code changes needed)

Customization:

  • 3 prompt modes (base, inject, replace)
  • 16 voice IDs to choose from
  • 3 turn-taking sensitivity levels (HIGH, MEDIUM, LOW)
  • 3 meeting modes: Normal, Group Meeting (passive), Translator (live interpreter)
  • Barge-in support (interrupt Nova mid-sentence)
  • Edit via the Nova Sonic Configuration page in the LMA UI or directly in DynamoDB
  • Changes apply immediately (no redeployment)

That’s it! Your meetings now have an AI voice assistant powered by AWS Nova Sonic 2 — from active 1-on-1 conversations, to passive group meeting support, to real-time multilingual interpretation.