FAQ

Does running this project incur AWS costs?

Yes, costs are incurred based on AWS resource usage. The main billable resources are:

| Resource | Description |
| --- | --- |
| NAT Gateway | VPC external communication (hourly + data transfer) |
| ECS Fargate | FastAPI backend container (vCPU + memory) |
| ElastiCache Redis | WebSocket connection management |
| S3 / S3 Express One Zone | Document storage, vector DB, sessions, artifacts |
| SageMaker Endpoint | PaddleOCR (ml.g5.xlarge, scales up only when in use) |
| Bedrock | Per-invocation billing (input/output tokens) |
| Step Functions | Per-workflow execution state transition billing |
| DynamoDB | Read/write capacity units |
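To see what these resources are actually costing, you can query Cost Explorer from the CLI. This is a sketch: it assumes Cost Explorer is enabled for the account, and the billing period dates are placeholders.

```shell
# Monthly cost broken down by service (adjust the period to your billing month)
aws ce get-cost-and-usage \
  --time-period Start=2025-01-01,End=2025-02-01 \
  --granularity MONTHLY \
  --metrics UnblendedCost \
  --group-by Type=DIMENSION,Key=SERVICE
```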

What should I do if deployment fails?

Refer to the Quick Deploy Guide - Troubleshooting section. You can find the cause of the failure in the CodeBuild logs:

```shell
aws logs tail /aws/codebuild/sample-aws-idp-pipeline-deploy --since 10m
```

How do I keep the SageMaker endpoint always running?


The default setting is auto-scaling 0→1, where instances scale down to 0 after 10 minutes of inactivity. To keep it always running, change the minimum instance count.

Change via AWS Console:

  1. Go to SageMaker Console > Inference > Endpoints and select the endpoint
  2. In the Endpoint runtime settings tab, select the variant and click Update scaling policy
  3. Change Minimum instance count to 1
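The same change can be scripted through the Application Auto Scaling API. This is a sketch: the endpoint and variant names below are placeholders, so check yours in the SageMaker console first.

```shell
# Keep at least one instance warm at all times
# (endpoint/variant names are placeholders -- substitute your own)
aws application-autoscaling register-scalable-target \
  --service-namespace sagemaker \
  --resource-id endpoint/YOUR_ENDPOINT_NAME/variant/AllTraffic \
  --scalable-dimension sagemaker:variant:DesiredInstanceCount \
  --min-capacity 1 \
  --max-capacity 1
```

Keep in mind that a continuously running ml.g5.xlarge instance accrues hourly charges even when idle.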

How do I change the AI models used for analysis?


Workflow analysis models are managed in packages/infra/src/models.json.

```json
{
  "analysis": "global.anthropic.claude-sonnet-4-6",
  "summarizer": "global.anthropic.claude-haiku-4-5-20251001-v1:0",
  "embedding": "amazon.nova-2-multimodal-embeddings-v1:0",
  "videoAnalysis": "us.twelvelabs.pegasus-1-2-v1:0"
}
```

| Key | Purpose | Lambda Environment Variable |
| --- | --- | --- |
| analysis | Segment analysis, Q&A regeneration | BEDROCK_MODEL_ID |
| summarizer | Document summarization | SUMMARIZER_MODEL_ID |
| embedding | Vector embedding | EMBEDDING_MODEL_ID |
| videoAnalysis | Video analysis | BEDROCK_VIDEO_MODEL_ID |

Method 1: Edit models.json and redeploy (Recommended)

```shell
# After editing models.json
pnpm nx deploy @idp-v2/infra
```

Method 2: Directly modify Lambda environment variables

To change immediately without redeployment, modify environment variables in the Lambda Console.

  1. Go to Lambda Console > Select the function (e.g., IDP-V2-*-SegmentAnalyzer)
  2. Configuration > Environment variables > Edit
  3. Modify the environment variable value and click Save
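The same edit can also be made from the CLI. This is a sketch: the function name and model ID below are placeholders. Note that `--environment` replaces the entire variable map, so inspect the current variables first and resubmit all of them with your change.

```shell
# 1. Inspect the current environment variables (function name is a placeholder)
aws lambda get-function-configuration \
  --function-name IDP-V2-dev-SegmentAnalyzer \
  --query 'Environment.Variables'

# 2. Resubmit the FULL map with the changed value -- this call replaces
#    every variable, not just the one listed here
aws lambda update-function-configuration \
  --function-name IDP-V2-dev-SegmentAnalyzer \
  --environment "Variables={BEDROCK_MODEL_ID=NEW_MODEL_ID}"
```

Changes made this way are overwritten the next time the stack is deployed, so treat this as a temporary override.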

What file formats are supported?

Documents (PDF, DOC, TXT), images (PNG, JPG, GIF, TIFF), videos (MP4, MOV, AVI), and audio files (MP3, WAV, FLAC) up to 500MB are supported.

| File Type | Supported Formats | Preprocessing |
| --- | --- | --- |
| Document | PDF, DOC, TXT | PaddleOCR + BDA (optional) + PDF text extraction |
| Image | PNG, JPG, GIF, TIFF | PaddleOCR + BDA (optional) |
| Video | MP4, MOV, AVI | AWS Transcribe + BDA (optional) |
| Audio | MP3, WAV, FLAC | AWS Transcribe |

Can it handle large documents (thousands of pages)?


Yes. Large documents are supported through segment-based processing with Step Functions + DynamoDB. DynamoDB is used as intermediate storage to bypass the Step Functions payload limit (256KB), and Distributed Map processes up to 30 segments simultaneously.
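A Distributed Map state of this shape could look roughly like the following ASL fragment. This is illustrative only: the state and function names are made up, not taken from the project.

```json
{
  "ProcessSegments": {
    "Type": "Map",
    "MaxConcurrency": 30,
    "ItemProcessor": {
      "ProcessorConfig": { "Mode": "DISTRIBUTED", "ExecutionType": "STANDARD" },
      "StartAt": "AnalyzeSegment",
      "States": {
        "AnalyzeSegment": {
          "Type": "Task",
          "Resource": "arn:aws:states:::lambda:invoke",
          "Parameters": {
            "FunctionName": "SegmentAnalyzer",
            "Payload.$": "$"
          },
          "End": true
        }
      }
    },
    "End": true
  }
}
```

Each iteration receives only a segment reference; the segment content itself is read from and written to DynamoDB, which keeps every state payload under the 256KB limit.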

What OCR engines are used? What are the differences?

| OCR Engine | Description |
| --- | --- |
| PaddleOCR | Open-source OCR running on SageMaker. Supports 80+ languages. Optimized for text extraction. |
| Bedrock Data Automation (BDA) | AWS managed service. Also analyzes document structure (tables, forms, etc.). Selectable in project settings. |

For details, see PaddleOCR on SageMaker.

How are videos and audio files analyzed?

  1. AWS Transcribe converts speech to text
  2. For videos, TwelveLabs Pegasus 1.2 analyzes visual content
  3. Transcription + visual analysis results are combined to generate segments
  4. The ReAct Agent performs deep analysis on each segment

What if the analysis results are inaccurate?


You can correct results at multiple levels:

  • Q&A Regeneration: Regenerate Q&A for specific segments with custom instructions
  • Q&A Add/Delete: Manually add or delete individual Q&A items
  • Full Reanalysis: Reanalyze the entire document with new instructions

Can I customize the document analysis prompt?


Yes. You can modify the document analysis prompt in the project settings. This prompt is used by the ReAct Agent when analyzing segments. Customizing it for your project’s domain or analysis purpose will yield more accurate results.

Which AI models does the system use?

| Model | Purpose |
| --- | --- |
| Claude Sonnet 4.6 | Segment analysis (Vision ReAct Agent), AI chat |
| Claude Haiku 4.5 | Document summarization |
| Amazon Nova 2 Multimodal Embeddings | Vector embedding (1024d) |
| TwelveLabs Pegasus 1.2 | Video analysis |
| Cohere Rerank v3.5 | Search result reranking |

Does the chat answer based on document content?


Yes. The AI Agent automatically searches documents uploaded to the project through MCP tools. It performs hybrid search combining vector search and full-text search (FTS), reranks results with Cohere Rerank, and generates answers based on the most relevant content.

Can I create custom agents?

You can create customized agents with project-specific system prompts. For example, you can create agents dedicated to legal document analysis, technical document summarization, etc. You can also switch between agents during a conversation.

What tools can the chat agent use?

| Tool | Description |
| --- | --- |
| search_documents | Hybrid search across project documents |
| save/load/edit_markdown | Create and edit markdown files |
| create_pdf, extract_pdf_text/tables | PDF creation and text/table extraction |
| create_docx, extract_docx_text/tables | Word document creation and text/table extraction |
| generate_image | AI image generation |
| code_interpreter | Python code execution |

Can I attach images or documents to the chat?


Yes. You can attach images or documents to the chat input for multimodal input. The AI Agent will analyze the attached file content and respond accordingly.


How does authentication work?

Amazon Cognito OIDC authentication is used. When you log in through Cognito on the frontend, a JWT token is issued and automatically included in backend API calls. MCP tool invocations use IAM SigV4 authentication.
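A backend call then looks roughly like this. This is a sketch: the API URL and path are placeholders, and `$ID_TOKEN` stands for the JWT issued by Cognito at login.

```shell
# The Cognito-issued JWT goes in the Authorization header
# (URL and path are hypothetical, not the project's actual routes)
curl -H "Authorization: Bearer $ID_TOKEN" \
  "https://YOUR_API_ENDPOINT/api/projects"
```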

Where is data stored?

| Data | Storage |
| --- | --- |
| Original files, segment images | Amazon S3 |
| Vector embeddings, search indices | LanceDB (S3 Express One Zone) |
| Project/workflow metadata | Amazon DynamoDB |
| Chat sessions, agent prompts, artifacts | Amazon S3 |
| WebSocket connection info | Amazon ElastiCache Redis |

How can I access the LanceDB data directly?

LanceDB is stored on S3 Express One Zone, so it cannot be browsed directly. Instead, you can query it via a Lambda function from CloudShell.

List tables

```shell
aws lambda invoke --function-name idp-v2-lancedb-service \
  --payload '{"action": "list_tables", "params": {}}' \
  --cli-binary-format raw-in-base64-out \
  /dev/stdout 2>/dev/null | jq .
```

Count records for a specific project

```shell
aws lambda invoke --function-name idp-v2-lancedb-service \
  --payload '{"action": "count", "params": {"project_id": "YOUR_PROJECT_ID"}}' \
  --cli-binary-format raw-in-base64-out \
  /dev/stdout 2>/dev/null | jq .
```

Query segments for a specific workflow

```shell
aws lambda invoke --function-name idp-v2-lancedb-service \
  --payload '{"action": "get_segments", "params": {"project_id": "YOUR_PROJECT_ID", "workflow_id": "YOUR_WORKFLOW_ID"}}' \
  --cli-binary-format raw-in-base64-out \
  /dev/stdout 2>/dev/null | jq .
```

Search (hybrid: vector + keyword)

```shell
aws lambda invoke --function-name idp-v2-lancedb-service \
  --payload '{"action": "search", "params": {"project_id": "YOUR_PROJECT_ID", "query": "search query", "limit": 5}}' \
  --cli-binary-format raw-in-base64-out \
  /dev/stdout 2>/dev/null | jq .
```