# FAQ

## Deployment

### Does deployment incur costs?

Yes, costs are incurred based on AWS resource usage. The main billable resources are:
| Resource | Description |
|---|---|
| NAT Gateway | VPC external communication (hourly + data transfer) |
| ECS Fargate | FastAPI backend container (vCPU + memory) |
| ElastiCache Redis | WebSocket connection management |
| S3 / S3 Express One Zone | Document storage, vector DB, sessions, artifacts |
| SageMaker Endpoint | PaddleOCR (ml.g5.xlarge, scales up only when in use) |
| Bedrock | Per-invocation billing (input/output tokens) |
| Step Functions | Per-workflow execution state transition billing |
| DynamoDB | Read/write capacity units |
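As a back-of-the-envelope sanity check, the always-on resources (NAT Gateway, Fargate, ElastiCache) dominate the baseline bill, while Bedrock, Step Functions, and SageMaker scale with usage. The sketch below estimates the idle baseline; the hourly rates are illustrative assumptions, not current AWS pricing — check the AWS pricing pages for your region.

```python
# Rough monthly baseline for the always-on pieces. Rates are illustrative
# assumptions only (data transfer and usage-based services not modeled).
HOURS_PER_MONTH = 730

assumed_hourly = {
    "nat_gateway": 0.045,        # per hour, plus data transfer
    "ecs_fargate": 0.049,        # ~1 vCPU + 2 GB memory
    "elasticache_redis": 0.017,  # small cache node class
}

monthly = {name: rate * HOURS_PER_MONTH for name, rate in assumed_hourly.items()}
total = sum(monthly.values())

for name, cost in monthly.items():
    print(f"{name}: ~${cost:.2f}/month")
print(f"baseline total: ~${total:.2f}/month")
```

The usage-based services (Bedrock tokens, Step Functions transitions, SageMaker on-demand instances) come on top of this baseline.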
### Deployment failed. What should I do?

Refer to the Quick Deploy Guide's Troubleshooting section. You can check the failure cause through the CodeBuild logs:

```shell
aws logs tail /aws/codebuild/sample-aws-idp-pipeline-deploy --since 10m
```

## Infrastructure
Section titled “Infrastructure”How do I keep the SageMaker endpoint always running?
Section titled “How do I keep the SageMaker endpoint always running?”The default setting is auto-scaling 0→1, where instances scale down to 0 after 10 minutes of inactivity. To keep it always running, change the minimum instance count.
Change via AWS Console:
- Go to SageMaker Console > Inference > Endpoints and select the endpoint
- In the Endpoint runtime settings tab, select the variant and click Update scaling policy
- Change Minimum instance count to
1
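The same change can be made programmatically through Application Auto Scaling. The sketch below only builds the request parameters; the endpoint and variant names are placeholders, and you would pass the dict to `boto3.client("application-autoscaling").register_scalable_target(**params)` from an environment with AWS credentials.

```python
# Sketch: Application Auto Scaling parameters that pin the SageMaker variant
# to a minimum of 1 instance. Names below are placeholders -- substitute your
# own endpoint/variant, then call register_scalable_target(**params) via boto3.
endpoint_name = "my-paddleocr-endpoint"  # hypothetical
variant_name = "AllTraffic"              # commonly the default variant name

params = {
    "ServiceNamespace": "sagemaker",
    "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
    "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    "MinCapacity": 1,  # was 0: instances could scale down to zero
    "MaxCapacity": 1,
}
print(params["ResourceId"])
```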
### How do I change the AI models used for analysis?

Workflow analysis models are managed in `packages/infra/src/models.json`.

```json
{
  "analysis": "global.anthropic.claude-sonnet-4-6",
  "summarizer": "global.anthropic.claude-haiku-4-5-20251001-v1:0",
  "embedding": "amazon.nova-2-multimodal-embeddings-v1:0",
  "videoAnalysis": "us.twelvelabs.pegasus-1-2-v1:0"
}
```

| Key | Purpose | Lambda Environment Variable |
|---|---|---|
| analysis | Segment analysis, Q&A regeneration | BEDROCK_MODEL_ID |
| summarizer | Document summarization | SUMMARIZER_MODEL_ID |
| embedding | Vector embedding | EMBEDDING_MODEL_ID |
| videoAnalysis | Video analysis | BEDROCK_VIDEO_MODEL_ID |
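The mapping from models.json keys to Lambda environment variables can be sketched as follows, using the sample model IDs from above (the mapping dict itself mirrors the table; it is illustrative, not the project's actual deployment code):

```python
import json

# models.json content from the sample above
models_json = """{
  "analysis": "global.anthropic.claude-sonnet-4-6",
  "summarizer": "global.anthropic.claude-haiku-4-5-20251001-v1:0",
  "embedding": "amazon.nova-2-multimodal-embeddings-v1:0",
  "videoAnalysis": "us.twelvelabs.pegasus-1-2-v1:0"
}"""

# Key -> Lambda environment variable, per the table above
ENV_VAR_FOR_KEY = {
    "analysis": "BEDROCK_MODEL_ID",
    "summarizer": "SUMMARIZER_MODEL_ID",
    "embedding": "EMBEDDING_MODEL_ID",
    "videoAnalysis": "BEDROCK_VIDEO_MODEL_ID",
}

models = json.loads(models_json)
env_vars = {ENV_VAR_FOR_KEY[key]: model_id for key, model_id in models.items()}
for name, value in env_vars.items():
    print(f"{name}={value}")
```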
**Method 1: Edit models.json and redeploy (Recommended)**

```shell
# After editing models.json
pnpm nx deploy @idp-v2/infra
```

**Method 2: Directly modify Lambda environment variables**

To change models immediately without redeployment, modify the environment variables in the Lambda Console:

- Go to Lambda Console and select the function (e.g., `IDP-V2-*-SegmentAnalyzer`)
- Go to Configuration > Environment variables > Edit
- Modify the environment variable value and click Save
## Document Processing

### What file formats are supported?

Documents (PDF, DOC, TXT), images (PNG, JPG, GIF, TIFF), videos (MP4, MOV, AVI), and audio files (MP3, WAV, FLAC) up to 500MB are supported.
| File Type | Supported Formats | Preprocessing |
|---|---|---|
| Document | PDF, DOC, TXT | PaddleOCR + BDA (optional) + PDF text extraction |
| Image | PNG, JPG, GIF, TIFF | PaddleOCR + BDA (optional) |
| Video | MP4, MOV, AVI | AWS Transcribe + BDA (optional) |
| Audio | MP3, WAV, FLAC | AWS Transcribe |
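The routing in the table above can be sketched as a simple dispatch on file extension and size. This is an illustrative stand-in, not the project's actual ingestion code:

```python
from pathlib import Path

# Preprocessing route per file type, per the table above (illustrative).
ROUTES = {
    "document": ({".pdf", ".doc", ".txt"}, "PaddleOCR + BDA (optional) + PDF text extraction"),
    "image": ({".png", ".jpg", ".gif", ".tiff"}, "PaddleOCR + BDA (optional)"),
    "video": ({".mp4", ".mov", ".avi"}, "AWS Transcribe + BDA (optional)"),
    "audio": ({".mp3", ".wav", ".flac"}, "AWS Transcribe"),
}
MAX_SIZE_BYTES = 500 * 1024 * 1024  # 500MB upload limit

def route(filename: str, size_bytes: int) -> tuple[str, str]:
    """Return (file_type, preprocessing_pipeline) for a supported upload."""
    if size_bytes > MAX_SIZE_BYTES:
        raise ValueError("file exceeds the 500MB limit")
    ext = Path(filename).suffix.lower()
    for file_type, (extensions, pipeline) in ROUTES.items():
        if ext in extensions:
            return file_type, pipeline
    raise ValueError(f"unsupported format: {ext}")

print(route("report.pdf", 10_000_000))
```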
### Can it handle large documents (thousands of pages)?

Yes. Large documents are supported through segment-based processing with Step Functions + DynamoDB. DynamoDB is used as intermediate storage to bypass the Step Functions payload limit (256KB), and Distributed Map processes up to 30 segments concurrently.
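The reference-passing pattern behind this can be sketched as follows: payloads over the 256KB limit are written to a table, and only a small reference travels between Step Functions states. The in-memory dict stands in for DynamoDB; names and shapes are illustrative.

```python
import json

# Step Functions caps state payloads at 256KB, so large segment results are
# offloaded to DynamoDB and only a small reference is passed between states.
# Illustrative sketch with an in-memory stand-in for the DynamoDB table.
PAYLOAD_LIMIT_BYTES = 256 * 1024

fake_dynamodb: dict[str, str] = {}  # stand-in for a DynamoDB table

def pass_between_states(segment_id: str, result: dict) -> dict:
    body = json.dumps(result)
    if len(body.encode()) <= PAYLOAD_LIMIT_BYTES:
        return result  # small enough to pass inline
    fake_dynamodb[segment_id] = body                   # offload the big payload
    return {"segment_id": segment_id, "stored": True}  # tiny reference only

big_result = {"segment_id": "seg-1", "text": "x" * 300_000}
ref = pass_between_states("seg-1", big_result)
print(ref)
```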
### What OCR engines are used? What are the differences?

| OCR Engine | Description |
|---|---|
| PaddleOCR | Open-source OCR running on SageMaker. Supports 80+ languages. Optimized for text extraction |
| Bedrock Data Automation (BDA) | AWS managed service. Analyzes document structure (tables, forms, etc.) together. Selectable in project settings |
For details, see PaddleOCR on SageMaker.
### How are video/audio files analyzed?

- AWS Transcribe converts speech to text
- For videos, TwelveLabs Pegasus 1.2 analyzes visual content
- Transcription + visual analysis results are combined to generate segments
- The ReAct Agent performs deep analysis on each segment
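Step 3, combining the transcript with visual analysis into time-aligned segments, can be sketched like this. The field names and the simple one-to-one time alignment are illustrative assumptions, not the project's actual merge logic:

```python
# Illustrative: merge Transcribe output with visual analysis into
# time-aligned segments (assumes both lists share the same windows).
transcript = [
    {"start": 0, "end": 30, "text": "Welcome to the demo."},
    {"start": 30, "end": 60, "text": "Here is the dashboard."},
]
visual = [
    {"start": 0, "end": 30, "summary": "Title slide"},
    {"start": 30, "end": 60, "summary": "Dashboard screenshot"},
]

segments = [
    {"start": t["start"], "end": t["end"],
     "content": f'{t["text"]} [visual: {v["summary"]}]'}
    for t, v in zip(transcript, visual)
]
for seg in segments:
    print(seg)
```

Each combined segment is then handed to the ReAct Agent for deep analysis.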
## AI Analysis

### What if the analysis results are inaccurate?

You can correct results at multiple levels:
- Q&A Regeneration: Regenerate Q&A for specific segments with custom instructions
- Q&A Add/Delete: Manually add or delete individual Q&A items
- Full Reanalysis: Reanalyze the entire document with new instructions
### Can I customize the document analysis prompt?

Yes. You can modify the document analysis prompt in the project settings. This prompt is used by the ReAct Agent when analyzing segments. Customizing it for your project's domain or analysis purpose will yield more accurate results.
### What AI models are used?

| Model | Purpose |
|---|---|
| Claude Sonnet 4.6 | Segment analysis (Vision ReAct Agent), AI chat |
| Claude Haiku 4.5 | Document summarization |
| Amazon Nova 2 Multimodal Embeddings v1 | Vector embedding (1024d) |
| TwelveLabs Pegasus 1.2 | Video analysis |
| Cohere Rerank v3.5 | Search result reranking |
## AI Chat

### Does the chat answer based on document content?

Yes. The AI Agent automatically searches documents uploaded to the project through MCP tools. It performs hybrid search combining vector search and full-text search (FTS), reranks the results with Cohere Rerank, and generates answers based on the most relevant content.
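The fusion step of a hybrid search can be sketched with reciprocal rank fusion (RRF), a common way to merge ranked lists before reranking. Whether this project uses RRF specifically is an assumption; the doc IDs are placeholders:

```python
# Illustrative RRF merge of vector-search and full-text-search result lists.
def rrf_merge(vector_hits: list[str], fts_hits: list[str], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for hits in (vector_hits, fts_hits):
        for rank, doc_id in enumerate(hits):
            # Documents appearing high in either list accumulate more score.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

merged = rrf_merge(["doc-a", "doc-b", "doc-c"], ["doc-b", "doc-d"])
print(merged)  # doc-b appears in both lists, so it ranks first
```

The merged list would then be passed to Cohere Rerank before answer generation.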
### What are custom agents?

You can create customized agents with project-specific system prompts. For example, you can create agents dedicated to legal document analysis, technical document summarization, and so on. You can also switch between agents during a conversation.
### What tools can the agent use?

| Tool | Description |
|---|---|
| search_documents | Hybrid search across project documents |
| save/load/edit_markdown | Create and edit markdown files |
| create_pdf, extract_pdf_text/tables | PDF creation and text/table extraction |
| create_docx, extract_docx_text/tables | Word document creation and text/table extraction |
| generate_image | AI image generation |
| code_interpreter | Python code execution |
### Can I attach images or documents to the chat?

Yes. You can attach images or documents to the chat input for multimodal input. The AI Agent will analyze the attached file content and respond accordingly.
## Security

### How is authentication handled?

Amazon Cognito OIDC authentication is used. When you log in through Cognito on the frontend, a JWT token is issued and automatically included in backend API calls. MCP tool invocations use IAM SigV4 authentication.
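How the issued JWT travels with an API call can be sketched as a bearer header. The token value is a truncated placeholder; in the real app the frontend attaches this automatically:

```python
# Sketch: attaching the Cognito-issued JWT to a backend API call.
jwt_token = "eyJhbGciOi..."  # placeholder; issued by Cognito after login

def build_request_headers(token: str) -> dict[str, str]:
    return {
        "Authorization": f"Bearer {token}",  # verified by the backend
        "Content-Type": "application/json",
    }

headers = build_request_headers(jwt_token)
print(headers["Authorization"][:10])
```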
### Where is data stored?

| Data | Storage |
|---|---|
| Original files, segment images | Amazon S3 |
| Vector embeddings, search indices | LanceDB (S3 Express One Zone) |
| Project/workflow metadata | Amazon DynamoDB |
| Chat sessions, agent prompts, artifacts | Amazon S3 |
| WebSocket connection info | Amazon ElastiCache Redis |
### Can I directly access LanceDB data?

LanceDB is stored on S3 Express One Zone, which makes direct access difficult. Instead, you can query it via Lambda, for example from CloudShell.

List tables:

```shell
aws lambda invoke --function-name idp-v2-lancedb-service \
  --payload '{"action": "list_tables", "params": {}}' \
  --cli-binary-format raw-in-base64-out \
  /dev/stdout 2>/dev/null | jq .
```

Count records for a specific project:

```shell
aws lambda invoke --function-name idp-v2-lancedb-service \
  --payload '{"action": "count", "params": {"project_id": "YOUR_PROJECT_ID"}}' \
  --cli-binary-format raw-in-base64-out \
  /dev/stdout 2>/dev/null | jq .
```

Query segments for a specific workflow:

```shell
aws lambda invoke --function-name idp-v2-lancedb-service \
  --payload '{"action": "get_segments", "params": {"project_id": "YOUR_PROJECT_ID", "workflow_id": "YOUR_WORKFLOW_ID"}}' \
  --cli-binary-format raw-in-base64-out \
  /dev/stdout 2>/dev/null | jq .
```

Search (hybrid: vector + keyword):

```shell
aws lambda invoke --function-name idp-v2-lancedb-service \
  --payload '{"action": "search", "params": {"project_id": "YOUR_PROJECT_ID", "query": "search query", "limit": 5}}' \
  --cli-binary-format raw-in-base64-out \
  /dev/stdout 2>/dev/null | jq .
```
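The same calls can be issued from Python. This sketch only builds the invoke parameters; to actually run a query you would pass them to `boto3.client("lambda").invoke(**params)` from an environment with AWS credentials:

```python
import json

# Build the Lambda invoke parameters for the LanceDB service function.
# The payload shape mirrors the CLI examples above.
def lancedb_invoke_params(action: str, **action_params) -> dict:
    return {
        "FunctionName": "idp-v2-lancedb-service",
        "Payload": json.dumps({"action": action, "params": action_params}),
    }

params = lancedb_invoke_params("count", project_id="YOUR_PROJECT_ID")
print(params["Payload"])
```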