
Features

Manage documents and analysis results on a per-project basis.

  • Create / edit / delete projects
  • Language settings (Korean, English, Japanese)
  • Custom document analysis prompts
  • Per-project color themes

Upload files in a range of formats; the system automatically detects each file's type and routes it to the appropriate preprocessing pipeline. Supported inputs are documents (PDF, DOC, TXT), images (PNG, JPG, GIF, TIFF), videos (MP4, MOV, AVI), and audio files (MP3, WAV, FLAC), with a file size limit of 500 MB.

  • Drag-and-drop / multi-file upload
  • Automatic file type detection and pipeline routing
  • PaddleOCR (SageMaker) text extraction
  • Bedrock Data Automation document structure analysis (optional)
  • PDF text layer extraction (pypdf)
  • Audio/video transcription (AWS Transcribe)

For detailed preprocessing flows by file type, see Preprocessing Pipeline.

File Type | Supported Formats   | Preprocessing
----------|---------------------|-------------------------------------------------
Documents | PDF, DOC, TXT       | PaddleOCR + BDA (optional) + PDF text extraction
Images    | PNG, JPG, GIF, TIFF | PaddleOCR + BDA (optional)
Videos    | MP4, MOV, AVI       | AWS Transcribe + BDA (optional)
Audio     | MP3, WAV, FLAC      | AWS Transcribe
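
The detection-and-routing step can be illustrated with a minimal Python sketch. It assumes detection is done by file extension alone (the real system may inspect MIME types or file contents), and the step identifiers and the `route_upload` function are hypothetical names chosen for illustration. The extension sets and step lists mirror the table above.

```python
from pathlib import Path

# Extension sets and step lists mirror the table above; the step identifiers
# are hypothetical labels, not names taken from the real system.
PIPELINES = {
    "document": ({".pdf", ".doc", ".txt"}, ["paddleocr", "bda_optional", "pdf_text_extraction"]),
    "image": ({".png", ".jpg", ".gif", ".tiff"}, ["paddleocr", "bda_optional"]),
    "video": ({".mp4", ".mov", ".avi"}, ["transcribe", "bda_optional"]),
    "audio": ({".mp3", ".wav", ".flac"}, ["transcribe"]),
}

MAX_UPLOAD_BYTES = 500 * 1024 * 1024  # assumption: "500MB" is read as 500 MiB


def route_upload(filename: str, size_bytes: int) -> tuple[str, list[str]]:
    """Return (file_type, preprocessing_steps) for an uploaded file."""
    if size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError(f"{filename} exceeds the 500 MB upload limit")
    ext = Path(filename).suffix.lower()  # case-insensitive extension match
    for file_type, (extensions, steps) in PIPELINES.items():
        if ext in extensions:
            return file_type, list(steps)
    raise ValueError(f"unsupported file type: {ext or filename}")
```

Keeping the mapping in a single table-like structure means a new format only needs one new entry, which matches how the formats are documented here.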

Uploaded documents are automatically analyzed through a Step Functions workflow. A Strands SDK-based ReAct Agent performs in-depth analysis on each segment using an iterative question-answer approach.

  • Per-segment deep analysis (Claude Sonnet 4.6 Vision ReAct Agent)
  • Video analysis (TwelveLabs Pegasus 1.2)
  • Document summary generation (Claude Haiku 4.5)
  • Vector embedding and storage (Nova Embed 1024d → LanceDB)
  • Reanalysis / Q&A regeneration / Q&A add and delete

Document Upload
→ Preprocessing (OCR, BDA, Transcribe)
→ Segment Splitting
→ Distributed Map (max 30 concurrency)
→ ReAct Agent Analysis (per segment)
→ Vector Embedding → LanceDB Storage
→ Document Summary Generation

For detailed analysis flow, see AI Analysis Pipeline.
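
The fan-out over segments with a concurrency cap can be pictured with a local analogy. The sketch below is not a Step Functions definition; it uses `asyncio` to show what "at most 30 segments analyzed at once" means in practice. `analyze_segment` is a hypothetical placeholder for the per-segment ReAct Agent call.

```python
import asyncio

MAX_CONCURRENCY = 30  # matches the Distributed Map limit stated above


async def analyze_segment(segment: str) -> str:
    """Hypothetical stand-in for the per-segment ReAct Agent analysis."""
    await asyncio.sleep(0)  # placeholder for the actual model call
    return f"analysis of {segment}"


async def analyze_all(segments: list[str], limit: int = MAX_CONCURRENCY) -> list[str]:
    """Analyze every segment, never running more than `limit` at once."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(segment: str) -> str:
        async with semaphore:  # blocks while `limit` analyses are in flight
            return await analyze_segment(segment)

    # gather preserves input order, so results line up with segments
    return await asyncio.gather(*(bounded(s) for s in segments))
```

For example, `asyncio.run(analyze_all([f"seg-{i}" for i in range(100)]))` processes 100 segments while keeping at most 30 in flight.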


Workflow progress is delivered to the frontend in real time over WebSocket. State changes are detected through DynamoDB Streams, active connections are looked up in Redis, and events are pushed through the WebSocket API.

  • Step-level start / complete / error notifications
  • Segment analysis progress (X/Y completed)
  • Workflow start / complete / error notifications
  • Artifact and session creation events
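
A rough sketch of the notification path, under stated assumptions: the event field names, the `ACTIVE_CONNECTIONS` dictionary (an in-memory stand-in for the Redis lookup), and the `send` callback (a stand-in for the WebSocket API call) are all hypothetical and only illustrate the shape of the flow.

```python
import json
from typing import Callable

# Hypothetical in-memory stand-in for the Redis lookup of active connections,
# keyed by project ID.
ACTIVE_CONNECTIONS: dict[str, set[str]] = {
    "project-1": {"conn-a", "conn-b"},
}


def build_progress_event(workflow_id: str, completed: int, total: int) -> dict:
    """Build a segment-progress event (field names are assumptions)."""
    return {
        "type": "segment_progress",
        "workflow_id": workflow_id,
        "completed": completed,
        "total": total,
    }


def push_event(project_id: str, event: dict, send: Callable[[str, str], None]) -> int:
    """Send the event to each active connection; return how many were notified."""
    payload = json.dumps(event)
    connections = ACTIVE_CONNECTIONS.get(project_id, set())
    for connection_id in sorted(connections):  # sorted for a deterministic order
        send(connection_id, payload)
    return len(connections)
```

Passing the sender as a callback keeps the fan-out logic independent of the transport, so it can be exercised without a live WebSocket endpoint.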

View detailed per-segment results for analyzed documents.

  • Per-segment OCR / BDA / PDF text / AI analysis results
  • Video segment timecode-based viewing
  • Transcription segment review
  • Q&A regeneration (with custom instructions)
  • Q&A add and delete
  • Full document reanalysis

A conversational AI interface powered by Bedrock Agent Core. The IDP Agent and Research Agent call tools through the MCP Gateway (document search, artifact generation, and more) to answer questions based on the documents uploaded to the project.

  • Streaming responses with real-time tool usage display
  • Image/document attachments (multi-modal input)
  • Markdown rendering and code highlighting
  • Session management (create / rename / delete, conversation history preserved)

The AI Agent automatically searches project documents during conversations.

  • Vector search + Full-Text Search (FTS)
  • Korean morphological analysis (Kiwi)
  • Cohere Rerank v3.5 result reranking
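
This page does not say how the vector and full-text result lists are merged before reranking. One common option is reciprocal rank fusion (RRF), sketched below purely as an assumption; the function name, the constant `k = 60`, and the sample segment IDs are illustrative and not taken from the system.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked ID lists; items ranked highly in several lists score best."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by score descending, breaking ties by ID for a stable order
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["seg-3", "seg-1", "seg-7"]  # hypothetical vector search results
fts_hits = ["seg-1", "seg-9", "seg-3"]     # hypothetical full-text search results
candidates = reciprocal_rank_fusion([vector_hits, fts_hits])
# candidates == ["seg-1", "seg-3", "seg-9", "seg-7"]
```

In a pipeline like the one described, the fused candidate list would then be passed to the reranker for final ordering.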

Create custom agents per project for specialized analysis.

  • Agent name and system prompt configuration
  • Create / edit / delete agents
  • Switch agents during conversations

Tool                                 | Description
-------------------------------------|------------------------------------------------
search_documents                     | Hybrid search across project documents
save/load/edit_markdown              | Create and edit markdown files
create_pdf, extract_pdf_text/tables  | Generate PDFs and extract text/tables
create_docx, extract_docx_text/tables | Generate Word documents and extract text/tables
generate_image                       | AI image generation
code_interpreter                     | Python code execution

Manage files generated by agents during AI chat (PDF, DOCX, images, markdown, etc.) in the artifact gallery.

  • Browse and search artifacts
  • Inline preview (images, PDF, markdown, HTML, DOCX)
  • Download and delete
  • Filter by project
  • Navigate to originating project