Features
Modular, comprehensive and ready to use
This solution provides ready-to-use code so you can start experimenting with a variety of Large Language Models and Multimodal Language Models, settings and prompts in your own AWS account.
Supported model providers:
- Amazon Bedrock
- Self-hosted models on Amazon SageMaker, from SageMaker Foundation models, SageMaker JumpStart and Hugging Face.
- Third-party providers accessed via API, such as Anthropic, Cohere, AI21 Labs, OpenAI, etc. See the available LangChain integrations for a comprehensive list.
Experiment with multimodal models
Deploy IDEFICS models on Amazon SageMaker and see how the chatbot can answer questions about images, describe visual content, and generate text grounded in multiple images.
Currently, the following multimodal models are supported:
- IDEFICS 9b Instruct (requires an `ml.g5.12xlarge` instance)
- IDEFICS 80b Instruct (requires an `ml.g5.48xlarge` instance)
For details on the required instance types and how to request them, read the Amazon SageMaker requirements.
NOTE: Make sure to review the IDEFICS models' license sections.
To deploy a multimodal model, follow the deploy instructions: at the magic-config CLI step, select one of the supported models (press Space to select/deselect), then deploy as instructed in the section above.
⚠️ NOTE ⚠️ Amazon SageMaker endpoints are billed by the hour. Avoid leaving a model deployed and unused to prevent unnecessary costs.
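Deploying one of the IDEFICS models exposes a SageMaker endpoint. As an illustrative sketch only (the exact request schema depends on the deployed serving container, and the prompt format and endpoint name below are assumptions, not the solution's actual API), building a request payload for an image question might look like this:

```python
import json

# Hypothetical helper: builds a text-generation-inference style payload for an
# IDEFICS endpoint, embedding the image as a markdown-style URL in the prompt.
# Treat the schema and prompt format as assumptions to adapt, not a spec.
def build_idefics_payload(question: str, image_url: str, max_new_tokens: int = 256) -> str:
    prompt = f"User: {question}![]({image_url})<end_of_utterance>\nAssistant:"
    return json.dumps({
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens, "temperature": 0.2},
    })

# Sending the payload could then use boto3's sagemaker-runtime client, e.g.:
# import boto3
# client = boto3.client("sagemaker-runtime")
# response = client.invoke_endpoint(
#     EndpointName="idefics-9b-instruct",  # hypothetical endpoint name
#     ContentType="application/json",
#     Body=build_idefics_payload("What is in this picture?", "https://example.com/cat.jpg"),
# )
```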
Multi-Session Chat: evaluate multiple models at once
Send the same query to two to four separate models at once and compare how each responds. Each model answers based on its own conversation history and context, while all of them share access to the same powerful document retriever, so every request pulls from the same up-to-date knowledge.
Experiment with multiple RAG options with Workspaces
A workspace is a logical namespace where you can upload files for indexing and storage in one of the vector databases. You can select the embeddings model and text-splitting configuration of your choice.
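Before indexing, uploaded documents are typically split into overlapping chunks that are then embedded and stored. A minimal sketch of fixed-size chunking with overlap (the chunk size and overlap values here are arbitrary examples, not the solution's defaults):

```python
# Illustrative text splitter: fixed-size character chunks with overlap, similar
# in spirit to the text-splitting configuration a workspace applies before
# computing embeddings. Values are examples only.
def split_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping `overlap` chars shared
    return chunks

doc = ("word " * 100).strip()  # 499-character toy document
chunks = split_text(doc, chunk_size=50, overlap=10)
```

Overlap keeps sentences that straddle a chunk boundary retrievable from both neighboring chunks, at the cost of some index redundancy.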
Unlock RAG potentials with Workspaces Debugging Tools
The solution comes with several debugging tools to help you debug RAG scenarios:
- Run RAG queries without the chatbot and analyse results, scores, etc.
- Test different embeddings models directly in the UI
- Test cross-encoders and analyse the distances computed by different distance functions between sentences.
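The distance functions being compared are standard vector metrics. A toy sketch of two common ones, applied to hand-made vectors standing in for sentence embeddings (not real model output):

```python
import math

# Cosine distance: 1 minus the cosine similarity of the two vectors.
def cosine_distance(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

# Euclidean distance: straight-line distance between the two vectors.
def euclidean_distance(a: list[float], b: list[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

v1 = [0.1, 0.9, 0.2]  # toy "embedding" of sentence 1
v2 = [0.2, 0.8, 0.1]  # toy "embedding" of sentence 2
```

Cosine distance ignores vector magnitude and looks only at direction, which is why it is a common default for comparing embeddings; Euclidean distance is magnitude-sensitive.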
Full-fledged User Interface
The repository includes a CDK construct to deploy a full-fledged UI, built with React, for interacting with the deployed LLMs/MLMs as chatbots. The UI is hosted on Amazon S3 and distributed with Amazon CloudFront.
It is protected with Amazon Cognito authentication and lets you interact and experiment with multiple LLMs/MLMs and multiple RAG engines, with conversational history support and document upload with progress tracking.
The interface layer between the UI and the backend is built with AWS AppSync: management requests use standard GraphQL operations, while real-time interaction with the chatbot (messages and responses) uses GraphQL subscriptions.
Design system provided by AWS Cloudscape Design System.