Operational Data Store Consolidation and Aggregation
Overview
Many data warehouses will bring operational system data into a single environment using the same structure as the source, and then collapse this data into a format which is more suitable for querying in aggregated form.
Walkthrough of the Architecture
- Source databases, whether on-premises or on AWS are replicated to Amazon Redshift using the Database Migration Service. This service connects to the source system Change Data Capture logs, or uses SCT Extraction Agents to extract data from the source systems
- Source data is placed into schemas on Amazon Redshift that are mirror copies of the source systems. DMS replicates not just the data changes, but also any schema structure changes (for sources which support CDC replication).
- Aggregation tables for raw events may be created within each source schema in Redshift, or alternatively a Star Schema may be used to create a more structured analytical system
- Data is transformed from the source schemas to the target tables using a third-party ETL tool or pushdown SQL transformation.