Skip to content

Source to Target Mappings

Date Dim

Table Name: date_dim

Table Type: Dimension

Description: Each row in this table represents one calendar day.

Load Frequency: Static (load once)

Source: Loaded once when the DW is deployed from a DDL file.

Target Schema: Target Table: Target Column: Target Datatype: Key: SCD Type: Transformation: Note:
dw_mystore date_dim date_key date PK N/A Using the date datatype for the key as it will allow range restricted scans on fact tables.
dw_mystore date_dim day_date date NK N/A Source column is repeated here for user convenience as primary keys are often hidden in BI presentation layers.
dw_mystore date_dim month_seq int N/A
dw_mystore date_dim week_seq int N/A
dw_mystore date_dim quarter_seq int N/A
dw_mystore date_dim year int N/A
dw_mystore date_dim dow int N/A
dw_mystore date_dim moy int N/A
dw_mystore date_dim dom int N/A
dw_mystore date_dim qoy int N/A
dw_mystore date_dim fy_year int N/A
dw_mystore date_dim fy_quarter_seq int N/A
dw_mystore date_dim fy_week_seq int N/A
dw_mystore date_dim day_name varchar(9) N/A
dw_mystore date_dim quarter_name varchar(6) N/A
dw_mystore date_dim holiday varchar(1) N/A
dw_mystore date_dim weekend varchar(1) N/A
dw_mystore date_dim following_holiday varchar(1) N/A
dw_mystore date_dim first_dom int N/A
dw_mystore date_dim last_dom int N/A
dw_mystore date_dim same_day_ly int N/A
dw_mystore date_dim same_day_lq int N/A
dw_mystore date_dim current_day varchar(1) N/A
dw_mystore date_dim current_week varchar(1) N/A
dw_mystore date_dim current_month varchar(1) N/A
dw_mystore date_dim current_quarter varchar(1) N/A
dw_mystore date_dim current_year varchar(1) N/A
dw_mystore date_dim dw_insert_date timestamp N/A SYSDATE on Insert SYSDATE on Insert
dw_mystore date_dim dw_update_date timestamp N/A SYSDATE on Update SYSDATE on Update

Time Dim

Table Name: time_dim

Description: Each row in this table represents one second.

Load Frequency: Static (load once)

Source: Loaded once when the DW is deployed from a DDL file.

Target Schema: Target Table: Target Column: Target Datatype: Key: SCD Type: Transformation:
dw_mystore time_dim time_key int PK N/A
dw_mystore time_dim hour int NK N/A
dw_mystore time_dim minute int NK N/A
dw_mystore time_dim second int NK N/A
dw_mystore time_dim am_pm varchar(2) N/A
dw_mystore time_dim shift varchar(20) N/A
dw_mystore time_dim sub_shift varchar(20) N/A
dw_mystore time_dim meal_time varchar(20) N/A
dw_mystore time_dim dw_insert_date timestamp N/A SYSDATE on Insert
dw_mystore time_dim dw_update_date timestamp N/A SYSDATE on Update

Customer Dim

Table Name: customer_dim

Table Type: Dimension

Description: Each row in this table represents a customer.

Load Frequency: Dynamic

Source Tables: store_customer, store_customer_address

Source Table Joins: FROM stg_mystore.v_store_customer c LEFT OUTER JOIN stg_mystore.v_store_customer_address a ON c.address_id = a.address_id

Transformation Notes:

All the columns from the store_customer table are treated as SCD1 in the target.

All the columns from the store_customer_address table are treated as SCD2 in the target.

Null source values are transformed to 'Unknown' or '#' in the target, depending on the target column width. Null source dates/timestamps are transformed to '01/01/1000'.

The source data lake tables are HUDI tables. HUDI returns timestamps in the Unix time (epoch) format. The Unix time columns are converted via a SQL function and are encapsulated in views: v_store_customer and v_store_customer_address.

Source Data Lake Layer: Source Database: Source Table: Source Column: Source Datatype: Target Schema: Target Table: Target Column: Target Datatype: Key: SCD Type: Transformation: Note:
dw_mystore customer_dim customer_key int PK N/A Create new surrogate key for new records and SCD2 changes.
clean ara_clean_data_ store_customer customer_id string dw_mystore customer_dim customer_id varchar(16) NK N/A
clean ara_clean_data_ store_customer salutation string dw_mystore customer_dim salutation varchar(10) 1 The full history is updated on SCD1
 columns for a change in the source.
clean ara_clean_data_ store_customer first_name string dw_mystore customer_dim first_name varchar(20) 1
clean ara_clean_data_ store_customer last_name string dw_mystore customer_dim last_name varchar(30) 1
clean ara_clean_data_ store_customer birth_country string dw_mystore customer_dim birth_country varchar(20) 1
clean ara_clean_data_ store_customer email_address string dw_mystore customer_dim email_address varchar(50) 1
clean ara_clean_data_ store_customer birth_date date dw_mystore customer_dim birth_date date 1
clean ara_clean_data_ store_customer gender string dw_mystore customer_dim gender varchar(10) 1 if 'M' then 'Male'
 if 'F' then 'Female'
 else gender
clean ara_clean_data_ store_customer marital_status string dw_mystore customer_dim marital_status varchar(10) 1 if 'D' then 'Divorced'
 if 'M' then 'Married'
 if 'S' then 'Single'
 if 'U' then 'Unknown'
 if 'W' then 'Widowed'
 else marital_status
clean ara_clean_data_ store_customer education_status string dw_mystore customer_dim education_status varchar(10) 1
clean ara_clean_data_ store_customer purchase_estimate bigint dw_mystore customer_dim purchase_estimate int 1
clean ara_clean_data_ store_customer credit_rating string dw_mystore customer_dim credit_rating varchar(10) 1
clean ara_clean_data_ store_customer buy_potential string dw_mystore customer_dim buy_potential varchar(10) 1
clean ara_clean_data_ store_customer vehicle_count bigint dw_mystore customer_dim vehicle_count int 1
clean ara_clean_data_ store_customer lower_bound bigint dw_mystore customer_dim income_band_lower_bound int 1
clean ara_clean_data_ store_customer upper_bound bigint dw_mystore customer_dim income_band_upper_bound int 1
clean ara_clean_data_ store_customer customer_datetime date dw_mystore customer_dim start_date date 1 f_from_unixtime(customer_datetime)
clean ara_clean_data_ store_customer_address address_id string dw_mystore customer_dim address_id varchar(16) 2
clean ara_clean_data_ store_customer_address address_datetime string dw_mystore customer_dim address_date timestamp 2 f_from_unixtime(address_datetime)
clean ara_clean_data_ store_customer_address street string dw_mystore customer_dim street varchar(20) 2
clean ara_clean_data_ store_customer_address city string dw_mystore customer_dim city varchar(60) 2
clean ara_clean_data_ store_customer_address county string dw_mystore customer_dim county varchar(30) 2
clean ara_clean_data_ store_customer_address state string dw_mystore customer_dim state varchar(2) 2
clean ara_clean_data_ store_customer_address zip string dw_mystore customer_dim zip varchar(10) 2
clean ara_clean_data_ store_customer_address country string dw_mystore customer_dim country varchar(20) 2
clean ara_clean_data_ store_customer_address gmt_offset string dw_mystore customer_dim gmt_offset numeric(5) 2
clean ara_clean_data_ store_customer_address location_type string dw_mystore customer_dim location_type varchar(20) 2
dw_mystore customer_dim scd_start_date timestamp N/A On SCD2 change:
- store_customer.customer_datetime on first record.

- store_address.address_datetime on the new record
dw_mystore customer_dim scd_end_date timestamp N/A On SCD2 change:
 - store_address.address_datetime on the existing current record
 - '31/12/2999' on the new record
dw_mystore customer_dim scd_current_flag varchar(1) N/A On SCD2 change:
 - 'N' on existing record
 - 'Y' on new record
dw_mystore customer_dim dw_insert_date timestamp N/A SYSDATE on Insert
dw_mystore customer_dim dw_update_date timestamp N/A SYSDATE on Update

Sale Fact

Table Name: sale_fact

Table Type: Fact

Description: Each row in this table represents a single line item from a sale.

Load Frequency: Dynamic

Source Tables:

Lake: store_sale

DW: date_dim, time_dim, customer_dim

Source Data Lake Layer: Source Database: Source Table: Source Column: Source Datatype: Target Schema: Target Table: Target Column: Target Datatype: Key: Transformation: Note:
N/A Sourced from DW dw_mystore schema date_dim date_key date dw_mystore sale_fact sold_date_key date Surrogate key lookup on:

 SELECT NVL(dd.date_key, TO_DATE('01-JAN-1900', 'DD-MON-YYYY'))
 FROM clean_store_sale ss
 LEFT OUTER JOIN dw_mystore.date_dim dd
 ON TRUNC(ss.sale_datetime) = dd.day_date
N/A Sourced from DW dw_mystore schema time_dim time_key int dw_mystore sale_fact sold_time_key int Surrogate key lookup:

 SELECT NVL(td.time_key, -1)
 FROM store_sale ss
 LEFT OUTER JOIN dw_mystore.time_dim td
  ON EXTRACT(hour from ss.sale_datetime) = td."hour"
 AND EXTRACT(min from ss.sale_datetime) = td."minute"
 AND EXTRACT(sec from ss.sale_datetime) = td."second"
clean ara_clean_data_ store_sale sale_datetime timestamp dw_mystore sale_fact sold_date timestamp PK
N/A Sourced from DW dw_mystore schema customer_dim customer_key int dw_mystore sale_fact customer_key int Surrogate key lookup:

 SELECT NVL(cd.customer_key, -1)
 FROM store_sale ss
 LEFT OUTER JOIN dw_mystore.customer_dim cd
 ON ss.customer_id = cd.customer_id
 AND ss.sold_datetime >= cd.scd_start_date
 AND ss.sold_datetime < cd.scd_end_date
clean ara_clean_data_ store_sale ticket_id bigint dw_mystore sale_fact ticket_id varchar(20) PK
clean ara_clean_data_ store_sale item_id bigint dw_mystore sale_fact item_id int PK
clean ara_clean_data_ store_sale quantity bigint dw_mystore sale_fact quantity int
clean ara_clean_data_ store_sale wholesale_cost decimal(15,2) dw_mystore sale_fact wholesale_cost decimal(15,2)
clean ara_clean_data_ store_sale list_price decimal(15,2) dw_mystore sale_fact list_price decimal(15,2)
clean ara_clean_data_ store_sale sales_price decimal(15,2) dw_mystore sale_fact sales_price decimal(15,2)
clean ara_clean_data_ store_sale ext_discount_amt decimal(15,2) dw_mystore sale_fact ext_discount_amt decimal(15,2)
clean ara_clean_data_ store_sale ext_sales_price decimal(15,2) dw_mystore sale_fact ext_sales_price decimal(15,2)
clean ara_clean_data_ store_sale ext_wholesale_cost decimal(15,2) dw_mystore sale_fact ext_wholesale_cost decimal(15,2)
clean ara_clean_data_ store_sale ext_list_price decimal(15,2) dw_mystore sale_fact ext_list_price decimal(15,2)
clean ara_clean_data_ store_sale ext_tax decimal(15,2) dw_mystore sale_fact ext_tax decimal(15,2)
clean ara_clean_data_ store_sale coupon_amt decimal(15,2) dw_mystore sale_fact coupon_amt decimal(15,2)
clean ara_clean_data_ store_sale net_paid decimal(15,2) dw_mystore sale_fact net_paid decimal(15,2)
clean ara_clean_data_ store_sale net_paid_inc_tax decimal(15,2) dw_mystore sale_fact net_paid_inc_tax decimal(15,2)
clean ara_clean_data_ store_sale net_profit decimal(15,2) dw_mystore sale_fact net_profit decimal(15,2)
customer_dim dw_insert_date timestamp SYSDATE on Insert
customer_dim dw_update_date timestamp SYSDATE on Update