SCD Definition: Change of dimension over a course of time rather changing on regular scheduled time.
Why do dimensions change?
- Someone decides they hate iPhone and want Android now
- Someone migrates from team dog to team cat
- Someone migrates from Nepal to Australia etc.
Why it is needed?: In Datawarehouse world, it is sometimes critical to track the dimension change with course of time. This helps us to better track of data and also create efficient products depending on the use-cases.
data:image/s3,"s3://crabby-images/0e027/0e02724c8a61409e2c3582a1b0aa97009353bcbb" alt=""
Type 0: Retain Original
In Type 0 SCD, the original data is never changed. This method maintains the initial state of the data without any updates.
data:image/s3,"s3://crabby-images/12279/12279d0b175b177fc603a188690cf8c5ff437275" alt=""
Pros: Maintains the original data; Simple to implement.
Cons: No updates are possible; Not suitable for dynamic data.
When to Use:
- When the original data must remain unchanged (e.g., Date of Birth).
- For audit trails or historical records.
Use Cases:
- Government agencies use Type 0 SCD to maintain original records for legal and audit purposes.
Type 1: Overwrite
In Type 1 SCD, the old data is overwritten with new data. This method does not maintain any historical data.
data:image/s3,"s3://crabby-images/47dce/47dcec58ce3378a15452ccc0d3a17f01149267af" alt=""
Pros: Simple to implement; Requires less storage space.
Cons: No historical data is maintained; Changes are not traceable.
When to Use:
- When historical data is not important.
- For attributes that are not critical for analysis.
Use Case:
- Retail companies often use Type 1 SCD for updating employee departments where historical tracking is not necessary.
Type 2: Add New Row
Description: Type 2 SCD involves adding a new row for each change, preserving historical data. This method allows you to track changes over time.
data:image/s3,"s3://crabby-images/1888d/1888d8003b22a1a095accdacb13a8f2e909179c5" alt=""
Pros: Maintains complete historical data; Changes are easily traceable.
Cons: Requires more storage space; Can lead to larger tables and slower queries.
When to Use:
- When tracking historical changes is important.
- For critical attributes that require historical analysis.
Use Case:
- Financial institutions use Type 2 SCD to track changes in employee roles for compliance and auditing purposes.
Type 3: Add New Attribute
In Type 3 SCD, a new column is added to store the previous value of the attribute. This method allows you to track limited historical data.
data:image/s3,"s3://crabby-images/d141d/d141d6d3bc649f8aca57e8d326723f3fda0d6669" alt=""
Pros: Maintains limited historical data; Simple to implement.
Cons: Only tracks the most recent change; Limited historical analysis.
When to Use:
- When only the most recent change needs to be tracked.
- For attributes with infrequent changes.
Use Case:
- Telecommunications companies use Type 3 SCD to track changes in employee roles or departments.
Type 4: Add Historical Table
Type 4 SCD involves maintaining a separate historical table to store the old data. The main table keeps only the current data.
data:image/s3,"s3://crabby-images/c7c68/c7c683336d252fc83feb0ad63d0587fa46ae533a" alt=""
Pros: Maintains complete historical data in a separate table; Keeps the main table small and efficient.
Cons: Requires managing two tables; More complex queries to join historical and current data.
When to Use:
- When historical data needs to be maintained separately.
- For large datasets where performance is a concern.
Use Case:
- Healthcare organizations use Type 4 SCD to maintain employee records and track changes over time.
Type 6: Hybrid Approach
Type 6 SCD combines the techniques of Types 1, 2, and 3 to provide a comprehensive solution. It involves adding new rows, new attributes, and maintaining historical data.
data:image/s3,"s3://crabby-images/1154e/1154e8b64e2200fed9c19ae0d56ed0ed6b344ef9" alt=""
Pros: Combines benefits of Types 1, 2, and 3; Provides comprehensive historical tracking.
Cons: More complex to implement; Requires more storage and management.
When to Use:
- When a comprehensive solution is needed.
- For critical attributes requiring detailed historical analysis.
Use Case:
- E-commerce companies use Type 6 SCD to track changes in employee roles, departments, and other critical attributes.
Discover more from Data Engineer Journey
Subscribe to get the latest posts sent to your email.