Why your dimensions in the data warehouse should always have a surrogate primary key?
There is a very detailed and nicely written article Ralph Kimball http://www.kimballgroup.com/1998/05/surrogate-keys/
Key takeouts
1. The primary key coming from OLTP is the natural primary key and that is meant to be for OLTP system. This primary key should be enough to give some information of the record.
While on the other the hand, a surrogate key is just a simple integer created to serve as a primary key for data warehouses.
2. Consider a scenario where we use the natural primary key as the primary key for the dimension as well. Let's say you update information of CUST1 in OLTP. In this case, you will not be able to use Slow Dimension Changing data load with history keeping.
3. In case your company acquires a new company with different formatting for Customer primary key, merging both systems into DW would be an issue.
No comments:
Post a Comment