Hello All,
I need some guidance on designing a master data model for a customer base. The client wants to build an MDM solution for analytical purposes.
The situation is as follows:
We have two data sources with different data structures. Each data source holds a number of customers, as shown below. The data is fake, but it illustrates the requirements:
Data source 1:
PK | National ID | Name | Location | Email |
1 | A120 | C. Ronaldo | Germany | cr@company.com |
2 | A120 | C. Ronaldo | Germany | cr@company.com |
3 | A120 | C. Ronaldo | Germany | cr@company.com |
Data source 2:
ID | Name | Address | Email |
123456 | Cristiano Muller Ronaldo | 170 St., Berlin | cr@company.com |
The requirements are as follows:
- We need to cleanse the duplicate customers in Data source 1; however, each duplicate record has many transactions attached to it.
- We need to match the customers in Data source 1 with those in Data source 2 based on their names.
- We need to keep track of the records that we merge, so that the data steward can unmerge them.
- We need to keep the business keys of each data source (for example, PK in Data source 1 and ID in Data source 2). As mentioned, the objective is MDM for analytical purposes, so the data warehouse that extracts data from the MDS hub must know these IDs in order to retrieve the corresponding transactions from the data sources.
- We need row-level security.
- We need the matching and cleansing to run automatically. My client is not interested in the Excel features, since the integration will be done through SSIS because we have millions of records, and part of them will be processed automatically.
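To make the merge, key-tracking, and unmerge requirements concrete, here is a rough Python sketch of the result I am after. It is not how DQS or MDS would implement it, only an illustration of the cross-reference I expect the hub to keep. The function names (`name_key`, `build_crosswalk`, `unmerge`), the "DS1"/"DS2" labels, the choice of National ID as the duplicate key, and the "first initial plus last name" matching rule are all assumptions made for this example.

```python
from collections import defaultdict

# Sample rows copied from the two tables above.
SOURCE_1 = [
    {"pk": 1, "national_id": "A120", "name": "C. Ronaldo"},
    {"pk": 2, "national_id": "A120", "name": "C. Ronaldo"},
    {"pk": 3, "national_id": "A120", "name": "C. Ronaldo"},
]
SOURCE_2 = [
    {"id": 123456, "name": "Cristiano Muller Ronaldo"},
]


def name_key(name):
    """Hypothetical matching rule: first initial plus last name, lower-cased."""
    tokens = name.replace(".", " ").split()
    return (tokens[0][0].lower(), tokens[-1].lower())


def build_crosswalk(source_1, source_2):
    """Return {master_id: [(source_label, business_key), ...]}."""
    # Step 1: collapse Data source 1 duplicates (assumed key: National ID).
    groups = defaultdict(list)
    for row in source_1:
        groups[row["national_id"]].append(row)

    crosswalk = {}
    key_to_master = {}
    for master_id, (_, rows) in enumerate(sorted(groups.items()), start=1):
        # Every original PK is kept so transactions can still be looked up.
        crosswalk[master_id] = [("DS1", r["pk"]) for r in rows]
        key_to_master.setdefault(name_key(rows[0]["name"]), master_id)

    # Step 2: attach Data source 2 records by the name rule above.
    next_id = len(crosswalk) + 1
    for row in source_2:
        key = name_key(row["name"])
        master_id = key_to_master.get(key)
        if master_id is None:
            # No name match: the record becomes its own master record.
            master_id = next_id
            next_id += 1
            crosswalk[master_id] = []
            key_to_master[key] = master_id
        crosswalk[master_id].append(("DS2", row["id"]))
    return crosswalk


def unmerge(crosswalk, master_id, source_label, business_key):
    """Move one source record out into a new master record of its own."""
    crosswalk[master_id].remove((source_label, business_key))
    new_id = max(crosswalk) + 1
    crosswalk[new_id] = [(source_label, business_key)]
    return new_id


crosswalk = build_crosswalk(SOURCE_1, SOURCE_2)
```

With the sample data, all three Data source 1 rows and the Data source 2 row end up under one master record, and `unmerge(crosswalk, 1, "DS2", 123456)` would split the Data source 2 record back out, which is the steward action I need to support.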
The big question is: how can we achieve these requirements using the features of SQL Server DQS and MDS 2012?