Today i had to create a package where i had to create a flat aggregation table with facts and measures for a operational system, the problem is that the data keeps refreshing every minute whereas the data is aggregate over an hour. That means data has to be updated or inserted.
My initial thought was to use SSIS to retrieve all of the rows intended for the dimension table in a data flow via a source component. Then the rows would be run through a lookup component to determine if they existed in the target table. Rows to be updated (those found in the lookup) would be run through an OLE DB Command component. Rows to be inserted would be sent to an OLE DB Destination.
But then I remembered the new SQL 2008 Merge statement which will give you the ability to write one single statement to update the row if it already exists otherwise it will insert a new row. Too bad there isn’t a Merge destination component in SSIS.
To use the merge statement I created a simple package with 2 SQL Execute tasks, one that inserts the new data in a temporary table and one that merges the data from the temporary table to the new aggregation table. I used the measure values as JOIN condition for the merge. Realy simple and not much work.
The merge statement is a great tool to keep the flow of update / insert really simple and without different timelines.
Sites i got my info from: