re: Debunking Kimball Effective Dates

Hi Ivan,

Thanks for the comment.

"I think what has been lost in all this discussion is that the purpose of the start/end dates is to make QUERYING the data easier."

I disagree. The primary purpose of SCD start/end dates is not end-user querying; it is to support lookups during the ETL process. In my opinion, exposing lineage columns such as these offers no value to the end user.
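
As a rough illustration of the kind of ETL lookup meant here, the sketch below resolves a business key and an event date to the dimension row that was in effect on that date. The table contents, the column order, the exclusive end date, and the function name lookup_surrogate_key are all assumptions made for the example, not details taken from the discussion.

```python
from datetime import date

# Hypothetical dimension rows: (surrogate_key, business_key, start_date, end_date).
# Assumption: end_date is exclusive, and None marks the current row.
customer_dim = [
    (1, "C100", date(2020, 1, 1), date(2021, 6, 1)),
    (2, "C100", date(2021, 6, 1), None),
    (3, "C200", date(2020, 3, 15), None),
]


def lookup_surrogate_key(dim_rows, business_key, event_date):
    """Return the surrogate key of the row in effect on event_date, or None."""
    for surrogate_key, key, start_date, end_date in dim_rows:
        if key != business_key:
            continue
        # The row applies when start_date <= event_date < end_date.
        if start_date <= event_date and (end_date is None or event_date < end_date):
            return surrogate_key
    return None
```

In a real ETL tool this would normally be a range join or a lookup component rather than a Python loop; the point is only that the start/end dates are consumed by the load process, not by a report.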

If you *do* require your users to be able to query such columns for whatever reason, then that implies they have value to your end users beyond being SCD start/end dates. In such circumstances, yes, store them explicitly as you do all other attributes that are available to end users.

"that, plus the ability to keep discontinuous date ranges (e.g. start date of row 3 is greater than end date of row 2, or end date of the final row for a key's value is before current_timestamp), supports Kimball's design."

Interesting perspective. Could you perhaps outline such a scenario where this would be a requirement?

"That storing an additional date column "takes up too much space" is a flawed argument: disk space is cheap."

Disk space is cheap. Memory is not cheap though. Neither is I/O. For those reasons, if I can save a few bytes on each record then I'll do it.
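
To make the trade-off concrete, here is a minimal sketch of deriving the end date at read time instead of storing it, assuming each row's range runs up to the start date of the next row for the same business key. The function name derive_end_dates and the (business_key, start_date) row shape are hypothetical.

```python
from datetime import date
from itertools import groupby
from operator import itemgetter


def derive_end_dates(rows):
    """Turn (business_key, start_date) rows into (business_key, start_date, end_date).

    Assumption: ranges are contiguous, so a row ends where the next row for the
    same key starts. The latest row for each key gets an end_date of None.
    """
    result = []
    # Sort by business key, then start date, so versions are in order per key.
    ordered = sorted(rows, key=itemgetter(0, 1))
    for key, group in groupby(ordered, key=itemgetter(0)):
        versions = list(group)
        next_starts = [start for _, start in versions[1:]] + [None]
        for (_, start), end in zip(versions, next_starts):
            result.append((key, start, end))
    return result


stored_rows = [
    ("C100", date(2021, 6, 1)),
    ("C100", date(2020, 1, 1)),
    ("C200", date(2020, 3, 15)),
]
derived = derive_end_dates(stored_rows)
```

Note that a derivation like this can only ever produce contiguous ranges. It cannot represent a gap between two rows or a final row that has already ended, which is exactly the discontinuous case raised above.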

Thanks again for sharing your thoughts Ivan.

