Hi PMerrill,
Thanks for the comment.
"I'm a big fan of questioning our assumptions, so thanks for this conversation"
I'm glad at least one person out there appreciates it :)
"what about the ETL Queries? They need to be fast and accurate."
Actually, I'd rephrase and say that it's more important to be accurate (i.e. correct) than fast. Being as fast as possible is of course important, and "accurate" and "fast" are not mutually exclusive, but if I had to choose, correctness comes first.
"In our data, most of our data is not contiguous. We need to know when there is no value for a member, not just which value applies at what time. I don't see how that's possible without end_dates."
If I may say so, that's not the same scenario as what I'm talking about in this blog post. There is a difference between "the end date for a member" and "the end date for a SCD record". Let me try and elucidate that statement with an example.
Say that we had a dimension table [Employee]. There may be attributes [Employee].[StartDateofEmployment] & [Employee].[EndDateofEmployment]. In such cases it makes complete sense to store both, and I would never advocate otherwise.
The blog post above, though, is not referring to dimension attributes; it's referring only to the SCD Start & End dates, i.e. the columns on the table which define the effective period of each record. Those are not dimension attributes.
The "end date for a member", as you phrase it, is a dimension attribute that has meaning in the real world; it is something that an end user might be interested in seeing. That is not true of an SCD start date/end date.
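To make the distinction concrete, here is a minimal sketch in Python using an in-memory SQLite database. The table layout, the [ScdStartDate] column name and the sample rows are all made up for illustration, and the query assumes that an SCD record's end date can be taken as the start date of the next record for the same employee.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (
    EmployeeKey           INTEGER PRIMARY KEY, -- surrogate key, one per SCD record
    EmployeeId            INTEGER NOT NULL,    -- business key
    Department            TEXT NOT NULL,       -- attribute whose history is tracked
    StartDateofEmployment TEXT NOT NULL,       -- dimension attribute (real-world meaning)
    EndDateofEmployment   TEXT,                -- dimension attribute, NULL = still employed
    ScdStartDate          TEXT NOT NULL        -- SCD metadata: when this record takes effect
);
INSERT INTO Employee VALUES
    (1, 100, 'Sales',     '2005-01-10', NULL,         '2005-01-10'),
    (2, 100, 'Marketing', '2005-01-10', NULL,         '2007-06-01'),
    (3, 100, 'Marketing', '2005-01-10', '2009-03-31', '2009-03-31');
""")

# Derive each record's SCD end date from the next record's SCD start date.
# The stored dimension attribute EndDateofEmployment is untouched by this.
rows = conn.execute("""
SELECT e.EmployeeKey,
       e.Department,
       e.EndDateofEmployment,
       e.ScdStartDate,
       (SELECT MIN(n.ScdStartDate)
          FROM Employee AS n
         WHERE n.EmployeeId = e.EmployeeId
           AND n.ScdStartDate > e.ScdStartDate) AS ScdEndDate
  FROM Employee AS e
 ORDER BY e.EmployeeKey
""").fetchall()

for row in rows:
    print(row)
```

In this sketch [EndDateofEmployment] is stored because it is something an end user would want to see, whereas the SCD end date is computed only when a query needs it. The latest record for an employee gets a NULL derived end date, meaning it is still the current record.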
Thoughts?
JT