Denormalized Tables | KEVIN RESCHENBERG 04-04-2004 |
So a few of us were having dinner the other day and the topic of the (HRMS) PERSONAL_DATA table came up. (I'm sure that
you've had this sort of dinner discussion many times as well!) With this table being removed at the newer releases,
what does this mean to us?
PERSONAL_DATA is one of those "denormalized" tables. Sometimes it is convenient to gather information from various tables
into one place for quicker access and easier coding. This is the case when we already have information in normalized
tables. In some cases, though, a table started out in this "denormalized" state and the information is evolving into
a more normalized state with each new release. That's the situation with PERSONAL_DATA. But either way, we can generally categorize
tables into the normalized vs. denormalized groupings.
PERSONAL_DATA contains information related to a person: name, address, date of birth and so on. Some of this information
has been added to newer tables such as NAMES and ADDRESSES. These tables support effective-dated history. Other information has
been added to the relatively new PERSON table. This leaves PERSONAL_DATA as a denormalized table that exists only for our
convenience. At release 8.3, look at the record properties and you will see a comment saying that this is now a view instead
of a table. That's not true; it was apparently the plan, but the table still exists.
I've been pretty inconsistent in the way I treat these denormalized tables. On the one hand, I generally recommend against using
the EMPLOYEES table. This one is built by a process (generally run daily) and used for reporting. I'm probably reacting to a few bad
experiences. At one client a few years ago, I was asked to look into a problem with some of their reports. It seemed that
old information was showing up on the reports. A little investigation revealed that the client had been running the process
every day—but with the same as-of date! The information in the table was two months old at that point. At another client,
the process occasionally stopped with an error condition but it was not routinely checked, again resulting in old data in the table.
As a contractor, I know I won't be around at all times to monitor this, so I tend to go the safer route and use the "live" data
tables instead of EMPLOYEES.
But I use PERSONAL_DATA all the time. What's the difference? Well, PERSONAL_DATA is used throughout the system—on pages, in
PeopleCode, in reports—everywhere. If it's wrong, hopefully this will be noticed immediately. Also, it is updated not only by
a process, but also in real time through application messaging.
These are not the only examples of denormalized tables. Even the all-important JOB table is something of a denormalized table.
Consider the pay rate summary fields collected from COMPENSATION; the SetID fields such as SETID_LOCATION; and the fact that changing
any of the data on JOB requires a complete new row, including all of the information that didn't change.
"Normalize until it hurts; denormalize until it works." We need denormalized tables at times. Too bad PERSONAL_DATA is going away.
How about a view?
|