Initial Implementation Setup Data | KEVIN RESCHENBERG 02-15-2004 |
Last week's discussion considered a few of the different types of setup data, along with how (and where) they should be
created and migrated. We saw that the different types must be handled in different ways, and that it's probably next to impossible
to set a policy that covers all types. This week I'd like to talk about handling setup data during the initial implementation.
Before a system is rolled into production for the first time, you have a little more flexibility in how you deal with these issues.
Let's look at a few options.
First, we assume that you have (at the very minimum) two environments: a "demo" environment (which I'll call DMO) and a "system" environment
(SYS). These are delivered by PeopleSoft. SYS is a "blank" database that contains all of the tables but very little data. What it does
contain is generally limited to delivered parameters in a few tables, plus, of course, the PeopleTools metadata. DMO contains the same
data plus some sample setup and transactional data. DMO can be used to try out various features to see how the system works. However, you should
not make major changes to either the setup or transactional data in this database, because it will be needed for reproducing bugs that
you suspect originate in delivered code and that you are reporting to PeopleSoft.
In general, SYS is the starting point for your setup. (I have seen installations where many of the setup tables contain the "demo" sample
data, indicating that DMO may have been used as the starting point. This is not the way to go.) SYS could be modified directly but
is normally copied to one or more environments instead. It's best to keep DMO maintained and up to date with all patches you apply. Similarly,
SYS should be maintained, at least during the early stages of your implementation. Later on you may decide there is no further need for it.
Let's assume that in addition to DMO and SYS, you have created one or more environments. The simplest option might be to have one environment
that you use for development and that also will be the future production database. There would be serious drawbacks to this option, however.
First, your ongoing custom development could interfere with testing of already-developed features. (An aside: I know that you intend to "go
vanilla" with your implementation. Everybody wants to do this, but nobody actually does it.) Second, your testing will litter the
database with test data that must be cleaned out before the system goes live.
Instead, let's assume that you create, at minimum, a development environment, a testing environment, and a future production environment. I'll call
them DEV, TST and PRD. Custom code and objects would start out in DEV, be unit tested, and then be migrated to TST. There you would perform your
various integration, system and regression tests. Finally, the objects and code would be migrated to PRD. When you are ready to go live with
the system, you make PRD available to your users. Simple.
But what about your setup data? Our first idea might be to follow the same path, entering the configuration into DEV and then migrating to
TST. But Application Designer doesn't migrate setup data. We could enter the same data into DEV, TST and PRD, but that might be difficult if
there is much to enter—and how do you know that you have done exactly the same data entry into each environment? If you entered the data correctly
in DEV and TST but made a mistake while entering it into PRD, you would not see the problem until the system is live. Also, as we will discuss
in future topics, there may be good reasons to have many more environments than the ones assumed here.
Your more functionally oriented team members are probably the ones who will enter most of the setup. Why not let them enter it into TST and
then, when all testing is complete, enter the final configuration into PRD? The same issue applies—we don't know if they will accurately
enter all of the setup. Could we copy TST to create PRD? Yes, but then all of the testing data must be cleaned out—and how will you know where
all of it is?
Could we use Data Mover or SQL scripts to move the setup from one environment to the next? Yes. Under this method, you might want to have the
users enter it into TST, conduct testing, and then use these scripts to migrate the data to DEV and PRD. However, there are several issues with
this method. The first hurdle is to determine which tables must be moved. Once you do that, there is another major issue: What happens if the
scripts are not accurate and complete? TST will be fine—the users will have run many successful tests—but PRD won't work and you won't notice
this until the system is live. You could avoid this problem by entering the setup in DEV and then copying it to TST and PRD. The first migration
(to TST) will give you a chance to test the actual migration process. However, this may not be the best solution either. In many cases, developers
will need to modify the setup to create unit testing scenarios. If this happens, bad setup data could be moved to your future production database.
Also, if your system test is not thorough enough, you may not notice that the data migration was not complete.
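Whichever direction you choose, the scripts themselves are simple. Here is a minimal Data Mover sketch, just to show the shape of the approach;
the record names (LOCATION_TBL and DEPT_TBL) and the file names are examples only, and your real list will come from the analysis described below.

    REM -- Run against the source database: export the setup records.
    SET LOG setup_export.log;
    SET OUTPUT setup_data.dat;
    EXPORT LOCATION_TBL;
    EXPORT DEPT_TBL;

    REM -- Run against the target database: wipe and reload the same records.
    SET LOG setup_import.log;
    SET INPUT setup_data.dat;
    REPLACE_DATA LOCATION_TBL;
    REPLACE_DATA DEPT_TBL;

(REPLACE_DATA deletes all rows in the target table before loading, which is usually what you want for setup tables; a plain IMPORT would append.)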
There is not necessarily one best way to avoid all these problems, but the method I'm going to suggest now has worked well for me in the past. Have
your functional team members enter initial setup data into TST, and try out setup scenarios that they may not be sure of. Once the setup has
been settled, have them enter the final data in PRD (future production). Then create the SQL and/or Data Mover scripts,
and move the data from PRD back to TST and DEV.
What does this accomplish? First, you are using the GUI to enter your setup in PRD. This is the safest way to enter it. Your migration scripts
eliminate the need for double or triple data entry. Testing hopefully will uncover any errors in your migration script, but if one or two
errors are missed due to an incomplete test, at least the setup in PRD is still complete. (You may never know that there was a bug in the script, but
that won't affect production.) You will normally find several problems with the initial script, but these can be cleaned up quickly. You could
run this script on a scheduled basis or whenever significant changes have been made to the setup in PRD. Even though your functional users may
tend to ignore DEV, the script will continually refresh the setup in DEV with the latest version.
The big question now is how we determine which tables are to be moved. Ask the users for a list, but understand that it will be just a starting
point. The users will not necessarily know all of the tables affected, and they will forget to tell you about a few. To get a full list, I
compare PRD with SYS. Remember that SYS was used to create PRD in the first place. Since you will allow no testing to be done in
PRD, the only differences between SYS and PRD will be the setup data, custom objects, and a few run control rows or temporary tables. You can get a good feel for
what tables have been affected by running row counts against all tables in each database and then using a file compare tool to find the differences.
For a more complete indication, write a tool or use an existing utility that compares the data field by field.
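As a sketch of the row-count approach, assuming an Oracle catalog (the dictionary views differ on other platforms), you can generate the
counting script from the data dictionary, run the generated script against each database, spool the results to files, and compare the files:

    -- Generate one row-count statement per application table (PS_ prefix).
    -- Run the generated output against PRD and against SYS, then diff.
    SELECT 'SELECT ''' || table_name || ''', COUNT(*) FROM ' || table_name || ';'
      FROM user_tables
     WHERE table_name LIKE 'PS\_%' ESCAPE '\'
     ORDER BY table_name;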
The row-count comparison is simple to write in SQL. The full data compare is about a one-day programming project using SQR; I'll
talk about that in a future column. The SQL needed to copy a table depends on your database platform and will be another future topic. (In general,
you want a statement similar to INSERT INTO thisdatabase.PS_TABLENAME SELECT * FROM thatdatabase.PS_TABLENAME.) Why not use Data Mover for
everything? If you are customizing a table by adding or deleting fields, a straight Data Mover copy won't be possible, because the source
and target structures no longer match.
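To make the SQL route concrete: on a platform with three-part names such as SQL Server, a refresh of one table might look like the following.
TSTDB, PRDDB, and the table name are placeholders; on Oracle you would reach the second database through a database link instead.

    -- Refresh one setup table in TST from the master copy in PRD.
    DELETE FROM TSTDB.dbo.PS_LOCATION_TBL;
    INSERT INTO TSTDB.dbo.PS_LOCATION_TBL
    SELECT * FROM PRDDB.dbo.PS_LOCATION_TBL;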
Once you have your list of table differences, look it over. Be careful about unusual situations, such as the INSTALLATION table in HR (which is
not only a setup table but also contains transactional data: the last EMPLID assigned). Do not bother with PeopleTools tables, run control data, process monitor
tables and such.
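If you built the candidate list with SQL, some of this pruning can happen in the query itself. A rough Oracle sketch, assuming your run
control records follow the common RUN_CNTL naming pattern (check your own application's conventions):

    -- Keep application tables (PS_ prefix) and drop obvious run-control
    -- records. PeopleTools tables (PSRECDEFN, PSPRCSRQST, ...) have no
    -- underscore after PS and are already excluded by the pattern.
    SELECT table_name
      FROM user_tables
     WHERE table_name LIKE 'PS\_%' ESCAPE '\'
       AND table_name NOT LIKE 'PS\_RUN\_CNTL%' ESCAPE '\';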
Well, that probably sounded more difficult than it is. But as always, if you have a better solution to the problem, I'd like to hear it!