Planning and forethought are key elements in designing a database that is easy to work with and easy to expand. Putting together a feasible and workable relational database is not an easy task. It is similar to piecing together a puzzle in order to have one complete, concise, and manageable unit. Much thought must go into the design of the relational database to ensure ease of use and ease of expansion.
IBM has supplied what I feel is one of the most powerful relational databases in existence in DB2/400. The company has also supplied very powerful and easy-to-use database manipulation devices in the form of RPG/400, COBOL/400, and SQL/400. How you use these tools to make your lives easier is up to you. You can build your database with the future in mind, or you can paint yourself into a corner by not thinking ahead.
Breaking the Database Down into Manageable Units
As the industry has progressed from using flat files to using relational database architecture, one thing that has certainly evolved is the way files are broken down into their unique elements. For example, order entry information was commonly stored in a single flat file containing both header and detail information in each record. Each line of this file could include static information for invoices, for example, in the form of customer number, invoice number, and order date as well as detail information such as item and quantity ordered.
In a relational database, this information is broken down into header and detail files. One header record appears for one or more detail records. The key for the detail file contains a unique identifier for the header information, such as invoice number, followed by the detail line number. But what is stopping you from taking this to the limit? Only your imagination and database design capabilities.
Suppose your users request the ability to input special instructions for each detail line. How would you handle this? A simple solution may be to put four 40-character fields in the detail file called Text1, Text2, Text3, and Text4. For input, you simply display a window with the four fields into which the user can enter the special instructions.
This may work fine, or you may find yourself in trouble if the user later says that four 40-character fields do not leave enough room for some of the instructions.
A better solution may be to build an order detail description file one level below the detail file, keyed by order number, detail line number, and special instruction sequence. Each detail record can then have any number of special instruction lines, from none at all to as many as the user needs. Users can enter as many instructions as they want in a window subfile, instead of being limited to a window that allows only four lines of text (see Figure 1).
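The child-file idea can be sketched outside the AS/400 as well. The following is a minimal illustration only, using Python and an in-memory SQLite database as a stand-in for DB2/400. The table and column names are borrowed from Figure 1, while the function names and the next-sequence logic are assumptions made for this example, not part of the original design.

```python
import sqlite3

def open_db():
    """Create an in-memory database with a detail table and a child instruction table."""
    db = sqlite3.connect(":memory:")
    db.execute("""CREATE TABLE orddet (
        odinv INTEGER, odline INTEGER, oditem TEXT,
        PRIMARY KEY (odinv, odline))""")
    # One level below the detail file: invoice, detail line, instruction sequence.
    db.execute("""CREATE TABLE odsins (
        diinv INTEGER, diline INTEGER, diiseq INTEGER, ditext TEXT,
        PRIMARY KEY (diinv, diline, diiseq))""")
    return db

def add_instruction(db, inv, line, text):
    """Append one instruction line under a detail line, using the next free sequence."""
    (next_seq,) = db.execute(
        "SELECT COALESCE(MAX(diiseq), 0) + 1 FROM odsins "
        "WHERE diinv = ? AND diline = ?", (inv, line)).fetchone()
    # Each line holds at most 40 characters, mirroring the 40A field in Figure 1.
    db.execute("INSERT INTO odsins VALUES (?, ?, ?, ?)",
               (inv, line, next_seq, text[:40]))
    return next_seq

def instructions_for(db, inv, line):
    """Return all instruction lines for one detail line, in sequence order."""
    rows = db.execute(
        "SELECT ditext FROM odsins WHERE diinv = ? AND diline = ? "
        "ORDER BY diiseq", (inv, line))
    return [row[0] for row in rows]
```

Because the sequence number is part of the key, a detail line can carry a fifth, sixth, or fiftieth instruction without any change to the file layout.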
On the other hand, there are times when adding multiple and relatively similar fields to a file will be a good choice. One example of this would be any monthly figure, such as monthly sales. We all know that there will never be more than 12 months in a year, so creating a separate detail file to hold these monthly sales figures may be overkill. If these fields are defined as 12 different elements in the file, they can also be easily renamed into an array for easy manipulation. (For more information, see "Common Sense Normalization," MC, September 1995.)
Mastering Master Files
The design phase usually yields master files, such as Customer Master File or Item Master File, to hold static information about these unique entities. Creating master files for this type of information is more obvious, since they are the base of the entire database. But, all too often, no master files are created for other, less obvious information.
Take, for example, a shipping method code in an order header file. You may have such codes as F to represent Federal Express shipment and U to represent UPS shipment. These codes are most likely explained in the text portion of the field definition in the DDS. For reporting purposes, the SELECT statement in each RPG program translates each code into a description that's more meaningful to the user than the code itself. But what happens when you add a new shipping-method code? You most likely won't change the DDS text for the field right away, if at all, but you do have to go in and add a line to the SELECT statement in all the reports and displays using this field.
A better method is to create a master file for shipping-method codes. In this file, include the code, the code description, and any other static information associated with a shipping method. Then, when you implement a new shipping method, you simply add a record to this master file. Users can also easily perform this task by using an entry/update program, so no programmer assistance is required. In addition, accessing the code description or any other code-related information is as easy as a CHAIN to the master file. This method also makes data validation more accurate, because you can perform a simple CHAIN to the master file to determine whether a user has entered an invalid shipping method code. It also allows the output of reports to be more descriptive, as the user will see the code description instead of just the code (see Figure 2).
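As a rough illustration of the lookup, here is a sketch in Python with an in-memory SQLite table standing in for the shipping method master file of Figure 2. The column names come from that figure. The function names and the base charge amounts are hypothetical values chosen only for the example. A lookup that returns nothing plays the role of a CHAIN that finds no record.

```python
import sqlite3

def open_shipping_master():
    """Build an in-memory stand-in for the shipping method master file."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE shpmst ("
               "smshpm TEXT PRIMARY KEY, smdesc TEXT, smbchg REAL)")
    # Base charges are made-up sample values, not from the article.
    db.executemany("INSERT INTO shpmst VALUES (?, ?, ?)",
                   [("F", "Federal Express", 12.50), ("U", "UPS", 8.00)])
    return db

def lookup_shipping_method(db, code):
    """Return (description, base charge), or None when the code is not on file."""
    return db.execute(
        "SELECT smdesc, smbchg FROM shpmst WHERE smshpm = ?", (code,)).fetchone()

def is_valid_shipping_method(db, code):
    """Validate user input by checking for a matching master record."""
    return lookup_shipping_method(db, code) is not None
```

Adding a new shipping method is then a data change (one new row) rather than a program change, and every report or display that uses the lookup picks it up immediately.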
Your applications will be more versatile if you use master files for each type of code in your database. Even if you think a code will be only a simple Y (yes) or N (no), setting up a master file for it eliminates rewriting applications that use this field if, by chance, Y and N are suddenly not the only two choices. Of course, sometimes a simple Y or N is all you will need. Since RPG III limits you to 50 files in a program, grouping these types of fields by application into a control file may be your best solution.
Another plus to creating master files for codes is the ability to display a list of codes during the entry portion of an interactive application. The user simply prompts on the code field and selects from the list. This will be especially helpful for new employees who don't know the code structure and want to know which shipping method code to use.
Physical Files: To Key or Not to Key
Ever since I entered this business, I've heard arguments both for and against the keying of physical files. But I've never heard a really good argument against it or a really good argument for it. On the one hand, the key used on a physical file is usually unique and won't have to be changed. On the other hand, by using a logical with the unique key you would have used on the physical, you may save time if the key to the database changes.
As an example, let's look at an Item Master File. The item number in the database will always be a unique key. You will never have more than one item with the same number, so why not key the physical file by item number? There's a good reason not to do it that way.
One day, circumstances may require that you change the key to all of your files. For example, perhaps your company merges with another, and the new entity wants to use your existing software but remain separate from the original company. This new company may also want to set up items with the same item numbers as the existing company. You may think about using two different libraries to separate the data, but you'd quickly realize that you would be limiting yourself in the long run. After all, what if another company forms? The best solution is to add a company number field to your files, which will always be the leading key field. So, in the case of the Item Master File, the unique key will be company number, then item number.
You now realize why uniquely keying your physical file may create problems. If you simply leave the physical unkeyed, you have to change only the logical file that contains the unique item number key, thereby avoiding the song and dance of copying the data to a temp file, creating the new physical, and copying the data from the temp file to the new file with CPYF, specifying FMTOPT(*MAP *DROP).
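The same separation of data from key can be sketched in Python with SQLite, where an unkeyed table stands in for the physical file and a unique index stands in for the keyed logical file. This is only an analogy under stated assumptions: the names itmmst, itmmstl1, imitem, imdesc, and imco are invented for the example, and SQLite's ALTER TABLE is not how DB2/400 adds a field.

```python
import sqlite3

def build_item_master():
    """Unkeyed 'physical' table plus a unique index acting as the keyed 'logical'."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE itmmst (imitem TEXT, imdesc TEXT)")
    db.execute("CREATE UNIQUE INDEX itmmstl1 ON itmmst (imitem)")
    db.execute("INSERT INTO itmmst VALUES ('WIDGET', 'Original company widget')")
    return db

def add_company_key(db):
    """Add a company number and rebuild only the unique index around it."""
    # Existing rows default to company 1 (an assumption for this sketch).
    db.execute("ALTER TABLE itmmst ADD COLUMN imco INTEGER NOT NULL DEFAULT 1")
    # The data stays where it is; only the index definition changes.
    db.execute("DROP INDEX itmmstl1")
    db.execute("CREATE UNIQUE INDEX itmmstl1 ON itmmst (imco, imitem)")
```

After the change, a second company can reuse an existing item number, while a duplicate within the same company is still rejected by the unique index.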
Referencing Fields for Easy Maintenance
After you've designed your database, and before you start entering DDS for your files, create a field reference file to help organize the multitude of fields. A field reference file is a simple concept and a neat way to organize your fields. Create this file the same way you would create a physical file. Include all the fields from your database design, and their sizes and descriptions. Adding default editing and column headings will also help when using Query/400. Compile this file with the MBR(*NONE) option, since this file will not contain any data. Indicate the reference file using the REF(lib/file) keyword in your DDS, and reference each field using the REFFLD(field) keyword (see Figure 3). The reference file will never contain data; it's just a better way to keep your field definitions all in one place.
Now, when I say create a reference field for every field in every file, I don't mean that literally. What I mean is that you should create a field for every unique field. Each invoice number will have only one entry, which will be referenced by the invoice number field in at least your order header and order detail files. This way, if you ever have to make a change to your invoice number, you simply change the reference file and then recompile the DDS for any file that uses that reference field.
Remember that the field reference file is only as helpful as you make it. If you suddenly start referencing fields in this file that were not meant to be used for that purpose, you are setting yourself up for a fall. An example of this might be referencing the invoice number field for the customer number. They're both 7,0, so why not? Well, your company grows, and suddenly you need to expand the customer number to 9,0. See my point? If you reference fields only for their intended purpose, things will work out fine. Use judgment, and don't make decisions in haste.
It may be beneficial to make more than one field reference file. Breaking up reference fields by application may help you make changes that affect one application but not another. (For more information, see Build Your Own Data Dictionary, MC, July
Using Selective Logical Files Limits You
I have never been a big fan of using selective logical files. (Selective logical files select only certain data from the physical file. An example would be a logical file that selects from the item master file only items that have a quantity on hand of greater than zero.) For one thing, anytime this file changes, your system may experience a bigger performance hit because of this dynamic selection. Another problem is that the key to this logical file may be the same as the key to another logical file. I have heard of shops that create a new logical file that's specific for the information they want to select when they want to run a report, instead of using something like Open Query File (OPNQRYF). I have heard of shops that have over 200 logical files over the same physical file. And a lot of those logicals are repetitive because, once a shop has a lot of logical files, it's easier to create a new one than to look for one that might satisfy the selection.
Again, you must use your best judgment. There may be cases when using a selective logical file will seem like the right thing to do, but why take that chance? Let your programs do the selecting.
Using Change Physical File
The Change Physical File (CHGPF) command was updated in V3R2, and the update makes our lives much easier. It allows you to change a file's field descriptions by specifying where to find the new DDS source. To use this command in the simplest case with externally described files, you need to supply only two things: the location of the file to be changed and the location of the DDS to use to change the file.
CHGPF FILE(filename) +
The system takes care of re-creating the physical file, along with any related logical files. In my opinion, it also seems to perform much more quickly than the Copy File (CPYF) command.
(For more information on CHGPF, see "The NEW Dynamics of CHGPF," MC, June 1997.)
Measure Twice, Cut Once
Database design is not something to be done in haste. A good, solid, expandable relational database takes forethought and patience. It is easy to be in a hurry to finish a project and take shortcuts. But it takes longer to fix a problem than it does to simply do the job right the first time. The more Band-Aids you use in database design, or even application design, the more your problems will snowball.
Whenever you hear the phrase "that will never happen," chances are that it will. Keep the future in sight, and never rule out possibilities simply because some situation has never happened before or because you are in a unique business. Not only will you be making your job more difficult in the future, but you will also be limiting the capabilities of your company.
OS/400 DB2/400 Database: An Overview (SC41-3700-00, CD-ROM QBKAUB01)
OS/400 DDS Reference (SC41-3712-00, CD-ROM QBKAUI01)
Order Header File (ORDHDRPF)
     A          R RORDHDR
     A            OHCUST         7P 0       TEXT('CUSTOMER NUMBER')
     A                                      COLHDG('Customer Number')
     A            OHPO          15A         TEXT('PO NUMBER')
     A                                      COLHDG('PO Number')
     A            OHINV          9P 0       TEXT('INVOICE NUMBER')
     A                                      COLHDG('Invoice Number')
     A            OHORDD         8S 0       TEXT('DATE')
     A                                      COLHDG('Order Date')
     A            OHDUED         8S 0       TEXT('DATE')
     A                                      COLHDG('Due Date')
     A            OHSHPM         2A         TEXT('SHIPPING METHOD')
     A                                      COLHDG('Shipping Method')
Order Detail File (ORDDETPF)
     A          R RORDDET
     A            ODINV          9P 0       TEXT('INVOICE NUMBER')
     A                                      COLHDG('Invoice Number')
     A            ODLINE         5P 0       TEXT('LINE NUMBER')
     A                                      COLHDG('Detail Line Number')
     A            ODITEM        15A         TEXT('ITEM NUMBER')
     A                                      COLHDG('Item Number')
     A            ODQTYO        13P 3       TEXT('QUANTITY ORDERED')
     A                                      COLHDG('Quantity Ordered')
     A            ODQTYS        13P 3       TEXT('QUANTITY SHIPPED')
     A                                      COLHDG('Quantity Shipped')
Order Detail Special Instructions File (ODSINSPF)
     A          R RODSINS
     A            DIINV          9P 0       TEXT('INVOICE NUMBER')
     A                                      COLHDG('Invoice Number')
     A            DILINE         5P 0       TEXT('LINE NUMBER')
     A                                      COLHDG('Detail Line Number')
     A            DIISEQ         5P 0       TEXT('INSTRUCTION SEQUENCE')
     A                                      COLHDG('Instruction Sequence')
     A            DITEXT        40A         TEXT('INSTRUCTION TEXT')
     A                                      COLHDG('Instruction Text')
Figure 1: Example of order header, detail, and special instructions DDS
Shipping Method Master File (SHPMSTPF)
     A          R RSHPMST
     A            SMSHPM         2A         TEXT('SHIPPING METHOD')
     A                                      COLHDG('Shipping Method')
     A            SMDESC        30A         TEXT('DESCRIPTION')
     A            SMBCHG         9P 2       TEXT('BASE CHARGE')
     A                                      COLHDG('Base Charge')
Figure 2: Use master files, not RPG select statements, to interpret codes
Reference File (REF)
     A          R RREF
     A            CUST           7P 0       TEXT('CUSTOMER NUMBER')
     A            PO            15A         TEXT('PO NUMBER')
     A            INV            9P 0       TEXT('INVOICE NUMBER')
     A            DATE           8S 0       TEXT('DATE')
     A            SHPM           2A         TEXT('SHIPPING METHOD')
     A            LINE           5P 0       TEXT('LINE NUMBER')
     A            ITEM          15A         TEXT('ITEM NUMBER')
     A            QTY           13P 3       TEXT('QUANTITY')
Order Header File (ORDHDRPF) Using Reference File
     A                                      REF(REF)
     A          R RORDHDR
     A            OHCUST    R               REFFLD(CUST)
     A                                      COLHDG('Customer Number')
     A            OHPO      R               REFFLD(PO)
     A                                      COLHDG('PO Number')
     A            OHINV     R               REFFLD(INV)
     A                                      COLHDG('Invoice Number')
     A            OHORDD    R               REFFLD(DATE)
     A                                      COLHDG('Order Date')
     A            OHDUED    R               REFFLD(DATE)
     A                                      COLHDG('Due Date')
     A            OHSHPM    R               REFFLD(SHPM)
     A                                      COLHDG('Shipping Method')
Order Detail File (ORDDETPF) Using Reference File
     A                                      REF(REF)
     A          R RORDDET
     A            ODINV     R               REFFLD(INV)
     A                                      COLHDG('Invoice Number')
     A            ODLINE    R               REFFLD(LINE)
     A                                      COLHDG('Detail Line Number')
     A            ODITEM    R               REFFLD(ITEM)
     A                                      COLHDG('Item Number')
     A            ODQTYO    R               REFFLD(QTY)
     A                                      COLHDG('Quantity Ordered')
     A            ODQTYS    R               REFFLD(QTY)
     A                                      COLHDG('Quantity Shipped')
Figure 3: Example of reference file using order header and detail files