ID fields are a powerful way to organize your data models, and SQL provides all the tools you need to take advantage of them.
In my previous article, I talked about using two files to monitor the SQL activity on my system. One held the unique SQL statements themselves, while the other held more-granular data for each statement: information about the jobs that used the statement, the user ID, the IP address, and so on. If I kept all of that in one table, I'd be repeating some really large SQL statements over and over again. The idea instead was to create a unique ID field for each new SQL statement and track all the data using that ID as a key.
Primary Keys, the Great Debate!
Before I head into my table design, I just want to introduce a question that relates directly to today's topic: How do you define the primary keys for your database files? The primary key is the one you use to get a unique record in the table; in our RPG world, it's what we use to CHAIN to a master file. It seems simple: The key to the customer file is the customer number, right? Well, as it turns out, there are two schools of thought on how to define a primary key. First is the concept of a "natural key." This is the one most of us are familiar with: using the customer number as the key to the customer table, for example.
However, an alternative exists called an "artificial key" (or sometimes "surrogate key"). This is just a big old integer number that is incremented every time you add a new record. The cool thing about the artificial key is that you can then change the natural key without having to change any other records in your database. For example, let's say you have hundreds of transactions and orders and so on for item ABC, but a company reorganization requires you to change the name of item ABC to XYZ. Well, if you had ABC as the key to all those files, you'd have to change every occurrence of the value ABC to XYZ. But what if every record in the item file had a unique numeric ID? Let's say item ABC was the tenth item you ever added, so its unique ID is 10. The item record would have a primary key of 10, and of course it would also have an item number field with the value ABC. However, any subsidiary records that pointed to item ABC wouldn't have "ABC" in the record. Instead, they'd have a field called ID_ITEM, and it would have 10 in it. Now, all you have to do is go into the item record and change the item number from ABC to XYZ. You don't have to change a single additional record.
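To make that concrete, here's a minimal sketch in SQL. The table and column names (ITEMMAST, ORDERLINE, ITEMNO, and so on) are made up for illustration; only ID_ITEM comes from the description above.

```sql
-- Hypothetical item master: ID is the artificial key, ITEMNO the natural key
create table ITEMMAST (
  ID integer generated always as identity
    (start with 1 increment by 1),
  ITEMNO char(15) not null unique,
  primary key(ID));

-- Hypothetical child table: it points at the item by ID, not by item number
create table ORDERLINE (
  ORDERNO integer not null,
  LINENBR integer not null,
  ID_ITEM integer not null,
  primary key(ORDERNO, LINENBR),
  foreign key(ID_ITEM) references ITEMMAST (ID));

-- Renaming the item touches exactly one row; ORDERLINE is left alone
update ITEMMAST set ITEMNO = 'XYZ' where ITEMNO = 'ABC';
```

The trade-off is that anything reading ORDERLINE now has to join back to ITEMMAST to show the item number, which is part of the added complexity that comes with artificial keys.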
Using artificial keys isn't something to be undertaken lightly; they add complexity, especially in a non-SQL environment. There are other considerations as well, especially having to do with historical data, but those are more application-oriented. I'm not advocating for a wholesale move to artificial keys but instead just explaining the concept because that's what we're going to use for our SQL tables.
Step 1: Using an Identity Column
The first step of this process is creating and using a file with an identity column. The logic is simple: I try to retrieve the ID that corresponds to the SQL statement I have, and if none is found I add a record with the new SQL statement and return the ID of the new record. To do that, I need two things: the DDL that creates the file, and the code that uses that file. Let's start with the DDL.
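-- Create statement file
create or replace table SQLSTM (
  -- Artificial key, assigned by the database
  SSID numeric(5) generated always as identity
    (start with 1 increment by 1),
  -- Text of the SQL statement
  SSSTMT varchar(500) not null with default,
  -- Primary key and record format
  primary key(SSID)) rcdfmt SQLSTMR;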
You'll note my personal style for SQL statements. I'm an old-school RPG programmer, so I still tend to think of my fields as all uppercase. Heck, I even went with six-character field names for this particular project (don't judge me!). Because I like the uppercase field names, I then leave everything else lowercase. That helps the schema, table, and column (library, file, and field for us DDS dinosaurs) names to stand out.
Technically, I could have used long mixed-case names for all of this, but I find there are a number of issues that come into play with unmanaged use of long names. But that's an entirely different topic that we can get into in more detail in another article.
Let's instead review the specific syntax that supports the generated identity field that we will use for our artificial key. The syntax is quite simple. First, define the field as you would any other numeric value. In this case, I defined my SSID field as type numeric to create a zoned decimal field. I specified 5 digits, which tops out at 99,999, because I don't expect to ever have 100,000 unique SQL statements in this particular monitor. The next clause is the meat of the definition: generated always as identity. This says the field is an identity field. You can generate other types of fields, such as timestamps, but identity is what we need for this particular case. The field will automatically get incremented each time you add a new record. The final part of the statement (start with 1 increment by 1) is pretty self-explanatory; it says start with 1 and add 1 for each new row.
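As a quick illustration of that behavior, this sketch adds two rows to SQLSTM without ever mentioning SSID. The statement text is just sample data, and the IDs noted in the comment assume a freshly created table.

```sql
-- SSID is left out of the column list; the database fills it in
insert into SQLSTM (SSSTMT) values('select * from CUSTMAST');
insert into SQLSTM (SSSTMT) values('select * from ITEMMAST');

-- On a freshly created table, the two rows should come back as SSID 1 and 2
select SSID, SSSTMT from SQLSTM order by SSID;
```

Because the column is defined as generated always, an insert that supplies its own SSID value is rejected unless it includes an overriding clause. That keeps the IDs firmly under the database's control.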
Two final nuances: I define SSID as the primary key for the file (which creates an index over that field), and I also change the record format name. That's because the default is to give the record format the same name as the file, which doesn't work particularly well in RPG programs, especially older RPG programs.
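Circling back to the retrieve-or-add logic described at the start of this step, here is one way it could look in plain SQL. Treat it as a sketch rather than the monitor's actual code: the statement text is sample data, and selecting from final table is only one of several ways to get a generated ID back (the IDENTITY_VAL_LOCAL function is another).

```sql
-- Step 1: look for an existing row that holds this statement
select SSID
  from SQLSTM
  where SSSTMT = 'select * from ORDHDR';

-- Step 2: only if step 1 found nothing, add the statement
-- and return the ID the database just generated for it
select SSID
  from final table (
    insert into SQLSTM (SSSTMT)
      values('select * from ORDHDR'));
```

In a real program, the literal would be a host variable, and the second statement would run only when the first one returns no row.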
Step 2: Creating the Detail File
After all of that, creating the detail file is sort of an anticlimax.
-- Create job file
create or replace table SQLJOB (
  -- Fully qualified job name
  SJFQJN char(28) not null with default,
  -- Parsed job name
  SJJOB char(10) not null with default,
  SJUSER char(10) not null with default,
  SJJOBN char(6) not null with default,
  -- User ID and IP address
  SJUSID char(10) not null with default,
  SJCLIP char(15) not null with default,
  -- Unique ID of SQL statement
  SJSQID numeric(5) not null with default,
  -- Accumulated statistics
  SJCPU decimal(20) not null with default,
  SJDISK decimal(20) not null with default,
  -- Audit information
  SJCRTS timestamp not null with default,
  SJLCTS timestamp not null with default,
  SJCHGS numeric(6) not null with default,
  -- Primary key and record format
  primary key(SJFQJN, SJUSID, SJCLIP, SJSQID)) rcdfmt SQLJOBR;
The file is very straightforward. The primary key is the fully qualified job name, the user ID, the IP address, and the SQL statement ID. The job fields are parsed out into separate fields as well because the SJFQJN field is the traditional job format with the embedded slashes, such as 123456/JPLUTA/MYJOB. The other fields are the accumulated statistics and the timestamps. As an additional note, I'm considering using one of the generated-column variations to handle the last-change timestamp. I'll let you know how that works out.
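To show how the two files fit together, here's a small query sketch that pulls the statement text back in through the ID. It uses only the columns defined in the two tables; the choice of columns and the sort order are just for illustration.

```sql
-- List each job's activity alongside the full statement text
select J.SJFQJN, J.SJUSID, J.SJCLIP,
       J.SJCPU, J.SJDISK, S.SSSTMT
  from SQLJOB J
  join SQLSTM S on S.SSID = J.SJSQID
  order by J.SJFQJN, J.SJSQID;
```

The detail rows carry only the five-digit SJSQID, so the large statement text is stored once in SQLSTM no matter how many jobs run it.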
Now we have the statement master and detail files. In the next article, I'll show the programming logic required to maintain these tables. Enjoy!