It was another typical Monday. Three users had changed their passwords on Friday, two toner cartridges needed changing, and there was one request for advice on how to buy a PC. Then Fred called from promotions. “I need a report that shows sales quantity by product, by month, for the last 12 fiscal months,” he said. “You know, a separate line for each item, with sales totals across the page.”
“OK,” I replied, “I can do that.” “I might need to select only certain product types,” he continued, “and sometimes I might only want items with sales under a certain level, and I might want to look at only a single customer or a single sales territory, so can you make it really flexible? Oh, yeah, and I might want to insert the data into a word processing document. Will that be done today?
It’s really important, and I need it right after my golf game.”
“Sure thing, Fred,” I managed to respond, as I wondered if that coffee spray would hurt my monitor.
“Have a nice day,” he crooned. “Gotta go.” Fred thinks the programmer lives in the cabinet under the console.
I collected my wits for a few moments and then got down to business. I decided to give Fred a spreadsheet. That would be flexible enough. He could control when the data gets refreshed. He could filter, sort, and subtotal it any way he wanted, and he could include it in a word-processing document. The raw data would be in the order detail table. But wait a minute, he wants monthly totals by item, with a separate column for each month. That’s a cross-tab. The spreadsheet just got a lot more complicated.
The whirring and clanking of the creative process filled the air, punctuated by the acrid smell of burning coffee from the overheated, empty pot. Self-preservation is a powerful motivator.
To fulfill all of Fred’s criteria, I decided I needed to build an SQL query with a virtual pivot table, which would keep the spreadsheet simple and cut down on network traffic.
“Whoa!” you say. “What’s a virtual pivot table?” Well, here’s the executive overview.
Transactional data is generally stored in a normalized database. A row usually contains an identifying key, such as a document number, and fields that describe the
transaction, such as date, quantity, related revenue and cost information, or maybe even a journal reference and fiscal period. Each row relates to only one transaction.
It’s often easier to interpret the data when a cross-tab format is used to display it. It’s called a cross-tab because related data spreads across, instead of down, the page. A sample cross-tab row might contain an item number and quantity or revenue information by month for each month in the fiscal year. A virtual pivot table denormalizes the data by making it appear as if there is an array of buckets in the database. It’s called a pivot table because (you guessed it) it pivots the orientation of related data from vertical to horizontal.
OK, so back to the case at hand. I know I need figures summarized by item, and I might want to break that apart by customer or sales territory, so my level of granularity will be item within customer within sales territory. If Fred doesn’t care about the customer or sales territory, he can handle that in his spreadsheet. So far, so good. I know that SQL has a SUM function, but how can I get that to add things up by fiscal month?
CASE to the Rescue
IBM calls the answer a computer-aided systems engineering (CASE) statement. I prefer to think of it as a function because, like many SQL statements, it returns a value. This is an extremely powerful characteristic because, as you shall see, it means that CASE statements can be nested inside other functions.
To begin with, I can use a CASE statement for each month to assign the current quantity (QTYSAL) to the correct monthly column. Each row of raw data will match the criteria of exactly one of the CASE statements. The rows that don’t match will each receive a value of zero.
CASE FSCMON WHEN 1 THEN QTYSAL ELSE 0 END, -- Month 1
CASE FSCMON WHEN 2 THEN QTYSAL ELSE 0 END, -- Month 2
...
CASE FSCMON WHEN 12 THEN QTYSAL ELSE 0 END -- Month 12
If I were using a column of the date data type instead of a numeric month field, I would use the MONTH( ) function to extract the month.
Now I can combine CASE with SUM to get the conditional summarization I need. The total of all QTYSAL values from rows where FSCMON = 1 will be retrieved in one result value when I specify the following:
SUM(CASE FSCMON WHEN 1 THEN QTYSAL ELSE 0 END), -- Month 1
I should probably also specify the output format, since the default column size is really only useful if you’re a quantum cosmologist. I’d like a 15-digit packed-decimal value, with four digits after the decimal point, please:
DECIMAL(SUM(CASE FSCMON WHEN 1
THEN QTYSAL ELSE 0 END),15,4), -- Month 1
Now, there’s just one more detail. Remembering that a little programming paranoia is a healthy thing, if I decide to use my view in a left outer join at some point, I should make sure that zero is returned for items that have no transactions. I could do this with a second CASE statement, but I prefer the older and more compact IFNULL or COALESCE functions. Actually, I like IFNULL best because the name tells me what it does. COALESCE sounds vaguely like something that must be requested in triplicate and approved by committee. Both functions will return the first value that is not null.
When I add IFNULL to my construct, it looks like this:
DECIMAL(IFNULL(SUM(CASE FSCMON WHEN 1
THEN QTYSAL ELSE 0 END),0),15,4), -- Month 1
Creating the View Source
SQL does not compile. I can, however, store a single SQL command in a source member and execute it with the Run Structured Query Language Statement (RUNSQLSTM) command. I end up with something roughly analogous to a traditional compile, except that, with RUNSQLSTM, I can do many other things, too. If you don’t have the SQL licensed program product, don’t worry. There are other options. I’ll cover that later.
Take a look at Figure 1. It’s a typical script. Think of it as the source for the view. In section A, I declare my intentions. I’m going to create a view, which is really just an arrival sequence logical file. I’m going to call it FREDSVIEW and put it in library VIEWLIB. I like to use system naming conventions, but you can use the International Organization for Standardization (ISO) convention of name.collection if you prefer. RUNSQLSTM will accept either.
In section B, I list the view’s column names. These would be the alias names in DDS, and they’re what ol’ Fred is going to see, so they’re fairly verbose. Note that the list is enclosed in parentheses.
I get to the SELECT clause in section C. It identifies the columns and columnar functions that will provide the data. There must be exactly one column listed in the SELECT clause for every entry in the column name list. This is where I put the cross-tab incantations over which I’ve just finished laboring so diligently.
I tell SQL which tables contain the data in the FROM clause in section D. In this case, everything is in the INVCDETL table, but it’s usually necessary to code a few joins here.
And finally, in section E, I have the GROUP BY clause. It defines the control breaks that SQL will use when summarizing the data. Here, I am required to list all of the data elements from the SELECT clause that are not being summarized in some way. This means all of the columns except the monthly total values. Sometimes this list gets quite long and seems like a very inefficient use of system resources. Since I always try to write efficient code, I might be tempted to fool SQL by using a MAX or MIN function on columns that aren’t really group control values. I would pay a heavy penalty in response degradation and CPU usage if I did that. SQL has no sense of humor.
I’ve saved my source member as FREDSVIEW, and all I need to do now is to run the script.
RUNSQLSTM? Ernie Should Have Copyrighted It
I’m not talking about the freeware utility that MC published back in the Dark Ages, although it will work if you don’t have SQL on your AS/400. You might also create a Query Management Query (CRTQMQRY) from your source and run that with the STRQMQRY command, or you could use the EXCSQLSTM command that appeared in the December 1994 issue of MC (see “The EXCSQLSTM Utility”). If you’d prefer to keep the source on your PC or a network drive, you could also use the query script editor in Operations Navigator (right click on Database and select Run SQL Scripts).
I’m talking about the true-blue RUNSQLSTM command. The options you select when you run it will often have an impact on how your SQL object behaves. For instance, when you create a stored procedure, you need to specify the date format and user profile that will be used at runtime. I don’t care about that right now, but if I selected the correct combination of options, I could standardize and use it for everything. Take a look at Figure
2. Obviously, I need to tell RUNSQLSTM the name of the source file (PETESLIB/MYSQLSRC) and source member (FREDSVIEW). I always use commitment control in stored procedures, so I specify COMMIT(*CHG), which is also the default. I like *SYS naming (LIBRARY/FILE) and *ISO dates (yyyy-mm-dd). I want to *RUN the script as opposed to just checking it for syntactic correctness. I want the system to allow
the copying of data if necessary (*OPTIMIZE), and I’d like the default user profile (this only applies to stored procedures) to be the requester. If you’re lazy like I am, you might want to create a PDM user option to make things really easy in the future. Start PDM (STRPDM) and select option 9, Work with user-defined options, press F6 = Create, type in a convenient code, and enter the command. You can even use the command prompter by typing RUNSQLSTM and pressing F4 = Prompt.
So, without further ado, give it the old smoke test.
The Object Emerged as the Mists Slowly Cleared
RUNSQLSTM produces a nice diagnostic report for you, so if the command doesn’t execute successfully for some reason, it’s not too difficult to tell why. Mismatched view column and select column entries and missing or mismatched punctuation are common errors. This time, we did everything correctly the first time. Fred can stop worrying. You know how that throws off his golf game.
Just to be sure everything was OK, I ran a couple of quick queries. First, I got a list of all of the SKATEKEY items that were sold to customer 113242 in 1999 and pressed the print key (Figure 3). Then I requested the same list from FREDSVIEW. SQL returned one row, which I have pasted on several lines because you can’t see it when I scroll my monitor to the right (Figure 4). Notice that the raw data contains no transaction for month 7 and two transactions for month 8. The view has zeros in QTY_SOLD_MONTH_07 and has added both of the month 8 rows into QTY_SOLD_MONTH_08. All I need now is a simple spreadsheet, and I’ll be home before the WWF gets started. Isn’t it incredible that we can get paid for having this much fun?
A Few Tips and Gotchas
Virtual pivot tables require system processing overhead. They are very efficient because they require only one pass through the data, but the values that your pivot table returns are not stored. They’re calculated at runtime.
After you build any new view, turn on debug and query it interactively a few times with some common WHERE clauses. Look at the job log and see if the query optimizer has asked you to build any indexes. Having the correct indexes can make a big difference in performance. A poorly performing query can make your whole system sluggish.
You can use mixed-case and even embedded blanks in your object names, and they can be up to 128 characters in length. The AS/400 handles this just fine, as do some—but not all—of the third-party products that access your AS/400 data with ODBC. Microsoft Query is inconsistent. Sometimes it allows mixed-case names enclosed in quotes, and other times it doesn’t. Seagate Software’s Crystal Reports handles mixed-case names just fine but has trouble with embedded blanks. Seagate’s online analytical processing (OLAP) tool doesn’t work with mixed-case, quotes, or embedded blanks. Mixed-case can greatly enhance readability, but if you want to be absolutely safe, stick to uppercase 10-character names.
REFERENCES AND RELATED MATERIALS
• Optimizing Transact-SQL: Advanced Programming Techniques. David Rozenshtein, Eugene Birger, and Anatoly Abramovich. Fremont, California: SQL Forum Press, 1995
• The Essence of SQL: A Guide to Learning Most of SQL in the Least Amount of Time. David Rozenshtein Ph.D. Fremont, California: SQL Forum Press, 1998
CREATE VIEW VIEWLIB/FREDSVIEW
(ITEM_ID,
TERRITORY,
CUSTOMER_ID,
FISCAL_YEAR,
QTY_SOLD_MONTH_01,
QTY_SOLD_MONTH_02,
...
QTY_SOLD_MONTH_12)
AS SELECT
ITMNBR,
SLSTRT,
CUSTNO,
FSCYR,
DECIMAL(IFNULL(SUM(CASE FSCMON WHEN 1
THEN QTYSAL ELSE 0 END),0),15,4), -- Month 1
DECIMAL(IFNULL(SUM(CASE FSCMON WHEN 2
THEN QTYSAL ELSE 0 END),0),15,4), -- Month 2
...
DECIMAL(IFNULL(SUM(CASE FSCMON WHEN 12
THEN QTYSAL ELSE 0 END),0),15,4) -- Month 12
FROM INVCDETL
GROUP BY
ITMNBR,
SLSTRT,
CUSTNO,
FSCYR
A D
E
B
C
Figure 1: Once you know how, constructing the source is easy.
RUNSQLSTM SRCFILE(PETESLIB/MYSQLSRC)
SRCMBR(FREDSVIEW)
COMMIT(*CHG)
NAMING(*SYS)
DATFMT(*ISO)
DATSEP(-)
PROCESS(*RUN)
ALWCPYDTA(*OPTIMIZE)
USRPRF(*USER)
DYNUSRPRF(*USER)
Figure 2: Create the view with RUNSQLSTM.
INVNBR INVLIN ITMNBR QTYSAL UPRICE CUSTNO SLSTRT FSCYR FSCMON SLSJRN
55,121 1 SKATEKEY 500.0000 17.4320 113,242 115 1,999 1 INV62949 56,424 1 SKATEKEY 295.0000 17.4320 113,242 115 1,999 2 INV63841 57,292 1 SKATEKEY 621.0000 17.4320 113,242 115 1,999 3 INV68121 58,104 1 SKATEKEY 550.0000 17.4320 113,242 115 1,999 4 INV69101 59,317 1 SKATEKEY 319.0000 17.4320 113,242 115 1,999 5 INV69914 59,821 1 SKATEKEY 604.0000 17.4320 113,242 115 1,999 6 INV70153 61,002 1 SKATEKEY 515.0000 17.4320 113,242 115 1,999 8 INV76929 62,814 1 SKATEKEY 522.0000 17.4320 113,242 115 1,999 8 INV77484 63,181 1 SKATEKEY 504.0000 17.4320 113,242 115 1,999 9 INV79617 64,209 1 SKATEKEY 485.0000 17.4320 113,242 115 1,999 10 INV80112 65,315 1 SKATEKEY 546.0000 17.4320 113,242 115 1,999 11 INV91115 66,444 1 SKATEKEY 509.0000 17.4320 113,242 115 1,999 12 INV93114 select * from invcdetl where itmnbr='SKATEKEY' and custno = 113242 and fscyr = 1999 order by invnbr
Figure 3: List the test data set.
ITEM_ID TERRITORY CUSTOMER_ID FISCAL_YEAR
SKATEKEY 115 113,242 1,999 QTY_SOLD_MONTH_01 QTY_SOLD_MONTH_02 QTY_SOLD_MONTH_03 500.0000 295.0000 621.0000 QTY_SOLD_MONTH_04 QTY_SOLD_MONTH_05 QTY_SOLD_MONTH_06
550.0000 319.0000 604.0000 QTY_SOLD_MONTH_07 QTY_SOLD_MONTH_08 QTY_SOLD_MONTH_09 .0000 1,037.0000 504.0000 QTY_SOLD_MONTH_10 QTY_SOLD_MONTH_11 QTY_SOLD_MONTH_12
485.0000 546.0000 509.0000
select * from fredsview where item_id =’SKATEKEY’ and customer_id = 113242 and fiscal_year = 1999
Figure 4: Compare the view with the raw data.
LATEST COMMENTS
MC Press Online