With V5R4, IBM continues its expansion of Structured Query Language (SQL) support inside of DB2 Universal Database (DB2 UDB) for iSeries. These enhancements include additional scalar and aggregate functions, new data types, support for online analytical processing (OLAP), and support for embedded SQL in free-format ILE RPG. In this article, we'll explore these changes and see how they'll make your life easier.
Special Registers and Scalar and Aggregate Functions
SQL's new functionality includes four new special registers. Special registers are areas for storing and retrieving defined values for use within other SQL statements.
The new SESSION_USER and SYSTEM_USER special registers are both used to store and retrieve authorization information. The SESSION_USER special register is equivalent to the preexisting USER special register and has been added for increased compatibility with other SQL implementations. Initially, the value of SESSION_USER will be equal to the SYSTEM_USER special register, which contains the user ID used to connect to the current server. The SESSION_USER special register, however, can be changed at runtime using the SET SESSION AUTHORIZATION statement.
The CURRENT DEBUG MODE special register controls the level of debugging supported when procedures are created. Possible values for this special register are DISALLOW, ALLOW, and DISABLE. A value of DISALLOW indicates that debugging within the Unified Debugger should not be allowed. This value can be overridden by the SET OPTION DEBUG MODE statement on the CREATE PROCEDURE statement. A value of ALLOW indicates that the procedure should be allowed to be debugged within the Unified Debugger. Again, this value can be overridden using the SET OPTION DEBUG MODE statement. The value of DISABLE indicates that the procedure should be created to not allow debugging from within the Unified Debugger and that the SET OPTION DEBUG MODE statement will not override this.
The CURRENT DEGREE special register defines the level of I/O or SMP parallel processing to be used. This value is set using the SET CURRENT DEGREE x statement, where x is either a value from 1 to 32767 or one of several special values. A value of ANY indicates that the system should determine the level of parallel processing. NONE indicates that no parallel processing should be allowed. A value of MAX also allows the system to determine the level of parallel processing, but it assumes that all active memory in the pool can be used. Finally, a value of IO indicates that no SMP parallel processing should be allowed but that any number of IO parallel tasks can occur.
V5R4 also contains several new scalar date functions. ADD_MONTHS can be used to obtain a date that is the result of a given date plus a specified number of months. The first parameter used with this function represents the date to which the months should be added. The value must be a date field, a timestamp field, or a character field that evaluates to a date. The second parameter defines the number of months to be added to the date supplied on the first parameter, as shown below:
The values returned by this statement would be a date value representing the date 2/28/2006 and a date value representing the date 12/1/2005. Note that a negative value can be used to effectively subtract months rather than add them. Also note that if the day portion of the date provided is greater than the last day of the resulting month, the last day of that month is returned.
The LAST_DAY function accepts a single parameter containing a date value, a timestamp value, or a character string that evaluates to a date. This function returns the last day of the month for the provided date, as shown here:
When executed, this statement returns a date value representation of the date 01/31/2006.
The NEXT_DAY function returns the next date associated with a specified day of the week following a provided date. The first parameter identifies the starting date. The second parameter identifies the name of day of the week to be returned. Valid values are the full day name ('MONDAY', 'TUESDAY', etc.) or the proper three-letter abbreviation ('MON', 'TUE', etc.). The example below illustrates this function's use:
When executed, this example returns a date of 2/15/2006, which is the next Wednesday after 2/8/2006.
The VARCHAR_FORMAT function converts a supplied timestamp value or a string representation of a timestamp value into a VARCHAR representation of the proper timestamp format. The first parameter supplies the input timestamp value, while the second parameter identifies the format of the resulting timestamp in VARCHAR format. This value must be specified as 'YYYY-MM-DD HH24:MI:SS'. In this string, YYYY represents the four-digit year, MM represents the two-digit month, DD represents the two-digit day, HH24 represents the two-digit hour in 24-hour format, MI represents the minute portion of the timestamp, and SS represents the seconds portion of the timestamp. Below is a sample of this function's use:
When executed, this statement returns the value 2006-01-01 09:00:00. Note that the leading 0's are inserted in each portion of the timestamp.
The ENCRYPT_TDES function gives the user the ability to encrypt field data within an SQL statement. Like its predecessor, the ENCRYPT_RC2 function, ENCRYPT_TDES accepts a parameter containing the string data to be encrypted in BINARY, VARBINARY, CHAR, VARCHAR, GRAPHIC, VARGRAPHIC, VARCHAR FOR BIT DATA, BLOB, CLOB, or DBCLOB formats. The function also accepts two optional parameters, which are used to supply an encryption password and a password hint. If the password value is not provided, the data is encrypted using the default encryption password, which is set using the SET ENCRYPTION PASSWORD statement. Unlike the existing ENCRYPT_RC2 function, ENCRYPT_TDES uses the Triple DES encryption algorithm. To populate a table field with the result of the ENCRYPT_TDES function, the field must be defined as BINARY, BLOB, CHAR FOR BIT DATA, VARBINARY, or VARCHAR FOR BIT DATA. The length of the field definition must also take into consideration additional bytes required to store the password data. This is accomplished by taking the length of the data to be encrypted, adding 16 bytes for the password, rounding the result up to the next multiple of 8, and finally adding the number of bytes required for the password hint.
The GENERATE_UNIQUE function generates a unique 13-byte bit character string. This unique value is generated using a combination of the Universal Time Coordinated (UTC) and the iSeries serial number. The value returned will be unique each time the function is executed. A column within a table that contains a GENERATE_UNIQUE value must be defined as CHAR(13) FOR BIT DATA. The resulting value can also be used along with the TIMESTAMP function to determine the time when the data was updated.
The RAISE_ERROR function allows a user to return a specified error. The first parameter contains a five-character string that evaluates to the SQL State associated with the error. The second parameter contains a string with a maximum length of 70 bytes, which contains a diagnostic string related to the error. This statement is generally used within SQL procedural language. It cannot be used in circumstances where parameter markers are not allowed, such as from within interactive SQL, without using a CAST specification to define a data type for the function's results. This is because the RAISE_ERROR's data type is undefined.
In addition to these scalar functions, aggregate functions have been added. The STDDEV_SAMP function returns the sample standard deviation of a set of numbers. This value is basically the result of this formula:
That is the square root of the sample variance for the set of numbers. The STDDEV_SAMP function accepts a single parameter containing the numeric value on which the sample standard deviation is to be calculated.
The new VAR_SAMP function, shown above, calculates the sample variance of a set of numbers. As shown earlier, the single parameter identifies the numeric field on which the VAR_SAMP results are to be based. Below is the formula used to calculate the sample variance:
Online Analytical Processing (OLAP)
OLAP functionality has been added to SQL in V5R4. This ability extends the existing aggregate functionality within SQL by adding ranking and row numbering capabilities. This functionality allows for greater ability to analyze data using only SQL. The code below shows the syntax used to accomplish this:
RANK() OVER(PARTITION BY part defn ORDER BY sort order [ASC, DESC]) AS RANK_VAL,
DENSE_RANK() OVER(PARTITION BY part defn ORDER BY sort order [ASC, DESC]) AS DNS_RNK_VAL,
ROW_NUMBER() OVER(PARTITION BY part defn ORDER BY sort order [ASC, DESC]) AS ROW_VAL
FROM table name
The RANK and DENSE_RANK specifications allow you to retrieve a ranking within a defined group of fields for the current record. The field definition is based on the list of fields defined on the ORDER BY clause. The PARTITION BY clause can also be included to define partitioning of the ranking data. The statement below gives an example of this:
RANK() OVER(PARTITION BY DIVISION ORDER BY SALES$ DESC) AS SALES_RANK
ORDER BY DIVISION, RANK() OVER(PARTITION BY DIVISION ORDER BY SALES$ DESC)
In this example, the data will be ranked by sales dollars within each division. The numbering returned will represent the ranking within the PARTITION BY value specified. Note that we've used the RANK() specification again within the ORDER BY clause for our SELECT statement to indicate that the result set should be sorted by the sales dollars. The way that ranking is determined when the ORDER BY values are identical depends on whether you use RANK or DENSE_RANK. While RANK will assign identical values the same rank, it also increments the "rank counter," meaning that if the top five records all have the same ORDER BY value, they will all appear as the number 1 ranking, but the sixth record will have a ranking of 6. If DENSE_RANK is used, the first five records will still have a ranking of 1, but the next record will have a ranking of 2.
The ROW_NUMBER specification is similar to the RANK specification. It returns the sequential row number within the defined partition, based on the ORDER BY clause. ROW_NUMBER, however, does not group like values as the RANK and DENSE_RANK specifications do. Below is a sample of using ROW_NUMBER:
ORDER BY DEPT, LNAME
This example displays a row number within a DEPT group based on the value of the LNAME field. The ASC and DESC modifiers can be added to the ORDER BY clause to identify ascending or descending order. The optional NULLS FIRST and NULLS LAST values can be specified to force any null values to the beginning or end of the result set. The ORDER OF clause can be used in place of the ORDER BY clause to identify that the data set should be ordered using the sort order of the table specified.
ORDER BY DEPT, LNAME
In this example, the value PAYROLL identifies that the order of our data should be the same sort order defined on the PAYROLL table.
Free-Form ILE RPG Embedded SQL
If you're an "embedded SQL junkie" like I am, this is the language enhancement you've been waiting for. Prior to V5R4, in order to execute an embedded SQL statement from a free-form program, you had to exit from free-form using the /END FREE compiler directive and then enter embedded SQL using the /EXEC SQL compiler directive. In V5R4, IBM has simplified the process and in turn made embedding SQL into an RPG program even more seamless. Below is a snippet of code showing how a free-form ILE RPG program with embedded SQL would have to be coded prior to V5R4:
SQL_Str = 'SELECT NAME FROM CUSTOMERS , ' +
' WHERE STATE= ? AND ACTBAL>0 ';
C+ PREPARE SQL_STMT FROM :SQL_Str
C+ DECLARE CUST_Crsr CURSOR FOR SQL_STMT
C+ OPEN CUST_Crsr USING :STCD
C+ FETCH CUST_Crsr INTO :CUSTNAME
DoW SQLCOD <> 100 and SQLCOD >= 0;
You'll notice that each instance of switching from free-form to SQL requires at least two additional lines of code, not to mention the fact that we need to use the /EXEC SQL and /END-EXEC directives for each SQL statement we want to execute. This means that a single line of SQL takes up three lines of code. In V5R4, IBM has rectified this problem. The code below shows how the previous example can be coded in V5R4:
SQL_Str = 'SELECT NAME FROM CUSTOMERS , ' +
' WHERE STATE= ? AND ACTBAL>0 ';
EXEC SQL PREPARE SQL_STMT FROM :SQL_Str;
EXEC SQL DECLARE CUST_Crsr CURSOR FOR SQL_STMT;
EXEC SQL OPEN CUST_Crsr USING :STCD;
EXEC SQL FETCH CUST_Crsr INTO :CUSTNAME;
DoW SQLCOD <> 100 and SQLCOD >= 0;
As you can see, this much simpler method of coding embedded SQL not only reduces the number of code lines required, but actually makes the code much easier to read. Once again, this indicates IBM's commitment to the further use of SQL as well as the continued development of the RPG language.
DB2 UDB and SQL
With the continued evolution of SQL on the iSeries, IBM has shown its eagerness to keep DB2 UDB for the iSeries in line not only with versions of DB2 for other platforms, but also with other implementations of SQL. One can only wonder what goodies they'll have for us in the next release.