
TechTip: A Hidden Tax with SQL Programs and SQL Packages


If your environment shares one set of embedded SQL programs or SQL packages among multiple library lists, you may be paying a performance penalty for access plan rebuilds.

Thanks to the spectacular library list concept, it's common to have multiple instances of application data libraries on the IBM i. Further, a typical IBM i application stores its programs in one or more object libraries, separate from the data libraries. This configuration allows a single set of application programs to operate on multiple, independent data libraries.


Developers who know the power of the library list perform due diligence to make sure their SQL code can run over multiple sets of libraries (such as dev, test, and multiple production libraries) so that only one copy of the program is required. However, if you share SQL code containing prepared statements (such as embedded SQL, persistent stored modules (PSMs, aka SQL routines), prepared dynamic statements stored in an SQL package, etc.), there may be a subtle performance tax lurking.


For the record, the term program used herein refers to any DB2 for i embedded SQL program that contains SQL statements that have a reusable access plan and use the system (*SYS) naming convention. This list includes SQL-based triggers, functions, and stored procedures because, under the covers, they are created as C embedded SQL programs.
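As a quick illustration of the *SYS naming requirement, an embedded SQL RPG program is typically compiled with the system naming convention so that unqualified table names are resolved through the library list. This is only a sketch; the library and source file names are placeholders:

```
CRTSQLRPGI OBJ(APPOBJ/ORDERSR)
           SRCFILE(APPOBJ/QRPGLESRC)
           OPTION(*SYS)
```

With *SQL naming, unqualified objects would instead be resolved using the default schema, which defeats the shared-program-library arrangement described above.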


To illustrate the hidden tax problem, consider the following simple application library lists:


Application Library List #1

QTEMP
APPDATA  (data library for company #1 containing tables, views, and data areas)
APPOBJ   (program object library containing embedded SQL programs)
QGPL

Application Library List #2

QTEMP
APPDATA2 (data library for company #2 containing tables, views, and data areas)
APPOBJ   (program object library containing embedded SQL programs)
QGPL

The application programs in library APPOBJ are shared in common and can run against one of two data libraries: APPDATA or APPDATA2. A new embedded SQL RPG program called ORDERSR is introduced into APPOBJ and is called with library list #1 in effect. The program contains one prepared SQL query.

After the program finishes, you execute the Print SQL Information (PRTSQLINF) command against the program object (PRTSQLINF ORDERSR *PGM). Inside the command's spool file output, you find an SQL4021 message like this:


SQL4021 Access plan last saved on 02/09/14 at 14:21:22.


A saved access plan is a good thing because the next time the program runs, DB2 normally won't take the time to build it from scratch. If you run the program again with the same library list, DB2 will not resave the access plan associated with the SQL statement in the program.


Next, you change the job's library list to use library list #2 (with the data library changing from APPDATA to APPDATA2) and call the program again. Running PRTSQLINF again shows message SQL4021 again, but this time with a more recent timestamp than the original:

SQL4021 Access plan last saved on 02/09/14 at 14:24:30.


What happened? Why was the access plan saved again?


The answer is that DB2 detected that the tables it's now querying with library list #2 are different from the ones it started with in the original access plan. Even though the tables and indexes may be structurally identical between libraries APPDATA and APPDATA2, DB2 (for various reasons) has to rebuild the query's access plan.

The point is that the rebuild process takes time for DB2 to complete. For programs that are called relatively infrequently, it's probably not a big deal. But, if your environment features the same SQL programs being called rapidly and repeatedly against several alternative data libraries, there is a tax on DB2 to constantly rebuild and save the access plans.


In this situation, one possible solution would be to create separate program libraries for SQL objects (a companion SQL program library for each data library). This would allow each data set to have its own saved access plans, avoiding the cost of the rebuilds. The drawback of this configuration is that every time a program is created or changed, it needs to be deployed to multiple libraries.
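With companion program libraries, the deployment step can at least be scripted. A minimal sketch, assuming a hypothetical second program library named APPOBJ2:

```
/* Duplicate the changed program into each companion program library. */
/* Each copy then carries its own saved access plans.                 */
CRTDUPOBJ OBJ(ORDERSR) FROMLIB(APPOBJ) OBJTYPE(*PGM) TOLIB(APPOBJ2)
```

Each application library list would then name its own program library (APPOBJ or APPOBJ2) instead of sharing one.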


For a second solution, IBM's Scott Forstie said his approach to tackle this problem would be to change the static SQL program to use dynamically prepared SQL statements (using PREPARE rather than EXECUTE IMMEDIATE). Apparently, this will perform better, and the only drawbacks are that dynamic SQL statements can be a problem to manage and you lose the convenience of having newer versions of DB2 for i automatically catalog your static statements (see below). Thanks for the advice, Scott!
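A minimal sketch of that rework, assuming a cursor over the same AdventureWorks-style tables used later in this tip: the cursor is declared over a statement name, the text is prepared with parameter markers, and the values are supplied on OPEN. (The column list is abbreviated and illustrative.)

```
-- Inside an SQL procedure body (*SYS naming assumed)
DECLARE StmtTxt VARCHAR(500);
DECLARE Orders CURSOR WITH RETURN TO CLIENT FOR DynStmt;

SET StmtTxt='SELECT SalesOrderId,PurchaseOrderNumber,Status '
     CONCAT 'FROM SalesOrderHeader '
     CONCAT 'WHERE OrderDate BETWEEN ? AND ? AND CustomerId=?';

-- Prepare the statement, then open the cursor with the
-- current parameter values.
PREPARE DynStmt FROM StmtTxt;
OPEN Orders USING StartDate,EndDate,CustomerId;
```

Because the statement is dynamic, its access plan is managed in the plan cache rather than being saved into the program object, which sidesteps the resave-on-library-change behavior shown above.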

Using the New Program and Package Statement Views in IBM i 7.1 TR8

So how do you know if you have this access plan rebuild problem? It's hard to tell, especially because PRTSQLINF is limited in what it reports about access plan updates. With PRTSQLINF, you can tell that an access plan was updated recently, but not how frequently.


Starting in IBM i 7.1 TR8 (group PTF level 29), two new catalog views were added to the QSYS2 schema: SYSPROGRAMSTMTSTAT and SYSPACKAGESTMTSTAT. These views return information about the individual statements within an embedded SQL program/PSM or SQL package, with a row for each static statement stored in every program and package on the IBM i. Of interest to this topic is a column called NUMBER_REBUILDS, which can be used to determine whether DB2 is spending precious time rebuilding an access plan.


To illustrate how to find unwanted access plan rebuilds, consider an SQL procedure that runs a prepared SQL statement. When this procedure is run alternately against multiple data libraries, the SYSPROGRAMSTMTSTAT view can be queried to check whether DB2 is rebuilding the access plan:


Step 1: Build an SQL stored procedure with a prepared SQL statement:


CREATE OR REPLACE PROCEDURE DEV.ORDERS_SQL
(IN StartDate  TIMESTAMP,
 IN EndDate    TIMESTAMP,
 IN CustomerId INT)
LANGUAGE SQL
SPECIFIC DEV.ORDERS_SQL
DETERMINISTIC
RESULT SETS 1
SET OPTION DATFMT=*ISO,USRPRF=*OWNER,COMMIT=*NONE
BEGIN
-- NOTE: Make sure that *SYS naming is used when creating this procedure
DECLARE Orders CURSOR WITH RETURN TO CLIENT FOR
SELECT soh.SalesOrderId,CarrierTrackingNumber,OrderQty,
       UnitPrice,UnitPriceDiscount,soh.ModifiedDate,
       CAST(OrderDate AS DATE) OrderDate,
       CAST(DueDate AS DATE) DueDate,
       CAST(ShipDate AS DATE) ShipDate,
       Status,
       PurchaseOrderNumber,
       AccountNumber,CustomerId,
       p.Name AS ProductName,p.ProductNumber,p.Color
FROM SalesOrderDetail sod
JOIN SalesOrderHeader soh ON soh.SalesOrderId=sod.SalesOrderId
JOIN Product p ON p.ProductId=sod.ProductId
WHERE soh.OrderDate BETWEEN StartDate AND EndDate
  AND soh.CustomerId=CustomerId
FOR READ ONLY;

OPEN Orders;

SET RESULT SETS CURSOR Orders;
END;


Step 2: In "Run SQL Scripts" call the procedure several times while changing the data library in the library list:


CL: CHGLIBL (ADVWORKS);
CALL DEV.ORDERS_SQL('2001-01-01-00.00.00','2005-01-01-00.00.00',676);
CALL DEV.ORDERS_SQL('2001-01-01-00.00.00','2005-01-01-00.00.00',670);
CL: CHGLIBL (ADVWORKS08);
CALL DEV.ORDERS_SQL('2002-01-01-00.00.00','2009-01-01-00.00.00',30070);
CL: CHGLIBL (ADVWORKSTS);
CALL DEV.ORDERS_SQL('2003-01-01-00.00.00','2004-05-01-00.00.00',670);


Step 3: Query the SYSPROGRAMSTMTSTAT view to find the rebuild count:

SELECT *
FROM QSYS2.SYSPROGRAMSTMTSTAT
WHERE PROGRAM_SCHEMA='DEV'
  AND PROGRAM_NAME='ORDERS_SQL'
  AND NUMBER_REBUILDS>0;


Assuming the procedure was called four times as shown in Step 2, the SYSPROGRAMSTMTSTAT view reveals that the procedure was executed four times and the plan was rebuilt three times (once for each library list change). Of course, the output of PRTSQLINF can also show when the access plan was last updated, although it doesn't give a count of how many times it was rebuilt.


The abridged query results from Step 3 are shown here:


PROGRAM_NAME  NUMBER_TIMES_EXECUTED  NUMBER_REBUILDS  LAST_QDT_REBUILD_REASON  STATEMENT_TEXT
ORDERS_SQL    4                      3                B6                       DECLARE ORDERS CURSOR …


The last reason (B6) for the plan being rebuilt is:


B6 - The query was reoptimized because the value of a host variable changed and the access plan is no longer valid


The reason codes in column LAST_QDT_REBUILD_REASON, which explain why the access plan was rebuilt, are documented by IBM. (Interestingly, B6 is the reason given for the access plan rebuild even if the host variable values are the same across library sets.)

There are many other useful columns contained in this view (such as cumulative rows affected, module name, host variable count, etc.), so please check out the references at the end of this tip. To recap, SYSPROGRAMSTMTSTAT contains run-time information about all of the static prepared statements in all of the embedded SQL programs on the IBM i. That is extremely useful to a DBA or developer who is searching for queries that return large row sets or take a long time to run.


The same "rebuild access plan" demonstration can be made for SQL packages. Consider the following embedded SQL RPG code snippet that connects to a remote system (using DRDA) and runs a parameterized INSERT statement against one of three application data libraries. When this program (named PKGTESTR) is called, an SQL package object (*SQLPKG) will be created on the remote system (SYS520DEV). Unless overridden, the SQL package name and schema on the remote system will be the same as the program name and schema from the local system. The primary purpose of the remote SQL package is to store the access plan for various prepared SQL statements.


Exec SQL CONNECT TO SYS520DEV;
Exec SQL SET CONNECTION SYS520DEV;

StartStamp=%TimeStamp;
For i=1 To 2000;
   If %Rem(i:3)=0;
       Exec SQL CALL QCMDEXC ('CHGLIBL (ADVWORKS)');
   ElseIf %Rem(i:2)=0;
       Exec SQL CALL QCMDEXC ('CHGLIBL (ADVWORKS08)');
   Else;
       Exec SQL CALL QCMDEXC ('CHGLIBL (ADVWORKSCP)');
   EndIf;
   Exec SQL
       INSERT INTO DATA (C1,C2)
       SELECT PRODUCTID,COUNT(*) NO_TRANS
           FROM TRANSACTIONHISTORY
           WHERE TRANSACTIONDATE BETWEEN :START AND :END
           GROUP BY PRODUCTID
           ORDER BY NO_TRANS DESC
           FETCH FIRST 3 ROWS ONLY;
   If SQLCode<*Zero;
       Dsply SQLCode '';
   EndIf;
EndFor;
EndStamp=%TimeStamp;

Exec SQL DISCONNECT SYS520DEV;


The INSERT statement is run 2,000 times, executing alternately against one of three libraries. After running the program from the local system, the SYSPACKAGESTMTSTAT view can be queried on the remote system to see how many rebuilds occurred.


SELECT *
FROM SYS520DEV.QSYS2.SYSPACKAGESTMTSTAT
WHERE PACKAGE_NAME='PKGTESTR'
  AND NUMBER_REBUILDS>0;


The number of rebuilds in the package for the INSERT statement on the remote system shows 4000, a number higher than I expected. Perhaps it has something to do with the association between the SQL package and the embedded SQL program. The query against SYSPACKAGESTMTSTAT also showed the statement was executed 2024 times (instead of 2000), so there may be some extra things going on in the DRDA world that I don't understand. But the point is that the view reveals there are quite a few rebuilds happening.


Remember, there are legitimate reasons, such as a significant row count change in a table or a new index, that can cause DB2 to rebuild a query plan. Therefore, having some rebuilds over time shouldn't be surprising. But a significant ratio of rebuilds to executions can reveal areas where DB2 may be spending too much time on rebuilds.

A query against these new views can be used to find, for example, all packages and programs containing statements with a rebuild-to-execution ratio greater than 10 percent:

SELECT PACKAGE_SCHEMA,PACKAGE_NAME,NUMBER_TIMES_EXECUTED,
       NUMBER_REBUILDS,STATEMENT_TEXT,LAST_QDT_REBUILD_REASON
FROM QSYS2.SYSPACKAGESTMTSTAT
WHERE CASE WHEN NUMBER_TIMES_EXECUTED=0 THEN 0
           ELSE DEC(NUMBER_REBUILDS,9,0)/NUMBER_TIMES_EXECUTED END>.1
  AND PACKAGE_SCHEMA<>'QSYS'
UNION ALL
SELECT PROGRAM_SCHEMA,PROGRAM_NAME,NUMBER_TIMES_EXECUTED,
       NUMBER_REBUILDS,STATEMENT_TEXT,LAST_QDT_REBUILD_REASON
FROM QSYS2.SYSPROGRAMSTMTSTAT
WHERE CASE WHEN NUMBER_TIMES_EXECUTED=0 THEN 0
           ELSE DEC(NUMBER_REBUILDS,9,0)/NUMBER_TIMES_EXECUTED END>.1
  AND PROGRAM_SCHEMA<>'QSYS';


In summary, it's customary to share a single program library over many data library sets. Unlike other database platforms, the library list concept available to DB2 for i is convenient and prevents unnecessary duplication of program code. However, there is a slight drawback: changing the library list can cause an access plan rebuild for prepared statements.

How much of an impact this has on a system depends on its horsepower (or CPW), the frequency of invocations of the trigger/function/procedure, the complexity of the statement(s) being rebuilt, and the frequency of library set switches. Unfortunately, I'm not familiar with any place in the DB2 world that indicates how much time DB2 spent on an access plan rebuild, so the actual cost is hard to quantify.


The rebuild-related columns in the new SYSPROGRAMSTMTSTAT and SYSPACKAGESTMTSTAT views can help identify which procedures and packages have SQL statements that are undergoing frequent rebuilds. If you suspect these rebuilds are causing a performance drag, potential solutions include distributing multiple copies of the embedded SQL program or, better, changing the program to dynamically prepare the SQL statement(s).

References

QSYS2.SYSPACKAGESTMTSTAT (IBM Documentation)
QSYS2.SYSPROGRAMSTMTSTAT (IBM Documentation)

Michael Sansoterra is a DBA for Broadway Systems in Grand Rapids, Michigan.

