I do a lot of Visual Basic (VB) and Active Server Pages (ASP) programs that talk to the AS/400, and I am constantly consulting with clients to help them improve the speed of their ASP and SQL AS/400 applications. Too often, I see people ignoring some of the built-in performance-enhancing capabilities of the AS/400 because they do not know about them.
So, this month, I will show you a technique for garnering statement reuse with your SQL queries and how to minimize communications with your AS/400 when executing SQL queries and AS/400 stored procedures. The technique is parameters!
Queries and the AS/400 Query Optimizer
To understand the impact of parameterized queries on performance, you must first understand the AS/400 query optimizer and an extremely bad programming technique I call SQL-mashing. Before an SQL statement is run, the query optimizer program looks at the SQL statement and determines what it thinks is the fastest way to get the requested data. This takes time. The SQL-mashing programming technique involves concatenating together string pieces in RAM to make a complete SQL query statement, and then passing the constructed query to the AS/400. When done over and over again, this is bad. I will walk you through an example of SQL-mashing and discuss its adverse effect on performance.
Imagine an order entry application with a parts master file. If a user types in a part ID, the application displays the part description. The application makes its SQL statement by mashing the part ID into the SQL statement string to create something that looks like the following:
SELECT PDESC FROM PARTS WHERE PARTID = 'P-123-45'
When the AS/400 receives this statement, the AS/400 query optimizer evaluates the SQL statement and chooses the best way to access the data in the PARTS file. Now, consider what happens if the operator's next action is asking for part P32123-32. The SQL-mashed statement now looks something like this:
SELECT PDESC FROM PARTS WHERE PARTID = 'P32123-32'
A human looking at the statements knows that they are both doing the equivalent operation, just with different key values. The AS/400 may not recognize that both statements are doing the same thing. It only recognizes that another statement is being executed and that the statement has a different string length than previous statements, so it re-invokes the optimizer to find the access path. Optimization of SQL statements can take a long time and can cause a lot of processor overhead, especially if a lot of logical files are based on the physical file you are accessing. This is because each time the optimizer looks at an SQL query, it must look at each logical file to determine which is the best access path to the data. This involves looking at the system catalog files to determine which logicals are attached to the physical file(s), their key order, and the number of unique values in each key. Looking at these catalogs and making these evaluations takes time, so it is best if a statement is optimized only one time and then reused. If you are SQL-mashing, your statements are not likely to be reused.
Reuse via Parameter Markers
Statement reuse comes in two forms: during job execution and through the use of SQL package files. When the AS/400 executes a job, it automatically keeps statement optimization information in RAM and tries to match incoming queries to queries it has already seen. This helps keep optimization down by attempting to optimize a statement only once and then reusing the access path information for subsequent execution of the statement. However, this feature can also byte you in the buttocks since it can lead to opening many access paths for statements accessing the same set of files. This happens if the AS/400 sees a statement that it has seen before but assumes it is different from the previous incarnations of the statement. The AS/400 re-optimizes it and keeps its access path lying around, leading to accumulating access paths taking up valuable RAM real estate in your computer.
The AS/400 packaging feature allows it to store optimization information about SQL statements in a special SQL Package file on the AS/400. This allows statement access path information to persist and be shared between instances of the application program. When a statement is executed from an application that uses packages, the AS/400 first looks at the statement and determines if it is cached in RAM. Then it looks at the package file to see if it has seen the statement in previous executions. If the statement is found, the AS/400 uses the stored access path and forgoes optimization; otherwise the statement is optimized and added to the package file. However, if you are using SQL-mashing in your app, you are causing multiple instances of the same query to be stored in the package files and wasting package space.
To effectively use either of these cool capabilities, you must use parameter markers in your queries. This is because the first order match of an SQL statement against previous statements in RAM and packages is based on the string length of the statement. A statement made by mashing has differing string lengths based on the keys mashed into the statement. A statement with parameter markers always has the same string length because parameter markers are just placeholders for the keys that are bound at statement execution.
Harkening back to my part ID example, even though the queries are the same, the AS/400 might not match them to previous queries, because the lengths of the query strings are different. Parameter markers allow the length of the query to remain constant, thereby letting the AS/400 clue in to the fact that you are running the same query so it does not need to be reoptimized.
When executing parameterized statements, the question mark (?) character is used as a parameter marker. Again, go back to the part ID example. To execute that query as a parameterized statement, you use the question mark in place of the part ID code. The statement would look like this:
SELECT P_DESC FROM PARTS WHERE PARTID = ?
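To make the string-length point concrete, here is a minimal VB sketch that builds the part lookup both ways and compares the statement text. The helper MashedPartQuery and the constant PARAM_PART_QUERY are names made up for this illustration; nothing here talks to the AS/400.

```vb
Option Explicit

' The parameterized form: the text is identical for every lookup,
' whatever key is later bound to the ? marker.
Const PARAM_PART_QUERY As String = "SELECT P_DESC FROM PARTS WHERE PARTID = ?"

' Hypothetical helper: builds the lookup the SQL-mashing way,
' by concatenating the key into the statement text.
Function MashedPartQuery(ByVal PartID As String) As String
    MashedPartQuery = "SELECT P_DESC FROM PARTS WHERE PARTID = '" & PartID & "'"
End Function

Sub CompareStatementText()
    Dim First As String
    Dim Second As String

    First = MashedPartQuery("P-123-45")
    Second = MashedPartQuery("P32123-32")

    ' The mashed statements differ in text and in length,
    ' because the two keys are different sizes.
    Debug.Print Len(First), Len(Second)

    ' The parameterized statement has one fixed length.
    Debug.Print Len(PARAM_PART_QUERY)
End Sub
```

Run CompareStatementText from the Immediate window to see the two mashed lengths next to the single parameterized length. Mashing has a second cost that this sketch does not show: a key containing an apostrophe produces a broken statement.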
OK, so now you know how to build a query that uses parameters. But how do you get the data into the parameter marker? This is accomplished using a technique called binding. The following example is a simple snippet of VB code that allows me to bind a part ID to the previous query:
CMD1.CommandText = "SELECT P_DESC FROM PARTS WHERE PARTID=?"
CMD1.Parameters.Refresh
CMD1.Parameters(0).Value = "P-123-45"
Set RS = CMD1.Execute
The code assumes that CMD1 is an ADO command object that is associated with an ADO connection object connected to an AS/400. In the first line, the CommandText property of the command object is set to the statement I want to execute.
On the next line, I call the refresh method of the parameters collection associated with the command object. The parameters collection is where parameter information is stored so it can be bound to the query during execution. Each member of the parameters collection is a parameter object and represents one of the ? symbols in the query. Each parameter object has many properties and methods, but the most important is the value property. The value property is where you put the data that you want bound to the query at execution time. The Refresh method of the parameters collection causes the AS/400 to prepare the query, optimize if necessary, and then return the description of any parameters that the query might need. The returned information is used to instantiate one or more parameter objects in the parameters collection. The parameters are then used to marshal data between the client application and the AS/400.
The third line of code in the snippet sets the value property of the first parameter object, 0, in the parameters collection to hold the part ID code that I am interested in. On the fourth line, the execute method of the command object CMD1 causes the command to execute with the parameters currently in the parameters collection. The execute method returns a recordset causing the variable RS to be instantiated with the results of the query.
The beauty of using parameters is that if you want to look up another part before you close the command object, you just reset the value property of the parameter (0) to another part ID code and call the execute method again. This is slick. In addition, even if you instantiate a new command object and again set CommandText to the example parts query, the AS/400 recognizes that the SQL statement was seen before and does not reoptimize the statement. This saves time during the SQL statement preparation, cuts resource use and contention on your AS/400, and further increases the application speed.
Another Reason for Parameters
Here is another use for parameters. Imagine that you are doing an application that writes 10 transactions to a transaction file from a client. If you construct your insert queries using the SQL-mashing technique, all 10 queries might have to be reoptimized, taking a lot of processing time. Even if the AS/400 somehow magically sees that the statements are the same, it still has to go through the repreparation of the statements and look them up in the cache of previously seen statements. If you use a parameterized query, the statement is prepared only once, and you just rebind the parameters and call execute for each data set you want to write. The following snippet writes two records to an imaginary inventory transaction file using parameters:
Cmd1.CommandText = "insert into invtrans (part_ID, quantity, transdate) values (?,?,?)"
' First record
Cmd1.Parameters(0).Value = "P-125"
Cmd1.Parameters(1).Value = 12
Cmd1.Parameters(2).Value = "01/01/1980"
Cmd1.Execute
' Second record: rebind the values and execute the same statement again
Cmd1.Parameters(0).Value = "P-99-332-12"
Cmd1.Parameters(1).Value = 47.65
Cmd1.Parameters(2).Value = "01/01/1980"
Cmd1.Execute
Removing the Refresh
Another way to save execution time is to not call the parameters refresh method at all. This means that you, the programmer, must manually create the parameters collection for your query. Yes, manually specifying parameter information takes more typing, but you avoid the entire overhead of the refresh method. You see, the refresh method requires ADO to ask the AS/400 for a description of each parameter in the statement just prepared. ADO does this one parameter at a time, not as a block request, so each parameter causes multiple packets to be sent from client to server and back again. For a statement with a small number of parameters, this traffic is usually negligible. However, I routinely work with statements that require 100 to 300 parameters. Calling the refresh method to describe these parameters automatically can take upwards of half a second and cause a lot of packet traffic between the client and the server.
To manually build the parameters collection, you must use the CreateParameter method of the Command object. Here is the syntax:
Set P = CMD1.CreateParameter(Name,DataType,Direction,Length,Value)
Name is a string that you can later use to reference the parameter in the parameters collection. All of the arguments are optional in the call itself, but supplying a name makes the parameter much easier to reference later. The data type is an enumerated constant value that tells ADO the data type the AS/400 expects this parameter to be. These enumerators are defined in the ADO documentation and can be accessed using constants like adChar to represent character data or adDecimal to represent a decimal number.
The parameter direction is an enumerator that specifies whether the parameter is an input parameter (adParamInput), an output parameter (adParamOutput), or bidirectional (adParamInputOutput). Parameters to insert, update, select, and delete statements are always input parameters, whereas parameters to stored procedures can be input, output, or both.
The length is used to specify the maximum length of the parameter data and can be skipped if the parameter is not a character or binary string. Finally, the value argument can be specified to place a value into the parameters collection at the time of creation. The following snippet creates the parameter manually for the parts query and then uses the append method of the parameters collection to add the parameter to the parameters collection:
Cmd1.CommandText = "select p_desc from parts where partid = ?"
Set X = Cmd1.CreateParameter("PID", adChar, adParamInput, 10, "P-152")
Cmd1.Parameters.Append X
This call to CreateParameter makes a parameter with the name PID of data type char with a length of 10. The parameter is specified as an input parameter and a default value of P-152 is instantiated. If you want to set the parameter to another value, the parameter value can be referenced by the name of the parameter, as follows:
CMD1.Parameters("PID").Value = "New Value"
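The same approach extends to the three-parameter INSERT against the imaginary inventory transaction file. The sketch below builds the parameters collection by hand, so the refresh method is never called. The function and sub names, the parameter names, and the column types and sizes (a 15-character part ID, a 9,2 decimal quantity, and a date column) are all assumptions made for illustration; match them to your actual file definition.

```vb
Option Explicit
' Requires a project reference to Microsoft ActiveX Data Objects 2.1 Library.

' Builds the INSERT command with a hand-made parameters collection.
' Types and sizes are ASSUMED for the imaginary INVTRANS file.
Function BuildInvTransInsert() As ADODB.Command
    Dim Cmd As ADODB.Command
    Dim Qty As ADODB.Parameter

    Set Cmd = New ADODB.Command
    Cmd.CommandType = adCmdText
    Cmd.CommandText = _
        "insert into invtrans (part_ID, quantity, transdate) values (?,?,?)"

    ' Parameters line up with the ? markers by position, in Append order.
    Cmd.Parameters.Append _
        Cmd.CreateParameter("PARTID", adChar, adParamInput, 15)

    ' Decimal parameters also need a precision and scale.
    Set Qty = Cmd.CreateParameter("QTY", adDecimal, adParamInput)
    Qty.Precision = 9
    Qty.NumericScale = 2
    Cmd.Parameters.Append Qty

    Cmd.Parameters.Append _
        Cmd.CreateParameter("TDATE", adDBDate, adParamInput)

    Set BuildInvTransInsert = Cmd
End Function

' Rebinds the three values and executes the statement once.
' Cmd.ActiveConnection must already be set to an open connection.
Sub WriteTransaction(Cmd As ADODB.Command, ByVal PartID As String, _
                     ByVal Qty As Double, ByVal TransDate As Date)
    Cmd.Parameters("PARTID").Value = PartID
    Cmd.Parameters("QTY").Value = Qty
    Cmd.Parameters("TDATE").Value = TransDate
    Cmd.Execute
End Sub
```

To use it, call BuildInvTransInsert once, set the ActiveConnection property of the returned command to your open connection, and then call WriteTransaction for each record you want to write. Building the collection needs no connection at all, which is exactly the traffic that the refresh method would otherwise generate.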
The Height of Laziness
Cool Wizard Utility
If you haven't noticed by now, I am the laziest programmer in the biz. In fact, I live by a motto: "Good programmers write code; great programmers write tools that write code so they can waste time playing Quake and still look more productive than their peers." In pursuit of this motto, I introduce to you a small utility program called Insert Wizard that will help you create parameters for your queries and generate VBScript ASP code. The program allows you to pick a library, then a table, and then the target columns of the table. The program then generates an SQL INSERT statement to write records to the table and writes all of the code you need in an ASP page to use the INSERT statement. In addition, it declares the parameters for you, allowing you to spend more quality time blasting your opponents with the rocket launcher.
Using the Utility
Download the Insert Wizard from the MC Web site (www.midrangecomputing.com/mc), and follow along as I show you how it works. First, open the project in Visual Basic and then select View, then Project Explorer. Then double-click on Form1 to bring it up. This form is where all of the action is.
Double-clicking the form brings up a code window showing the On_Load event. You need to change the data source name, user ID, and password arguments that are passed to the connection object during the call to the open method. In addition, this project uses ADO 2.1, so if you do not have it, go to www.microsoft.com/data and download the latest version. After the install, go to the VB Project menu, then the References menu, and select Microsoft ActiveX Data Objects 2.1 Library.
Press F5 to run the project. If you successfully connect to the AS/400, you should see a screen like the one shown in Figure 1 (page 93). Drop down the Available Libraries listbox and pick a library on your AS/400 system. Next, drop down the Tables in Library combo box and choose the table you want to make an INSERT statement for. Then select the columns you want in the INSERT statement from the Available Columns list. As you click them, they will appear in the Selected Columns list. To remove a column from Selected Columns, just click it to send it back to the Available Columns list. Finally, when you have identified all of the columns for your statement, press GO. This writes the statement and places it in the Windows Clipboard. It can then be pasted into an editor like FrontPage or into the development environment you are using to create ASP pages. Figure 2 (page 93) shows a sample of the output created by the utility.
Where to Go from Here?
The wizard is a simple little program that you can extend to do UPDATE and DELETE statements also. In fact, the code is pulled from the freeware version of my SQLThing tool that does those functions already. In addition to the code for this utility, the download file contains a sample ASP page that uses code generated from the utility to insert records into a table. Play with the tool, look at the ASP page, and learn more about parameterized queries. They can help you eke that last ounce of performance out of your AS/400!
Figure 1: Identify tables for which you wish to generate SQL insert statement code in this screen.
set con1 = server.createobject("ADODB.Connection")
set cmd1 = server.createobject("ADODB.Command")
cmd1.ActiveConnection = con1
cmd1.commandtext = "insert into SQLBOOK.WEBTEMP2 " & _
"(REQTYPE,REQFILE,BROWSER,REQTS,REQSIZE,REQUSER,GEOID) " & _
"VALUES (?,?,?,?,?,?,?)"
Figure 2: This is a sample of code generated by the utility.