23
Tue, Apr
1 New Articles

Random Access DB Performance Comparisons

Typography
  • Smaller Small Medium Big Bigger
  • Default Helvetica Segoe Georgia Times

I recently tested reading a table using a variety of languages and access methods (“Database Performance by the Numbers,” MC, January 2001) and generated a firestorm of email about the results. (See “Database Performance by the Numbers Feedback,” MC, March 2001 for my reply to some reader emails.) Well, I couldn’t leave well enough alone and decided to perform more tests! This month, I’ll show the results of several tests based on random positioning within a physical file. I’ll first test positioning; then the positioning and printing of the retrieved record; and finally the positioning, reading, and writing of the retrieved record to another table.

This time I had some help designing, writing, and running the test programs from my new partners in crime, Dan Nelson and Eric Merritt, who are consultants at McCormack & Associates, Inc., in Rock Hill, South Carolina (www.mcc-associates.com). Dan and Eric read my January article, ran all of the code on their AS/400 systems, and sent me the numbers. In addition, they offered to help me devise additional tests and to assist in the programming and evaluation of the results of those tests. Who was I to thwart the good intentions of such robust and erudite programmers? So here are the fruits of our efforts, along with a few rather surprising, and not so surprising, results.

A Few Words of Caution

Before I get too deeply involved in interpreting the results, I would like to talk about the intention of these tests. The main objective was to test programming language overhead, not how fast the AS/400 could read data from a disk by positioning its disk arms. Although most of these tests are not typical of real-world situations, they do, however, offer insight into language overhead on the AS/400 by looking at particular aspects of record manipulation and throughput. Download the code yourself at www.midrangecomputing.com/mc, test it, and take the results with a grain of salt, several slices of juicy red tomato, bacon, and mayonnaise richly slathered on two slices of tasty white bread. With that said, I’ll get on with our numbers and scrumptious results.

Random Lookup Test

Dan and Eric first contacted me about the record level access (RLA) and SQL numbers from the Java tests reported in my January article. They had tested RLA versus SQL from



Java about a year and a half ago and found that JDBC was much slower than RLA. The numbers in my article led them to conduct further tests, which yielded data that supported my findings, showing that IBM is indeed making progress in the performance of both SQL and RLA in Java.

They felt that further analysis should be done, and I heartily agreed, since the January tests were just a raw movement of data. What was needed were tests more suited to real-world activity and to the strengths of RLA. I wanted to test random-positioning, where I felt that RLA would be superior to SQL. At first, I wanted to test using the random number generator, but Dan felt we should use a predetermined list of keys so that each program would be on equal footing, and, therefore, the results of tests would be more repeatable.

Dan had the better idea, so he got to devise a set of test data and a key file. The result was the addition of a packed-decimal sequential number to the WEBTEMP table, used in the January article, and a file to be used as a source for the lookup values. In addition, he collapsed the Timestamp field into a Char(26) so we would not be including the overhead of the timestamp machinations, as this really skews the RLA numbers (see my March article). In addition, he added two fields to hold a header and a line-item number so we could later simulate reading orders from the file. The final file structure is shown in Figure 1.

For the random-positioning test, we decided that the number of lookups against the file should be 5,000, as this value would give us reasonable execution time in which to evaluate the language overhead. The keys were loaded into the lookup file in the following manner: 5000, 1, 4999, 2, 4998, 3, and so on. In this way, we felt somewhat assured that we would be testing random movement against the file.

The logic of the test is such that each program reads the key table into an array and then records the value of the clock. Next, the program opens the file or prepares its SQL statement and then goes into a loop, where it retrieves a record for each key in the array of keys. When the array is exhausted, the program takes a second clock value and determines the time spent between tick 1 and tick 2. We then calculate the number of lookups per second.

We ran the tests numerous times (at least 10 times each), then averaged the results. Figure 2 shows the records per second, or RPS, for each language and access method on both the Midrange Computing machine and the McCormack machine. Both machines are IBM Model 170s. MC’s machine has about 1 GB of RAM, and McCormack’s machine has
.5 GB of RAM. The Midrange Computing machine is on V4R5, and the McCormack machine is running V4R4, and the Java version used in all of the tests on all platforms is
1.2.2.

The Results

The first point of interest is that the C program with embedded SQL faired far worse than the RPG program with embedded SQL. C on the Midrange Computing machine allowed an average 2,000 RPS, whereas RPG allowed for 2,414 RPS. RPG using RLA performed as expected and handily beat embedded SQL by scoring 15,625 RPS on MC’s machine and 8,264 RPS on McCormack’s machine. This is quite good for RLA and random- positioning.

On MC’s machine, Java/JDBC with SQL scored 294 RPS, whereas Java with RLA scored 333 RPS. The Java Native Interface driver scored 384 RPS on the McCormack box and 454 RPS on the MC box. Although the numbers for Java/JDBC with SQL on the McCormack machine were similar to mine, our RLA times wildly differed. Java RLA scored 1,050 RPS on their machine and only 333 RPS on MC’s. Also, note the differences in the RPG RLA and RPG SQL results between the two machines. Those differences might have something to do with memory or pool configuration, but so far we are stumped as to the reason for the differences. As soon as we find the reason, we will publish it, as it



is obvious that it impacts Java RLA performance significantly. The only other obvious difference was the RAM size of the machines, but since mine is bigger (if that is impacting the test), it should favor me. Another difference is operating system version. My machine is on V4R5 and theirs is V4R4, so we will have to wait until they upgrade their machine and see if the performance of RLA goes away to test that theory. All I can say it that we are still looking for the answer and would appreciate any pointers or results from any of you that download and run the test code.

Could the Type of Key Be a Factor?

Since finding the timestamp and date performance problems in record level access, (see my March article), I have tried to be extra cautious when conducting performance tests, and I am always looking for extemporaneous effects. So to ensure that there was not a problem with packed-decimal keys, I changed the key to an integer and then recreated the logical files. I changed the programs, then recompiled and ran the tests many times.

Statistically, the numbers were identical, so packed-decimal keys and integers seem to yield an identical performance.

Another thing I wondered about was the effect of the logical file. Since this was a DDS logical file, you can use it in the FROM clause of the embedded SQL statement whereas if the file was an SQL index you could not do this. So I ran the program where the logical file was named as the target table and where the physical file was named as the target table. Again, there was no statistical difference in performance.

Finally, each of the values in the REQNBR field is unique, but the logical file and file definition do not reflect this. So I added a primary key constraint to the physical file on the REQNBR field. This shaved about 200 milliseconds off the processing time of the C SQL program but made almost no difference to the RPG SQL program.

Cross-platform Weirdness

Since Java is cross-platform, Eric ran the next tests from his Microsoft Windows NT 4.0 machine connecting to the AS/400 via JDBC. This machine was a 700 MHz Pentium III with 256 MB of RAM. The results were surprising: Java JDBC running on NT 4.0 clocked almost the same as it did running on the AS/400. Yep, when doing the test with SQL, it did not matter if the code ran on the AS/400 or on NT 4.0; you got the same records per second. In fact, NT 4.0 edged out the AS/400 in this test by a whopping 4.9 RPS.

Figure 3 (page 77) shows a set of 10 runs on both NT 4.0 and the AS/400 doing the random-position test with JDBC. The column labeled INI (initialization) refers to the time spent preparing the SQL statement and is expressed in seconds. The column labeled Read is a measure in fractional seconds of the time it took to randomly position and read 5,000 record using SQL. What I find fascinating is that the machine running NT 4.0 also consistently prepared the SQL statement faster than the AS/400. (This might be an effect of packaging, but the JDBC drivers on the client NT 4.0 machine and on the AS/400 were both set to use packages.) Remember that when the program is running on NT, it must send the statement to the AS/400 using TCP/IP and wait for the AS/400 to prepare the statement and send an acknowledgement. The network latency should have caused the Java program to take more time on the NT machine, because there should be no network latency inside the AS/400. However, what we may have here is an example of MHz matters. See, since Java is garbage collecting and moving objects all around in RAM, the faster your processor and RAM bus are, the faster the Java can process. Eric and I are currently designing more tests so we can properly measure the effect of MHz and RAM speed on Java programs. We will let you know what we find, but in the interim, please send any other suggestions that you may have.

As for the RLA code running on Eric’s NT machine, it consistently clocked at half the speed of the AS/400, which is what we were expecting the JDBC code to do.



Printing Overhead

Because the tests so far have read only randomly from a table and are not “real world” tests, I conducted further tests based on a small modification to the RPG program that lets it print a report about each record it reads. Figure 4 shows the results of the printing tests.

What I found most interesting about these tests is that once printing was added, the records per second dropped very little for the SQL programs, but they dropped by almost half for the RLA-based programs. This could indicate that some optimization was done in the RLA programs that was not done in the SQL programs. Note that in the first tests, the programs did not actually do anything with the data; they just found the records. But the SQL programs were pulling the data into a structure. I could argue that the SQL program is doing more work. It is also possible, however, that the data was available to the RLA program but that the RLA program just didn’t use it. By adding printing, if the data was not getting moved, that optimization disappeared because the program was now having to access the data to send it to the printer, which cut performance. Again, this is only conjecture, and I would love to have one of you MI heads look at the compiler output and tell all of us what is really happening.

Readin’ and Writin’

The next tests were based on another modification to the RPG random-positioning program (the results of which are shown in Figure 5). In these tests, we were loading keys from the RANKEY2 table and then using the RANDATA02 logical file to find records by the REQHDR field. Once a record is found, the program reads the rest of the records where the REQHDR field matches (it happens to always be 10 records in this test). As it reads each record, it writes the record to the file RANDOUT, which is a mirror image of RANDATA. The key seed file contains 500 records, and since there are 10 details for each key, the program is reading and writing 5,000 records in Toto (a small dog from Kansas). The program takes a timer tick after the array of keys has been loaded, before the preparation of the select and insert statements or before the opening of any files. The program takes a second tick when the array is exhausted, computes the difference to get a total time elapsed, and then calculates seconds elapsed to perform the test.

Note that RLA triumphed once again, clocking in at .575 seconds, whereas SQL took 1.524 seconds to accomplish the task. Again, there was a difference between the average time that the McCormack box performed using RLA and SQL in RPG versus those that the MC box performed. I still think this has to do with the amount of available memory but welcome your suggestions and tuning tips.

The really interesting result is when the Java JDBC programs are executed from an NT environment; Java JDBC from an NT 4.0 client is faster than Java JDBC and Java Native Interface running on the AS/400. Personally, I think this has more to do with the clock speed of the box than with the access method. Again, network bandwidth overhead should make this slower, but the added clock speed of the PC might make up for the latency of the network by processing the instructions and garbage collection faster. Also note that Java RLA on an NT client takes double the time of Java RLA running on the AS/400. Here, I think that the network latency is kicking in and giving results more like one would expect.

Finally, I cheated a little on the last number in Figure 5, RPG SQL Bulk. You see, there is a feature in SQL called blocked inserts. I set up another copy of the program to do a blocked fetch of the 10 records that match the key into an RPG occurs data structure. I then bound the structure to an insert statement with the following syntax:

INSERT INTO RANDOM/RANDOUT 10 ROWS
VALUES (:reco :indary)



The 10 ROWS indicates that I will be sending 10 records to the statement in my data structure. The :reco and :indary are occurs structures that represent my set of data and my indicator array. Blocked inserts and selects are cool ways to get extra performance out of SQL, so you should try to put them in your programs (if you are using embedded SQL, that is). I consider this cheating because there is no program logic to handle less or more than 10 rows that match a key. In the real world, I would never always have 10 details, so I would have to get a little fancier with my code to use this technique. However, the technique is valid, and I wanted to share an example of the blocked insert technique in the downloadable code.

Final Frontier Thoughts

As in the last article, these tests are just simple programs that really don’t do a heck of a lot. Yes, they are not real-world programs; they are merely attempts to set up artificial, arbitrary circumstances so that the overhead of a programming language and access method can be evaluated. Yes, the data is mostly in cache when it is running, as I don’t care about factoring disk arm movements; I only care about overhead. The idea behind this is to get some basic metrics on how languages and methods perform doing simple tasks so that you can pick the appropriate tool for the job at hand or know the overhead of the tool that you are using.

I again invite you to download the code and data used in these tests and run them on your own hardware, in your own environment. If you find a way to make the program faster, by all means send it to me for a future performance article. There is a paltry amount of information on performance in the IBM documentation, and a lot of it requires a Ph.D. to understand. If we, the AS/400 community, start to share and pool our knowledge, tips, and tricks, we can eek every last cycle from our machines.

And, finally, a special thanks to Dan and Eric for all of their assistance with this article and for their tireless running and rerunning of tests. Without them, there is no way I could have written this text.

A*

A* WEBTEMP TEST

A R RANDATAR

A REQHDR 8S 0

A REQDTL 8S 0

A REQNBR 8S 0

A REQTYPE 10A

A REQFILE 80A

A BROWSER 10A

A REQTS 26A

A REQSIZE 11A

A REQUSER 11A

A GEOID 6A

A* K REQHDR

A* K REQDTL

Figure 1: This DDS spec of the RANDATA file was used in all of the tests.



MC’s McCormack’s Model 170 Model 170
RPG RLA 15,625 8,264 RPG SQL 2,415 1,461
C SQL 2,000
Java RLA 333 1,050 Java JDBC 294 294 Java Native 454 384 Java on NT JDBC 299 Java on NT RLA 500

Figure 2: These are the RPS measurements of the various languages and access methods used in the random-position test.

NT 4.0 AS/400 INI Read INI Read

.141 17.094 .633 17.490
.151 16.383 .633 17.490
.160 16.253 .419 17.883
.140 16.754 .515 17.596
.131 17.324 .587 17.471
.150 16.094 .715 17.520
.160 17.195 .389 17.617
.170 16.314 .666 17.537
.161 19.367 .506 17.604
.141 17.995 .309 17.577

Figure 3: This is a log of the statement preparation time versus the time to process 5,000 records on NT and the AS/400.

MC’s McCormack’s Model 170 Model 170
RPG SQL 2,083 1,276
RPG RLA 8,928 4,812

Figure 4: These results show the RPS for the random read/print test.



MC’s McCormack’s Model 170 Model 170
RPG SQL 1.524 3.253 RPG RLA .575 .967 Java JDBC 39.062 26.178 Java Native 25.002 24.038 Java RLA 18.050 6.666
NT 4.0 Client

Java JDBC 16.025
Java RLA 16.286 RPG SQL Bulk .543

Figure 5: These results show the RPS for the random read/write test.



BLOG COMMENTS POWERED BY DISQUS

LATEST COMMENTS

Support MC Press Online

$0.00 Raised:
$

Book Reviews

Resource Center

  • SB Profound WC 5536 Have you been wondering about Node.js? Our free Node.js Webinar Series takes you from total beginner to creating a fully-functional IBM i Node.js business application. You can find Part 1 here. In Part 2 of our free Node.js Webinar Series, Brian May teaches you the different tooling options available for writing code, debugging, and using Git for version control. Brian will briefly discuss the different tools available, and demonstrate his preferred setup for Node development on IBM i or any platform. Attend this webinar to learn:

  • SB Profound WP 5539More than ever, there is a demand for IT to deliver innovation. Your IBM i has been an essential part of your business operations for years. However, your organization may struggle to maintain the current system and implement new projects. The thousands of customers we've worked with and surveyed state that expectations regarding the digital footprint and vision of the company are not aligned with the current IT environment.

  • SB HelpSystems ROBOT Generic IBM announced the E1080 servers using the latest Power10 processor in September 2021. The most powerful processor from IBM to date, Power10 is designed to handle the demands of doing business in today’s high-tech atmosphere, including running cloud applications, supporting big data, and managing AI workloads. But what does Power10 mean for your data center? In this recorded webinar, IBMers Dan Sundt and Dylan Boday join IBM Power Champion Tom Huntington for a discussion on why Power10 technology is the right strategic investment if you run IBM i, AIX, or Linux. In this action-packed hour, Tom will share trends from the IBM i and AIX user communities while Dan and Dylan dive into the tech specs for key hardware, including:

  • Magic MarkTRY the one package that solves all your document design and printing challenges on all your platforms. Produce bar code labels, electronic forms, ad hoc reports, and RFID tags – without programming! MarkMagic is the only document design and print solution that combines report writing, WYSIWYG label and forms design, and conditional printing in one integrated product. Make sure your data survives when catastrophe hits. Request your trial now!  Request Now.

  • SB HelpSystems ROBOT GenericForms of ransomware has been around for over 30 years, and with more and more organizations suffering attacks each year, it continues to endure. What has made ransomware such a durable threat and what is the best way to combat it? In order to prevent ransomware, organizations must first understand how it works.

  • SB HelpSystems ROBOT GenericIT security is a top priority for businesses around the world, but most IBM i pros don’t know where to begin—and most cybersecurity experts don’t know IBM i. In this session, Robin Tatam explores the business impact of lax IBM i security, the top vulnerabilities putting IBM i at risk, and the steps you can take to protect your organization. If you’re looking to avoid unexpected downtime or corrupted data, you don’t want to miss this session.

  • SB HelpSystems ROBOT GenericCan you trust all of your users all of the time? A typical end user receives 16 malicious emails each month, but only 17 percent of these phishing campaigns are reported to IT. Once an attack is underway, most organizations won’t discover the breach until six months later. A staggering amount of damage can occur in that time. Despite these risks, 93 percent of organizations are leaving their IBM i systems vulnerable to cybercrime. In this on-demand webinar, IBM i security experts Robin Tatam and Sandi Moore will reveal:

  • FORTRA Disaster protection is vital to every business. Yet, it often consists of patched together procedures that are prone to error. From automatic backups to data encryption to media management, Robot automates the routine (yet often complex) tasks of iSeries backup and recovery, saving you time and money and making the process safer and more reliable. Automate your backups with the Robot Backup and Recovery Solution. Key features include:

  • FORTRAManaging messages on your IBM i can be more than a full-time job if you have to do it manually. Messages need a response and resources must be monitored—often over multiple systems and across platforms. How can you be sure you won’t miss important system events? Automate your message center with the Robot Message Management Solution. Key features include:

  • FORTRAThe thought of printing, distributing, and storing iSeries reports manually may reduce you to tears. Paper and labor costs associated with report generation can spiral out of control. Mountains of paper threaten to swamp your files. Robot automates report bursting, distribution, bundling, and archiving, and offers secure, selective online report viewing. Manage your reports with the Robot Report Management Solution. Key features include:

  • FORTRAFor over 30 years, Robot has been a leader in systems management for IBM i. With batch job creation and scheduling at its core, the Robot Job Scheduling Solution reduces the opportunity for human error and helps you maintain service levels, automating even the biggest, most complex runbooks. Manage your job schedule with the Robot Job Scheduling Solution. Key features include:

  • LANSA Business users want new applications now. Market and regulatory pressures require faster application updates and delivery into production. Your IBM i developers may be approaching retirement, and you see no sure way to fill their positions with experienced developers. In addition, you may be caught between maintaining your existing applications and the uncertainty of moving to something new.

  • LANSAWhen it comes to creating your business applications, there are hundreds of coding platforms and programming languages to choose from. These options range from very complex traditional programming languages to Low-Code platforms where sometimes no traditional coding experience is needed. Download our whitepaper, The Power of Writing Code in a Low-Code Solution, and:

  • LANSASupply Chain is becoming increasingly complex and unpredictable. From raw materials for manufacturing to food supply chains, the journey from source to production to delivery to consumers is marred with inefficiencies, manual processes, shortages, recalls, counterfeits, and scandals. In this webinar, we discuss how:

  • The MC Resource Centers bring you the widest selection of white papers, trial software, and on-demand webcasts for you to choose from. >> Review the list of White Papers, Trial Software or On-Demand Webcast at the MC Press Resource Center. >> Add the items to yru Cart and complet he checkout process and submit

  • Profound Logic Have you been wondering about Node.js? Our free Node.js Webinar Series takes you from total beginner to creating a fully-functional IBM i Node.js business application.

  • SB Profound WC 5536Join us for this hour-long webcast that will explore:

  • Fortra IT managers hoping to find new IBM i talent are discovering that the pool of experienced RPG programmers and operators or administrators with intimate knowledge of the operating system and the applications that run on it is small. This begs the question: How will you manage the platform that supports such a big part of your business? This guide offers strategies and software suggestions to help you plan IT staffing and resources and smooth the transition after your AS/400 talent retires. Read on to learn: