Brief: There are many ways to improve the performance of an AS/400 application. While some work better than others, few can match the impact of the technique discussed in this article. Find out how you can improve both interactive response time and batch throughput by making better use of your system's memory.
For the CPU to process data, the data must reside in main memory. Wouldn't it be great if, when you had enough memory available, you could force an entire file or program into memory, eliminating a substantial amount of disk activity? Since disk access is relatively slow in comparison to memory access, it could greatly improve application performance.
Suppose you have a program that reads every record in the customer master file and prints a report. Every time the program performs a read, the job checks main memory to see if the record it needs to process resides there. If it finds the record, processing continues. If not, the job submits a disk request to bring the data in from disk, and the job goes into a wait state until the data is placed into main memory.
Whenever data that needs to be processed does not reside in main memory, the job incurs additional CPU overhead, additional disk overhead and lots of waiting.
There is an AS/400 memory-management technique you may not be aware of that could help. You may not even have to modify a single line of code to implement it! Introduced in V2R2, the Set Object Access (SETOBJACC) command can place an entire file (*FILE) or program (*PGM) into a specific main storage pool. This process, which I'll refer to as datapooling, eliminates the subsequent disk requests ordinarily needed to read a file or program. The SETOBJACC command can greatly improve response times for your interactive jobs, cause your batch jobs to run in a fraction of the time they originally took, and decrease your overall CPU utilization and disk access.
Perhaps you attempted to use SETOBJACC before and saw little benefit from it. Although the command is easy to implement, you must take certain requirements into consideration to realize any benefits from it.
Beyond Record Blocking
Datapooling is similar to blocking, except that instead of bringing a select number of records into memory, a whole file or program is brought in. Even if you happen to read a file by key, you don't need to perform any disk requests. All data in the file is placed in memory.
This works in part because data anywhere in main memory can be accessed by any job. Even if your job runs in the QINTER pool, it can access data in another pool, so it doesn't matter which pool initially contains the data.
SETOBJACC only benefits jobs that are reading or updating files; it has no effect on disk writes. Some examples of programs and files that make good datapool candidates include:
o A customer inquiry called from many different data-entry applications.
o Files used in a subfile display.
o Table-file data used in many different programs, such as a warehouse description file.
I'll provide some criteria later on to help you select objects to datapool.
The SETOBJACC command brings objects into memory; but before you can use this command, you need to do some preliminary processing and research.
Follow these three essential steps in order to take full advantage of datapooling:
1. Find or create a storage pool in main memory to hold the files and programs that you want to datapool.
2. Prepare the selected pool memory area for the files and objects.
3. Select the files and programs for datapooling. (This requires some research.)
Finding or Creating a Datapool Area
Since memory on the AS/400 is divided into pools, you need to specify an existing pool or create a new one for datapooling.
It's not always necessary to create a new storage pool. For example, nighttime batch processing can make use of existing storage pools as long as no interactive work is being performed. In this case, you can datapool objects directly into the existing interactive pool. Since no jobs will run in the interactive pool at night, your datapooled objects remain untouched. The interactive pool usually contains a great deal of memory which is otherwise unused during noninteractive hours.
If you're datapooling files and programs used in interactive and batch programs throughout the day, create a separate private pool. You shouldn't datapool these objects into the interactive or batch pool, because other jobs run in these pools. Those jobs might use the datapool object's memory pages, nullifying the effect of datapooling.
To illustrate the use of a private storage pool, create a new subsystem as follows:
CRTSBSD SBSD(QGPL/DATAPOOL) +
        POOLS((1 *BASE) (2 100 1))
You now have a subsystem called DATAPOOL. Storage pool 1 points to *BASE. Storage pool 2 (a private pool) will be used to datapool objects. It is initially set to 100KB. Only one activity level is assigned, since no jobs will be allowed to run in this pool.
You can start the subsystem manually with the Start Subsystem (STRSBS) command; but if it will be used every day, you should add it to your start-up program.
Next, you need to locate additional memory for this pool, which is currently set to 100KB. You'll need to take the memory for your new storage pool from one of the existing pools. Using the Work with System Status (WRKSYSSTS) command, you can view all pools in the system and change their memory allocations. A pool whose fault rate (DB plus non-DB) stays consistently below 10 faults per second probably has more memory allocated than it needs. Even if the faults for a particular pool appear high, you may be able to reallocate some memory: you can safely reduce a pool's allocation as long as subtracting memory doesn't affect its fault rate.
Memory removed from any pool is automatically added to the *BASE pool. You can now increase the DATAPOOL memory size by the amount of excess memory removed from the other pools plus any excess memory in *BASE.
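As a sketch of that last step (the 8000KB figure is illustrative; substitute the amount of excess memory you actually found), you could move memory from *BASE into the datapool with the Change Subsystem Description (CHGSBSD) command:

CHGSBSD SBSD(QGPL/DATAPOOL) +
        POOLS((2 8000 1))

This changes only storage pool 2 of the DATAPOOL subsystem; pool 1 still points to *BASE.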
Now you know the size of your pool, but it is still not ready to accept your objects.
Initializing the Datapool Area
In order to ensure that SETOBJACC works correctly, you must clear the memory allocated to your new pool of all existing data. Otherwise, the datapool objects could fail to reach main memory. To clear the datapool, issue the Clear Pool (CLRPOOL) command as follows:
CLRPOOL POOL(DATAPOOL 2)
This clears storage pool 2, which is the pool that contains memory for the datapool. You should run the CLRPOOL command whether you use a special pool or one of your existing pools (e.g., QINTER for nighttime processing). You won't harm your data; you're merely initializing space that once contained data.
At this point, you know which pools you can use for datapooling, how much memory you have to play with and how to initialize that memory, which leads to the next step.
Selecting Datapool Candidates
Finding datapool candidates for nighttime batch jobs is usually easy. Look for programs that read or update a large number of records in a file. A logical file and the physical file it is based on are both ideal datapool candidates, as is a physical file that is read by key. Also, look for files in long-running programs that are being read or updated. Investigate files that are read by many programs during the nightly process. These files are all excellent candidates for datapooling.
Finding datapool candidates for interactive jobs is a bit more difficult. Here are a few suggestions.
1. If you've worked on an application, you should have a good idea which files are used in nearly every program. These files are excellent candidates.
2. If you have Performance Tools/400 (5738-PT1), take a look at the Interactive Programs Statistics section of the transaction report. (To print this report, you need to specify that trace data be collected.) It lists heavily used programs and how many synchronous database accesses occur per transaction. Analyzing these programs will help you isolate files which are datapool candidates. Programs with many transactions could be datapool candidates themselves.
3. Files used in subfile programs are good candidates. Every time a subfile is loaded with records, the file or files containing the data are read many times. More often than not, these files are accessed by a logical file. You can make a big difference in response time for subfile programs by datapooling both the logical and physical files. Other potential candidates include programs which are called often within the subfile load. For instance, if you have a program that formats text before each subfile record is written, the called program is a candidate for datapooling.
4. Run a disk monitor to collect information about how often objects are accessed from disk. Frequently accessed objects are definitely objects you'd like to datapool.
Run the disk monitor during a busy part of the day, for a maximum of two to three minutes, and repeat the process several times for a true picture of the objects most often accessed from disk. The Performance Tools menu contains the option to start, end and print the disk collection. If you have V2R2 or a subsequent release of OS/400, you can also use the Retrieve Disk Information (RTVDSKINF) and the Print Disk Information (PRTDSKINF) commands to determine object usage.
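For example, on V2R2 or later you could collect the disk information and then print the object-level report (parameter choices here are illustrative; prompt each command for the full set of options):

RTVDSKINF
PRTDSKINF RPTTYPE(*OBJ)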
Once you compile a list of datapool candidates, find out how large the objects are. You know the datapool size, so now you must ensure that the objects can fit in the datapool. You can obtain the object sizes with the Display Object Description (DSPOBJD) command. If an object is too large to fit into the datapool, don't datapool it. For instance, you probably shouldn't datapool a 256MB file.
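For instance, to check the size of the customer master file used later in this article:

DSPOBJD OBJ(LIB/CUSMAS) OBJTYPE(*FILE) DETAIL(*FULL)

The full detail display includes the object size, so you can total up your candidates and compare the sum against the size of your datapool.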
If the datapool is too small to hold all of the datapool candidates, you need to dynamically datapool some of them. Make note of any datapool candidates used in just one program, and don't datapool them yet. You can do that dynamically, which I'll show you later. First, you want to datapool as many of the heavily used objects as you can.
Performing the Datapool Operation
Finally, you are ready to assign objects to the datapool using SETOBJACC (Figure 1). You'll need to run this command for each object. In the next section, we'll discuss a CL program which does this automatically.
For each object, specify the qualified name of the object and *FILE or *PGM. If the object type is *FILE, the MBR option allows you to specify which file member to bring into the datapool. If you want all members in a multiple-member file to reside in the datapool, you need to perform a SETOBJACC for each member (member *ALL is not supported).
The MBRDATA option specifies whether you want the physical data, the access path, or both to be brought into the datapool. The default is both.
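For instance, to load a single member of a multiple-member file (the member name JANSALES is hypothetical) along with both its data and access path:

SETOBJACC OBJ(LIB/CUSMAS) OBJTYPE(*FILE) +
          POOL(DATAPOOL 2) MBR(JANSALES) MBRDATA(*BOTH)

Repeat the command once per member to bring the whole file into the datapool.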
The storage pool parameter specifies where to put the datapool object. *JOB places the object in the pool the job is running out of; *BASE puts the datapooled object in *BASE; *SHRPOOLx (where x can be a value of 1-10) allows you to use shared pools *SHRPOOL1-*SHRPOOL10; and *PURGE removes the datapool object from memory (a useful option for dynamic datapooling). The second element of the storage pool parameter allows you to specify the subsystem pool ID to use for the datapool.
The storage pool option is where most people run into problems when attempting to datapool objects. For instance, suppose you want to use the interactive pool to datapool objects used in the nighttime process. To do this, you specify POOL(QINTER x), where x is the subsystem pool ID. Most people who realize no gains from the SETOBJACC command make the mistake of entering POOL(QINTER 1) for this parameter. However, on most AS/400s, subsystem pool 1 points to *BASE and memory is located in subsystem pool 2. You can use the Work with Subsystems (WRKSBS) command to see which subsystem pool number to use for a given subsystem.
A couple of notes about the SETOBJACC command are in order. Bringing objects into memory is very fast: SETOBJACC can easily bring in a 10MB file in just a few seconds, so datapooling your objects takes almost no time to execute.
If you run the command interactively, a message specifies how much of the object was brought in and how much unused space was available in the datapool to begin with. If the unused space in the datapool is not large enough to accommodate an entire object, one of two things can happen. The object might not be datapooled in its entirety, or the system might purge existing datapooled objects to make room for the current datapool request. The message allows you to determine if you are out of room in the datapool. This does not cause an error condition; it simply means that some of the objects might not be fully datapooled and you may not see the full benefit of datapooling.
Automating the Process
Now you're ready to write your CL programs to start datapooling. You'll use the first CL program, shown in Figure 2, to datapool your candidates for daytime processing into the DATAPOOL subsystem created earlier. Let's assume you'll datapool three objects. The first is the physical file CUSMAS. The second is a logical file over CUSMAS called CUSMAS01. The third is a subfile inquiry program over the customer file called CMR200. All objects reside in library LIB.
The CL program in Figure 2 would run every morning before your interactive work begins. You don't have to hold up interactive work, though; interactive jobs can run concurrently with this program.
You'll need a similar CL program to set up the datapool before nighttime processing.
The next example shows how to datapool dynamically. Suppose CL program RPC100 (see Figure 3) calls RPG program RPR100 which reads a file called REPMAS (representative master) during nighttime processing.
The program in Figure 3 brings REPMAS into memory right before we call the program. Our datapool in this example is QINTER, subsystem pool 2 (a shared pool). This is the only program that needs to use the file. We don't want to keep the REPMAS file in main memory after the program completes, so we purge it from memory by using the *PURGE option of SETOBJACC. This frees up memory for other objects to be brought in. If you use a shared pool for datapooling, be sure that expert cache is not active for that pool or you may not realize the benefit you expect. (Look for an article on expert cache in an upcoming issue.)
Take Advantage of Datapooling
Using datapools allows you to utilize memory more efficiently. With some basic analysis, you can determine which objects to keep in memory, because you know they're being accessed from disk over and over again. You free the processor from the overhead of submitting these disk requests. You free the disk system from accessing this data multiple times. And you reduce the overall response times for jobs that access these objects.
Datapooling provides a glimpse of future disk-to-memory management. In V2R3, we're able to use expert cache, which attempts to hold heavily accessed objects in memory so that they won't need to be accessed repetitively. And don't think memory management stops here. You'll see enhancements to the operating system in the near future that will allow the AS/400 to forecast what data will be needed from disk before the data is requested. These operating system enhancements will work with DASD enhancements to allow disk subsystems to keep pace with AS/400 processors.
Michael Catalani, president of Catalani Consulting, is an AS/400 consulting specialist and author of Performance Tuning Simplified, published by Midrange Computing. He can be reached on MC-BBS.
Improving Performance with SETOBJACC
Figure 1 The SETOBJACC Command
                    Set Object Access (SETOBJACC)

 Type choices, press Enter.

 Object . . . . . . . . . . . . .                Name
   Library  . . . . . . . . . . .    *LIBL       Name, *LIBL, *USRLIBL...
 Object type  . . . . . . . . . .                *FILE, *PGM
 Storage pool:
   Shared pool or subsystem name                 Name, *JOB, *BASE...
   Pool identifier  . . . . . . .                1-10

                                                                      Bottom
 F3=Exit   F4=Prompt   F5=Refresh   F12=Cancel
 F13=How to use this display       F24=More keys
Figure 2 Daytime Processing CL

PGM
  CLRPOOL    POOL(DATAPOOL 2)
  SETOBJACC  OBJ(LIB/CUSMAS) OBJTYPE(*FILE) POOL(DATAPOOL 2)
  SETOBJACC  OBJ(LIB/CUSMAS01) OBJTYPE(*FILE) POOL(DATAPOOL 2)
  SETOBJACC  OBJ(LIB/CMR200) OBJTYPE(*PGM) POOL(DATAPOOL 2)
ENDPGM
Figure 3 Dynamic Datapool Example (CL Program RPC100)

PGM
  SETOBJACC  OBJ(LIB/REPMAS) OBJTYPE(*FILE) POOL(QINTER 2)
  CALL       PGM(RPR100)
  SETOBJACC  OBJ(LIB/REPMAS) OBJTYPE(*FILE) POOL(*PURGE)
ENDPGM