Those Weird and Wonderful Arrays
Written by Paul Tuohy
Tuesday, 20 May 2008 19:00
Maximize the use of memory by using arrays.
Over the past few years, the style of my RPG programs has changed dramatically in many ways. One of the core changes is the way I use memory in programs. Historically, I would minimize the use of memory, but now I go to the opposite extreme and use memory as much as possible. For example, if I were writing an order entry program, instead of maintaining order details on the database as they are entered, I would store the complete order in memory and write it to the database when the order is complete.
RPG allows us to maximize the use of memory by defining very large field sizes and, more importantly, by using arrays.
In this article, I want to examine some of the many weird and wonderful features of defining and using arrays.
Regardless of how an array is defined, the amount of memory occupied by an array may not exceed 16M (16,773,104 bytes). Up to V5R4, an array may have at most 32,767 elements, and the maximum size of an element is 64K (65,535 bytes). In V6R1, both of these limits increase to 16M (e.g., you can define an array of 16M one-byte elements or an array of two 8M elements).
Replacing Compile-Time Arrays
I have never been a fan of compile-time arrays. I always hated having to page to the end of the program to view the data. Figure 1 shows an alternative to defining compile-time arrays. The compile data is placed in a named data structure, and the array overlays the data. This method has the enormous benefit of having the array data alongside the definition of the array. There are two interesting points to note: You can define storage in a data structure without providing a DS subfield name, and you can specify the name of the data structure on the Overlay keyword.
     D CompileData     DS
     D                               27a   Inz('January  February March    ')
     D                               27a   Inz('April    May      June     ')
     D                               27a   Inz('July     August   September')
     D                               27a   Inz('October  November December ')
     D  MonthNames                    9a   Overlay(CompileData) Dim(12)
Figure 1: An alternative to compile-time arrays
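As a minimal sketch of how the overlaid array might be used, the program below displays the name of the current month. The field ThisMonth is a hypothetical name added for illustration; each 27-byte unnamed subfield holds three month names padded to nine characters.

```rpgle
     D CompileData     DS
     D                               27a   Inz('January  February March    ')
     D                               27a   Inz('April    May      June     ')
     D                               27a   Inz('July     August   September')
     D                               27a   Inz('October  November December ')
     D  MonthNames                    9a   Overlay(CompileData) Dim(12)
     D ThisMonth       S             10i 0
      /Free
        // %SubDt extracts the month number (1-12) from today's date
        ThisMonth = %SubDt(%Date():*Months);
        // The overlay array is indexed like any other array
        Dsply MonthNames(ThisMonth);
        *InLR = *On;
        Return;
      /End-Free
```

Because the array simply overlays the initialized storage, nothing has to be loaded at run time before MonthNames is referenced.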
%SUBARR and SORTA
The %SUBARR (Set/Get Portion of an Array) built-in function (BIF) lets you reference a subset of the elements in an array, so that an operation touches only the relevant elements. This is particularly useful with the almost-forgotten SORTA (Sort Array) operation. Figure 2 shows the definition of an array (Codes); the Ascend keyword specifies that the array is sorted in ascending sequence (you would use Descend if you wanted the array sorted in descending sequence). The field noOfElements contains the number of elements loaded in the array. The SORTA operation uses the %SUBARR BIF to sort just the loaded elements of the array: The first parameter identifies the array (Codes), the second parameter identifies the starting element (1), and the third parameter identifies the number of elements to process (noOfElements).
     D Codes           S             10a   Dim(100) Ascend
     D noOfElements    S             10i 0
        SortA %SubArr(Codes:1:noOfElements);
Figure 2: Using SORTA and %SUBARR
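The following is a small, self-contained sketch of the same idea, assuming three sample codes (the values and the Pos field are hypothetical). It sorts only the loaded elements and then uses %LOOKUP, restricted to the same range, to find one of them.

```rpgle
     D Codes           S             10a   Dim(100) Ascend
     D noOfElements    S             10i 0
     D Pos             S             10i 0
      /Free
        // Load three of the 100 elements, deliberately out of order
        Codes(1) = 'PEAR';
        Codes(2) = 'APPLE';
        Codes(3) = 'MANGO';
        noOfElements = 3;

        // Sort only the loaded elements; elements 4 to 100 are not touched
        SortA %SubArr(Codes:1:noOfElements);
        // Codes(1) = 'APPLE', Codes(2) = 'MANGO', Codes(3) = 'PEAR'

        // Search only the loaded elements (start at 1, for noOfElements)
        Pos = %Lookup('MANGO':Codes:1:noOfElements);
        // Pos = 2
        *InLR = *On;
        Return;
      /End-Free
```

Without %SUBARR, the SORTA would process all 100 elements, and the 97 blank elements would sort ahead of the loaded codes.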
Arrays in Data Structures
Before looking at one of the eccentricities of defining arrays in a data structure, you should be aware that the maximum size of an array defined in a data structure is governed by the maximum size of a data structure (16M for an unnamed data structure or a named data structure in V6R1, 64K for a named data structure up to V5R4).
Figure 3 shows a strange feature of overlaying an array in a data structure. The Address array is defined as having 100 elements, each element being 87 characters in length. The rest of the data structure appears to define subfields that overlay the first element of the array (i.e., the combined length of the subfields is 87 characters). But such is not the case; since the subfields overlay an array, they are themselves arrays.
This means that any operation (such as SortA) executed on a subfield actually affects the overlaid (or base) array. In the example shown in Figure 3, the SortA of City sorts the Address array on the 61st through 80th characters of each element, and the SortA of State sorts the Address array on the 81st and 82nd characters of each element.
     D                 DS
     D  Address                      87a   Dim(100) Ascend
     D   Street1                     30a   Overlay(Address)
     D   Street2                     30a   Overlay(Address:*Next)
     D   City                        20a   Overlay(Address:*Next)
     D   State                        2a   Overlay(Address:*Next)
     D   Zip                          5a   Overlay(Address:*Next)
     D noLoaded        S             10i 0

        SortA %SubArr(City:1:noLoaded);     // Sort into City sequence
        SortA %SubArr(State:1:noLoaded);    // Sort into State sequence
Figure 3: Overlaying an array
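To make the effect concrete, here is a short sketch that loads two hypothetical addresses (the sample values are assumptions for illustration only) and sorts by City. The point to watch is that whole 87-byte elements move, so the other subfields travel with the City value.

```rpgle
     D                 DS
     D  Address                      87a   Dim(100) Ascend
     D   Street1                     30a   Overlay(Address)
     D   Street2                     30a   Overlay(Address:*Next)
     D   City                        20a   Overlay(Address:*Next)
     D   State                        2a   Overlay(Address:*Next)
     D   Zip                          5a   Overlay(Address:*Next)
     D noLoaded        S             10i 0
      /Free
        // Load two addresses, deliberately out of City order
        Street1(1) = '1 Main St';
        City(1) = 'Tampa';
        State(1) = 'FL';
        Street1(2) = '9 Oak Ave';
        City(2) = 'Boston';
        State(2) = 'MA';
        noLoaded = 2;

        // Sorting the City subfield reorders whole Address elements
        SortA %SubArr(City:1:noLoaded);
        // Now City(1) = 'Boston', Street1(1) = '9 Oak Ave', State(1) = 'MA'
        *InLR = *On;
        Return;
      /End-Free
```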
You can use the feature of overlaying an array in a data structure to allow users to sort lists of non-volatile data by selected columns.
Figure 4 shows a snippet of the DDS for a display file containing a load-all subfile. The Return Cursor Location (RTNCSRLOC) keyword is used to determine which record (CSRREC) and field (CSRFLD) the cursor is positioned on when the screen is returned (i.e., the names of the screen format and the field are returned in CSRREC and CSRFLD).
A R SUBREC SFL
A 54 SFLNXTCHG
A OPTION 1A B 7 3
A 31 DSPATR(RI PC)
A CCODE R O 7 11
A ANAME R O 7 22
A REFNUM R O 7 54
A STATUS R O 7 64
A R SUBCTL SFLCTL(SUBREC)
A 51 SFLDSP
A 52 SFLDSPCTL
A 53 SFLCLR
A 51 SFLEND(*MORE)
A RTNCSRLOC(*RECNAME &CSRREC &CSRFLD)
A CSRREC 10 H
A CSRFLD 10 H
Figure 4: Relevant DDS for sorting a subfile
Figure 5 shows the relevant portions of the subfile program. The main points to note are below (numbers correspond to the numbers in the figure):
1. An externally defined data structure (SubRecData) is used to store an image of a subfile record. The field names used in the subfile correspond to those input from the database files (CCODE, ANAME, REFNUM, STATUS).
2. AllSubRec is a 9999-element (the maximum number of records in a subfile) array where each element is an image of the SubRecData data structure.
3. Subfields that correspond to the fields in the subfile record overlay the AllSubRec array, so each of these is, in turn, an array. Note that there are no hard-coded lengths or data types in this data structure; all of the components of the data structure are defined using the Like keyword, and the use of *Next in the Overlay keyword means you need only be concerned with the sequence of the fields in the subfile record.
4. All required records are read from the customer file and added to the AllSubRec array. When records are input, the data is placed in the CCODE, ANAME, REFNUM, and STATUS fields in the SubRecData data structure; the data structure is then copied to the next array element. At the end of the routine, the field RecordsInSubfile indicates the number of elements loaded in the array.
5. The subfile is loaded directly from the AllSubRec array: Each element is moved to the SubRecData data structure, and the subfile record corresponding to the array element is written. The important point to note is that each element of the array is loaded to the corresponding subfile record.
6. Based on the value of the CSRFLD field set on the RTNCSRLOC keyword, the corresponding subfield is used to sort the AllSubRec array. The %SubArr BIF and the RecordsInSubfile field are used to ensure that only the loaded elements are sorted. Processing continues with the subfile being reloaded.
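       // Info for storing subfile data and sorting it
(1)  D SubRecData    E Ds                  ExtName(SUBSORTD:SUBREC)
     D                 Ds
(2)  D  AllSubRec                          Like(SubRecData)
     D                                     Dim(9999) Ascend
     D   ArrOption                         Like(Option) OverLay(AllSubRec)
(3)  D   ArrCode                           Like(CCode) OverLay(AllSubRec:*Next)
     D   ArrName                           Like(AName) OverLay(AllSubRec:*Next)
     D   ArrRefNum                         Like(RefNum) OverLay(AllSubRec:*Next)
     D   ArrStatus                         Like(Status) OverLay(AllSubRec:*Next)

        // Load every customer record into the array
        RRN = 0;
        Option = *Blanks;
        SetLL *Start Cust;
        Read Cust;
        Dow Not %EOF(Cust);
          RRN += 1;
(4)       AllSubRec(RRN) = SubRecData;
          Read Cust;
        EndDo;
        RecordsInSubfile = RRN;

        // Write one subfile record per loaded array element
(5)     For RRN = 1 To RecordsInSubfile;
          SubRecData = AllSubRec(RRN);
          Write SubRec;
        EndFor;

        // Sort on the column the cursor was positioned on
(6)     Select;
          When CsrFld = 'CCODE';
            SortA %SubArr(ArrCode:1:RecordsInSubfile);
          When CsrFld = 'ANAME';
            SortA %SubArr(ArrName:1:RecordsInSubfile);
          When CsrFld = 'REFNUM';
            SortA %SubArr(ArrRefNum:1:RecordsInSubfile);
          When CsrFld = 'STATUS';
            SortA %SubArr(ArrStatus:1:RecordsInSubfile);
        EndSl;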
Figure 5: Relevant RPG code for sorting a subfile
Although this technique is not suitable for volatile data, it does provide a very easy means to quickly re-sequence data for presentation purposes (a subfile in this case, but it could just as easily be for a Web page).
Data Structure Arrays
V5R1 saw the introduction of data structure arrays. At first glance, a data structure array appears to be nothing more than a slight enhancement to a multiple-occurrence data structure, but it is a lot more than that. Figure 6 shows the definition of a data structure array (ArrDs). The Dim keyword on the DS definition indicates there are 20 data structure elements, and the data structure must be qualified. To reference an element of the data structure array, you specify the index after the data structure name and before the qualifier for the subfield. The data structure itself may contain an array, which means you now have two-dimensional arrays, but it goes beyond that!
     D ArrDs           Ds                  Dim(20) Qualified
     D  Street                       30a   Dim(2)
     D  City                         20a
     D  State                         2a
     D  Zip                           5a

        ArrDs(1).Zip = ArrDs(2).Zip;
        ArrDs(1).Street(1) = ArrDs(2).Street(2);
Figure 6: A data structure array
Figure 7 shows the definition of a data structure array (Customers) that contains two data structure arrays (Home and Business). The LIKEDS keyword is used to define a data structure within a data structure, each of these data structures being an array as well. You now have a three-dimensional array!
     D Customers       Ds                  Dim(100) Qualified
     D  Home                               LikeDS(ArrDs) Dim(5)
     D  Business                           LikeDS(ArrDs) Dim(5)

        Customers(3).Home(2).Street(1) = Customers(4).Business(2).Street(1);
Figure 7: A data structure array within a data structure array
Yes, you can go to a fourth dimension if you wish. You are restricted only in that the outer data structure array may not exceed a total size of 16M (up to V5R4), and an individual data structure definition may not exceed 64K.
Sorting a Data Structure Array
Unfortunately, you may not use the SORTA operation to sort a data structure array, especially if you want to sort based on one of the subfields. Although you cannot sort the data in a data structure array directly, by using a pointer and defining another data structure, you can sort the data indirectly. Figure 8 shows the definition of the ArrDs data structure array containing 20 elements. This is followed by the definition of an unnamed data structure that is based on the pointer pArrDs, which is initialized to the address of ArrDs. In other words, the unnamed data structure overlays the ArrDs data structure array in memory. If you used a named data structure, you would be restricted to the 64K limit (up to V5R4).
The unnamed data structure contains an array (Address); each element is the length of an element of ArrDs, and it has the same number of elements as the ArrDs array. Because of the basing pointer, each element of the Address array overlays an element of the ArrDs array. The Address array is overlaid with subfields (which are in turn arrays themselves), and, as you saw earlier, these subfield arrays may be used to sort the Address array, which, in turn, means you are actually sorting the ArrDs array.
     D ArrDs           Ds                  Dim(20) Qualified
     D  Street                       30a   Dim(2)
     D  City                         20a
     D  State                         2a
     D  Zip                           5a
     D pArrDs          S               *   Inz(%Addr(ArrDs))
     D                 Ds                  Based(pArrDs)
     D  Address                            Like(ArrDs)
     D                                     Dim(%Elem(ArrDs)) Ascend
     D   ArrStreet1                        Like(ArrDs.Street)
     D                                     Overlay(Address)
     D   ArrStreet2                        Like(ArrDs.Street)
     D                                     Overlay(Address:*Next)
     D   ArrCity                           Like(ArrDs.City)
     D                                     Overlay(Address:*Next)
     D   ArrState                          Like(ArrDs.State)
     D                                     Overlay(Address:*Next)
     D   ArrZip                            Like(ArrDs.Zip)
     D                                     Overlay(Address:*Next)
Figure 8: Defining a data structure to sort a data structure array
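As a hedged sketch of how these definitions might be put to work, the program below loads two elements of ArrDs with hypothetical sample values and sorts them by city through the based overlay. The Like(ArrDs) and Overlay(Address:*Next) keywords are assumptions, chosen so that each Address element is the length of an ArrDs element as the text describes; the noLoaded field is also assumed.

```rpgle
     D ArrDs           Ds                  Dim(20) Qualified
     D  Street                       30a   Dim(2)
     D  City                         20a
     D  State                         2a
     D  Zip                           5a
     D pArrDs          S               *   Inz(%Addr(ArrDs))
     D                 Ds                  Based(pArrDs)
     D  Address                            Like(ArrDs)
     D                                     Dim(%Elem(ArrDs)) Ascend
     D   ArrStreet1                        Like(ArrDs.Street)
     D                                     Overlay(Address)
     D   ArrStreet2                        Like(ArrDs.Street)
     D                                     Overlay(Address:*Next)
     D   ArrCity                           Like(ArrDs.City)
     D                                     Overlay(Address:*Next)
     D   ArrState                          Like(ArrDs.State)
     D                                     Overlay(Address:*Next)
     D   ArrZip                            Like(ArrDs.Zip)
     D                                     Overlay(Address:*Next)
     D noLoaded        S             10i 0
      /Free
        // Load two elements through the data structure array
        ArrDs(1).City = 'Tampa';
        ArrDs(1).Zip = '33601';
        ArrDs(2).City = 'Boston';
        ArrDs(2).Zip = '02108';
        noLoaded = 2;

        // ArrCity shares storage with ArrDs, so this reorders ArrDs itself
        SortA %SubArr(ArrCity:1:noLoaded);
        // Now ArrDs(1).City = 'Boston' and ArrDs(1).Zip = '02108'
        *InLR = *On;
        Return;
      /End-Free
```

The data is loaded and read through the qualified ArrDs names; the based overlay exists only to give SORTA something it can work on.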
Breaking the 32,767 Limit
If you have a desire to use an array that exceeds 32,767 elements and you can't wait for V6R1, you can use a little bit of pointer math to bypass the limit. Say you have a requirement to store 100,000 elements in an array. Figure 9 shows the definition of an array (MyArr) containing 32,767 ten-byte elements. The array is based on a pointer (pMyArr). The %Alloc BIF is used to allocate the total storage (100,000 elements x 10 bytes) for the array; the use of the pMyArr pointer on the allocate means that MyArr overlays the first 327,670 bytes of the allocated memory. The For loop loops through all 100,000 elements. Each time the 32,767-element RPG limit is reached, pMyArr is changed to point to the next 32,767 elements, and the OffSet field is increased to ensure the index (j) is kept within the 1 to 32,767 boundary.
     D MyArr           S             10    Dim(32767) Based(pMyArr)

        pMyArr = %Alloc(1000000);
        OffSet = 0;
        For i = 1 to 100000;
          j = i - OffSet;
          If (j > %Elem(MyArr));
            // Slide the 32,767-element window forward over the storage
            OffSet += %Elem(MyArr);
            j = i - OffSet;
            pMyArr = pMyArr + %Size(MyArr:*All);
          EndIf;
          MyArr(j) = Value;
        EndFor;
Figure 9: Breaking the 32,767 barrier
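The loop in Figure 9 walks the storage sequentially. If you need to reach one element directly, a variation on the same pointer math is to base a single ten-byte field on a pointer computed from the element number. This is a sketch of an alternative, not the technique in Figure 9; the names Elem, pBase, and pElem are hypothetical.

```rpgle
     D Elem            S             10a   Based(pElem)
     D pBase           S               *
     D pElem           S               *
     D i               S             10i 0
      /Free
        // Allocate room for 100,000 ten-byte elements
        pBase = %Alloc(100000 * %Size(Elem));

        // Address element i directly: base + (i - 1) * element size
        i = 75000;
        pElem = pBase + ((i - 1) * %Size(Elem));
        Elem = 'Some value';

        // Release the storage using the original, unmodified pointer
        Dealloc pBase;
        *InLR = *On;
        Return;
      /End-Free
```

Keeping the pointer returned by %Alloc in its own field (pBase) means the storage can still be released after the working pointer has been moved.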
Use the C functions qsort and bsearch if you need to sort or perform a lookup on one of these extremely large arrays; examples are available in the Redbook Who Knew You Could Do That with RPG IV?
We have a historical tendency to refrain from loading a lot of information into memory. When we break the habit, we discover there are a lot of neat things we can get our programs to do with little or no performance overhead. Arrays play a key role in maximizing how our programs utilize memory. And some of it is just plain fun!
Last Updated on Wednesday, 21 May 2008 00:52