Arrays for Intelligent People Part 1

I like arrays. I always have. I don’t use them as often as I once did because RPG III and RPG IV have better ways of doing some of the things I used to do with arrays (like concatenation and substringing) and because DDS doesn’t support them. Even so, arrays are handy and are far from obsolete.

RPG of any flavor is a good language for working with one-dimensional arrays. One line of RPG is sometimes enough to accomplish a task that would require a loop in other languages.

In this article, I explain how to define arrays and discuss what operations you can do with arrays in RPG. I’ll cover two methods—the “new” method, used in RPG IV, and the “old” method, used in RPG II and RPG III. I’ll use RPG IV for my examples, however, since RPG IV has several features that RPG III doesn’t have (and never will have).

In a future article, I’ll discuss some of the ways you can put arrays to work, no matter what language you use.

Arrays Can Hold More than One Value

By default, a variable can hold only one value at a time. Such a variable is called a scalar variable. Storing a new value in a scalar variable wipes out the previous value.

An array variable can hold a list of values of the same data type and size. The values are distinguished by ordinal position within the list. That is, I can talk about the first value, or the fourth value, or the last value in the list. Each value in the list is called an element of the array and behaves like a scalar variable. The numeric value that indicates the element to be referenced is called an index or a subscript.

Many variables besides arrays can hold more than one value; these variables include tables, records, and data structures.

Tables differ from arrays in several ways: tables are not subscripted, table names always begin with the letters TAB, and an entire table may not be referenced in one operation. (A table name references only the element last retrieved with a lookup operation.)

If I remember correctly, tables were part of RPG first, and arrays were added later. Anything you can do with a table, you can do with an array, so there’s no need to use tables anymore.

Records and data structures differ from arrays in that they may contain values of different types, and the values are distinguished by name, not by ordinal position.

Defining Arrays

To define an array in RPG IV, add the DIM (dimension) keyword to a D (definition) specification. The only argument DIM requires is the number of elements in the array.

In RPG III and RPG II, you must use an E (extension) specification to define an array.

Whichever RPG you use, you must give the compiler certain information about each array: the name of the array, the number of elements in the array, and the size of each element. Figure 1 shows you the new and old ways to define an array called SLS.

You may also need to specify other information, such as the number of elements to be loaded from each record of a file or a compile-time dataset and the sequence in which the elements are to be stored. Figure 2 shows you what is allowed and what is required for the three types of arrays and how to specify this information in both versions of RPG. You may find this table helpful as you read through the rest of this article. (The meaning of “run-time,” “prerun-time,” and “compile-time” used in Figure 2 will become clear in a moment.)

Indexes

Indexes are specified differently in RPG IV than they are in RPG II or RPG III. In the “old” way to index an array, you code the array name, followed by a comma, followed by the index (which must be a numeric variable or constant).

In RPG IV, you place the index in parentheses after the array name. You may leave blanks between the array name and index if you like. Not surprisingly, RPG IV’s support for indexes is better. In RPG IV, the index may be any valid numeric expression that yields a positive integer value not greater than the number of elements in the array. For example, (X + 1) is a valid index (assuming there are at least X + 1 elements in the array, of course).
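
Here is a minimal sketch of the three kinds of RPG IV index, using the SLS array from Figure 1. The fields X and Total are names I made up for illustration, and the code is laid out in fixed format with the specification letter in position 6.

```rpgle
     D SLS             S              9  2 DIM(4)
     D X               S              5  0 INZ(1)
     D Total           S             11  2
      * The index may be a constant, a variable, or an expression
     C                   EVAL      Total = SLS(1)
     C                   EVAL      Total = Total + SLS(X)
     C                   EVAL      Total = Total + SLS(X + 1)
     C                   EVAL      *INLR = *ON
```

With X set to 1, the three statements reference elements 1, 1, and 2 of SLS.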

Loading Arrays

An array element can get a value in three ways: in calculations, at compilation time, and from a file at program initialization time. If you assign an initial value to an array, you aren’t stuck with it. You can assign one value to an array element at compilation time or prerun-time and then change the value of that element in calculations. Nevertheless, arrays are classified by these three means of initialization, so I will discuss each of them in more detail.

Run-time Arrays

In run-time arrays, the values are assigned to array elements in the calculations of the program. The initial values are the same as for scalars, i.e., zeros for numeric arrays, blanks for character arrays, etc. In RPG IV, you can use the INZ keyword to assign an initial value for all elements of an array. The initial value may be anything you like, as long as it’s a valid value of the array’s data type. RPG III doesn’t have this ability.
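
As a small illustration of INZ on a run-time array, consider the fragment below. The array names Rate and Code and their initial values are hypothetical; only the keyword usage is the point.

```rpgle
      * Every element of Rate starts at 1.5, every element of Code at 'N/A'
     D Rate            S              5  2 DIM(12) INZ(1.5)
     D Code            S              3    DIM(10) INZ('N/A')
```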

The SLS array defined in Figure 1 is a run-time array. Run-time arrays are the only array type that can be defined in RPG IV subprocedures. Of course, a subprocedure can access compile-time or prerun-time arrays defined in the module of which it is a part.

Compile-time Arrays

The initial values of a compile-time array are included in source code after all RPG specifications. Precede each compile-time array dataset with a line beginning with two asterisks and a blank. The compiler will ignore the rest of the line. On the following line enter the compile-time data.

In RPG II and RPG III, you must list the compile-time datasets in the same order in which the arrays they load appear in the E specs. You can use this method in RPG IV programs too, but I recommend a different way: follow the two asterisks immediately with the letters CTDATA, then a blank and the name of the array to be loaded.

As you can see in Figure 2, you must tell the compiler the number of elements to be found in each dataset record.

The StateAbbr array defined in Figure 3 is a compile-time array because of the presence of the CTDATA keyword. It has five elements in all. Each record of the dataset has values for five elements, so one record of data is enough.
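
The fragment below sketches why naming the array on the ** line is convenient when a program has more than one compile-time array: the datasets no longer have to follow the order of the definitions. The Zone array and its values are invented for this example; StateAbbr is the array from Figure 3.

```rpgle
     D StateAbbr       S              2    DIM(5) PERRCD(5) CTDATA
     D Zone            S              5    DIM(3) PERRCD(1) CTDATA
**CTDATA Zone
NORTH
SOUTH
WEST
**CTDATA StateAbbr
ALGALAMSTN
```

Zone is loaded one element per record, so its dataset takes three records, while StateAbbr still needs only one.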

Prerun-time Arrays

You can probably see one disadvantage to compile-time arrays: If you change the initial value of an element, you have to modify the source code and recompile the program.

Prerun-time arrays let you keep the array values in a separate file. At program initialization, the file is opened and the array is loaded. The file must be defined in the F specs with a T in the File Designation entry.

In Figure 4, StateAbbr is a prerun-time array. At program initialization, the system will load StateAbbr from one record of the STATES file.

You no longer have to recompile the program to change an element’s initial value, but you still have to maintain the file containing the data.

Alternating Arrays

You can interleave compile-time or prerun-time array values if you wish. That is, the system will read the first element of the primary array, the first element of the alternate array, the second element of the primary array, and so on, from the array dataset.

Figure 5 contains RPG IV code to define and initialize alternating compile-time arrays of the states whose names have four occurrences of one vowel, and no other vowels. Notice that the first two characters of the compile-time data, AL, are the value for element one of StateAbbr. The next twelve characters, “Alabama” padded with blanks, are the value for element one of StateName.

The arrays will not be stored in alternating format in memory. All elements of any array will be stored adjacent to one another.

In the old method, the primary and alternate arrays are defined on the same E spec. Positions 27 through 45 are used to define the primary array. Positions 46 through 57 of the E spec are used to define the alternate array.

Noncontiguous Arrays

There is a way to store data of two or more arrays in alternating format in memory. That is, element 1 of the first array is followed by element 1 of the second array, which is followed by element 2 of the first array, and so on. To store in this alternating format, use noncontiguous arrays. Refer to Figure 6 as I explain how to do that.

First, define a base array as a subfield within a data structure. In this case, the base array is SalesData. Then define other subfields that overlay the base array. Here, those subfields are Region, NbrOfSales, and Revenue.

The OVERLAY keyword has two parameters: the name of the subfield to be overlaid and the starting position within that subfield. The second parameter defaults to 1.

Last, notice that each element of the base array must be at least as long, in bytes, as the combined lengths of the corresponding noncontiguous array elements. Region has four-byte elements. NbrOfSales is a five-digit packed field, so each of its elements occupies three bytes, and Revenue, a seven-digit packed field, has four-byte elements. Therefore, SalesData must have elements of eleven bytes or more.

Basic Array Operations

Let’s consider the operations specifically designed for arrays.

Table Lookup

One of the most common ways to use an array is to search a table. (In this context, table doesn’t mean the RPG feature similar to arrays, but a generic term for related columnar information.) This is the principle you use when you look up someone’s phone number in a telephone book. You search a name column until you find the name of the person you want to contact. When you find a match, you look in the column to the right to get that person’s phone number.

Figure 7 illustrates this process. Here you search for a state name, given a state abbreviation in field ABBR. The LOOKUP operation looks for an exact match, because the resulting indicator is in the “equal” position. It begins searching at element 1, because the index, x, is set to 1 before the lookup begins.

This is the most common type of table lookup, but you can also search for the first element that is greater than, or the first element less than, the argument in Factor 1. You can also start the search with some element other than the first, by setting the index to a value other than 1 before the lookup begins.
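
A lookup of the kind described above might be coded along these lines. This is a sketch, not a reproduction of Figure 7: it assumes the StateAbbr and StateName arrays of Figure 5, and the fields Abbr, Name, and X, as well as the choice of indicator 90, are my own.

```rpgle
     D Abbr            S              2
     D Name            S             12
     D X               S              5  0
      * Start at element 1; 90 comes on if an equal match is found
     C                   EVAL      X = 1
     C     Abbr          LOOKUP    StateAbbr(X)                           90
     C                   IF        *IN90
     C                   EVAL      Name = StateName(X)
     C                   ELSE
     C                   EVAL      Name = *BLANKS
     C                   ENDIF
```

On a successful search, X holds the index of the matching element, so the same index retrieves the corresponding state name.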

Of course, there are other ways to search a table. You can store a table in a database file instead. The advantage of using an array is speed. The disadvantage of using an array is that you may have to recompile a program when you change the table.

As a rule, use a file to store a table. But if the table is heavily accessed and relatively static, an array will deliver better performance.

Sorting

You can use the SORTA op code to reorder the elements of an array. They will be sorted in either ascending or descending sequence, which you can specify in the defining E spec or by using the ASCEND or DESCEND keywords in the D spec.

Because blanks sort before all printable characters, you might want to initialize all elements of an array to *HIVAL when the program starts running. This way, SORTA places all unused array elements at the end of the array.

If you use SORTA with noncontiguous arrays, the base array is sorted in the sequence of the noncontiguous array. Look again at Figure 6. A calculation that says “SORTA NbrOfSales” will sort the entire SalesData array in ascending sequence by NbrOfSales.
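
The *HIVAL technique can be sketched in two lines. The array name Names and its dimensions are hypothetical, and the code that loads the elements is left out.

```rpgle
      * Unused elements keep *HIVAL, so they sort to the end
     D Names           S             20    DIM(100) INZ(*HIVAL)
      * ... load some of the elements here ...
     C                   SORTA     Names
```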

Crossfoot

The Crossfoot (XFOOT) opcode adds all the elements of a numeric array together and stores the sum in the result field. This comes in handy when you’re using an array to sum up totals.

For example, assume a program reads records containing (among other things) the total amount of a customer invoice and the territory in which that sale was made. Valid territory codes are 1 through 7, so you might use an eight-element array to accumulate sales. The first seven elements will be for valid territory numbers, and the last element will be for invalid territory numbers, just in case a data entry error slips through or a territory number is not entered for an invoice.

Figure 8 illustrates a portion of the code that would process that sales data. After all data has been processed, the XFOOT operation sums the territorial sales figures into the TotalSales variable.
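
The territory example could look something like the fragment below. It is a sketch under several assumptions: Territory and InvAmount stand for numeric fields read from the invoice record and are not defined here, and TerrSales and T are names I chose for the accumulator array and its index.

```rpgle
     D TerrSales       S             11  2 DIM(8)
     D TotalSales      S             13  2
     D T               S              3  0
      * For each invoice: pick the bucket, then accumulate
     C                   IF        Territory >= 1 and Territory <= 7
     C                   EVAL      T = Territory
     C                   ELSE
     C                   EVAL      T = 8
     C                   ENDIF
     C                   EVAL      TerrSales(T) = TerrSales(T) + InvAmount
      * After the last invoice, sum all eight buckets
     C                   XFOOT     TerrSales     TotalSales
```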

Array-to-Array and Scalar-to-Array Operations

I really like RPG’s support for arrays, especially the way certain opcodes commonly used with scalar variables work when used with arrays.

If I apply a scalar value to an unsubscripted array name, all elements are affected. In Figure 9, for example, the MOVE changes the last two characters of each element to “**” and the EVAL changes all elements to “UNDEFINED”.
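
A scalar-to-array operation of this kind can be sketched as follows. The arrays Arr01 and Arr02 and their sizes are assumptions made for illustration.

```rpgle
     D Arr01           S             10    DIM(6)
     D Arr02           S             10    DIM(6)
      * MOVE is right-justified: each element now ends in '**'
     C                   MOVE      '**'          Arr01
      * Every element of Arr02 becomes 'UNDEFINED' plus one blank
     C                   EVAL      Arr02 = 'UNDEFINED'
     C                   EVAL      *INLR = *ON
```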

If I apply one array to another, the two arrays are matched by element. The matching begins with the first element and proceeds through the number of elements in the array with fewer elements. In Figure 10, the MOVE changes four elements of ARR03. The EVAL changes the first four elements of ARR04 and leaves the last four elements unchanged.
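
The element-by-element matching can be sketched like this. The source array Src, the array sizes, and the initial value 7 are all invented for the example.

```rpgle
     D Src             S              5  0 DIM(4) INZ(7)
     D Arr03           S              5  0 DIM(6)
     D Arr04           S              5  0 DIM(8)
      * Elements 1 through 4 of each target become 7; the rest stay 0
     C                   MOVE      Src           Arr03
     C                   EVAL      Arr04 = Src
     C                   EVAL      *INLR = *ON
```

In both statements the shorter array, Src, determines how many elements are processed.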

Figure 11 contains a list of some of the op codes that can be used with scalar or array variables.

The MOVEA (Move Array) op code is interesting. It can copy from a scalar to an array, from an array to a scalar, or from one array to another. It copies data character by character, from left to right, ignoring element sizes. Figure 12 illustrates how this op code works. I’ve included the contents of ARRAY05 and ARRAY06 before the MOVEA operation in comments. Then look at the comment following MOVEA.
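
To see how MOVEA ignores element boundaries, consider this sketch. The array sizes and data are my own choices rather than those of Figure 12. Note that the literal in the first MOVEA fills all fourteen positions of Factor 2, so the result field follows it with no blank between them.

```rpgle
     D Array05         S              3    DIM(4)
     D Array06         S              2    DIM(6)
      * Scalar to array: Array05 = 'ABC' 'DEF' 'GHI' 'JKL'
     C                   MOVEA     'ABCDEFGHIJKL'Array05
      * Array to array: the same 12 characters are redistributed
     C                   MOVEA     Array05       Array06
      * Array06 = 'AB' 'CD' 'EF' 'GH' 'IJ' 'KL'
     C                   EVAL      *INLR = *ON
```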

But MOVEA has a couple of restrictions. First, in RPG III and RPG IV, you can’t use MOVEA with numeric fields. RPG II allows it, however.

Second, since the elements are not stored together in memory, MOVEA cannot be used with a noncontiguous array (one defined with the OVERLAY keyword) in RPG IV.

It Slices, It Dices...

Arrays are versatile. I’ve studied and used other types of data structures, including linked lists, sets, and graphs, and I find arrays the most useful of them all.

There’s more I could say about arrays, and I would have liked to have given more examples, but I’m out of room. I’ll talk more about arrays in a future issue. Now that I’ve shown how to define and manipulate arrays, I’ll concentrate on ways to put them to work.

Reference

ILE RPG/400 Reference, Version 3 (SC09-2077-01, CD-ROM QBJAQE01)

New way:
....+....1....+....2....+....3....+....4....+....5....+....6
     D SLS             S              9  2 DIM(4)

Old way:
....+....1....+....2....+....3....+....4....+....5....+....6
     E                    SLS         4  9 2

Figure 1: Defining an array in the various flavors of RPG

Function                   RPG II/III     RPG IV columns     Run-time  Prerun-time  Compile-time
                           columns        or keywords
Array name                 27-32          7-21               Required  Required     Required
Number of elements         36-39          DIM                Required  Required     Required
Element length             40-42          33-39              Required  Required     Required
Decimal positions          44             41-42              Allowed   Allowed      Allowed
Data format                43             EXTFMT             -         Allowed      Allowed
From file                  11-18          FROMFILE           -         Required     -
To file                    19-26          TOFILE             Allowed   Allowed      Allowed
Elements per record        33-35          PERRCD             -         Required     Required
Sequence                   45             ASCEND or DESCEND  Allowed   Allowed      Allowed
Alternating array          46-51          ALT                Allowed   Allowed      Allowed
Compile-time dataset name  Not supported  CTDATA             -         -            Allowed
Initial value              Not supported  INZ                Allowed   -            -

Figure 2: Array specification cross reference

     DStateAbbr        S              2    dim(5) perrcd(5) ctdata
**CTDATA StateAbbr
ALGALAMSTN

Figure 3: A compile-time array

     FStates    it   f   80        disk
     DStateAbbr        S              2    dim(5) perrcd(5) fromfile(States)

Figure 4: A prerun-time array

     DStateAbbr        S              2    dim(3) perrcd(2) ctdata
     DStateName        S             12    dim(%elem(StateAbbr))
     D                                     alt(StateAbbr)
**CTDATA StateAbbr
ALAlabama     MSMississippi
TNTennessee

Figure 5: Defining alternating arrays

     D                 DS
     D SalesData                     11    dim(5)
     D Region                         4    overlay(SalesData)
     D NbrOfSales                     5p 0 overlay(SalesData: 5)
     D Revenue                        7p 2 overlay(SalesData: 8)

Figure 6: Defining noncontiguous arrays
