Editor's Note: This week, we re-visit a couple of our readers' favorite past articles.
In a previous article, I talked about using the C language runtime function memset() to move a character repeatedly into a location in memory. In this issue, I'm going to show you how to take advantage of that function again, in conjunction with dynamically allocated memory. In fact, I'll go all the way to dynamic arrays by the time I'm finished.
When RPG IV was being designed, we heard a lot of requests for multidimensional arrays and for the ability to dynamically increase the number of elements in an array at runtime. Well, neither of these features made it into RPG. Today, the jury is still out on whether or not there's a real need for multidimensional arrays in RPG. The need for the ability to dynamically increase or decrease the number of elements allocated to an array is, however, blatantly obvious.
Of course, the simplest way to do this would be to ask IBM to add native dynamic array sizing to RPG IV. Unfortunately, that does not help us today. So we have to simulate that kind of support using a combination of RPG technologies, including pointers, based variables, and dynamic memory allocation.
Dynamic Memory Allocation
Normally, when a program starts, memory is allocated for all the fields used in the program, except those declared in procedures. RPG IV provides an additional feature in dynamic memory allocation.
Dynamic memory allocation is the allocation of memory to a program at runtime. The ALLOC (allocate) and REALLOC (re-allocate) operation codes perform memory allocation at runtime in RPG IV.
You might think the memory is returned to your program in the form of a field. Strictly speaking, it isn't. The ALLOC and REALLOC operation codes allocate the number of bytes of memory you tell them to allocate and return that memory via a memory address. This memory address identifies where the dynamic memory has been allocated on the system. The address itself must be stored in a pointer variable in your program in order to access the new memory. To access dynamically allocated memory, a based variable is most often used in RPG IV.
Pointers are nothing more than a field type in RPG IV. They are fields that contain numbers. Those numbers are, however, interpreted as a memory address in OS/400. So a pointer field contains an address. Since OS/400 supports 286 trillion addresses, our pointer variables must be large enough to support relatively huge addresses. In fact, of the conventional operating systems currently available, OS/400 supports the largest pointers.
Pointers are normally referred to by the number of bits they occupy. You may remember the old PC DOS days when pointers were just 16 bits. That's 2 bytes. Windows 95 through Windows 2000 use a combination of 16-bit and 32-bit pointers. Windows XP uses 32-bit pointers exclusively and the follow-on to Windows XP will reportedly use both 32-bit and 64-bit pointers.
OS/400 has supported 64-bit pointers since it was first shipped as CPF, back in 1981. So 20 years ago OS/400 had more advanced memory capabilities than today's latest and greatest Microsoft OS. But from 1981 through about 1995, OS/400 used only 48 bits of the available 64 bits. In the early 1990s, IBM started to rewrite much of the underlying operating system support (VMC) of OS/400 in C++. In doing so, OS/400 moved to a full 64-bit architecture, and there are hints that a 96-bit architecture based on 128-bit addressing would be a relatively simple achievement.
Okay, so who cares? Well, you should. The capability of OS/400 is far beyond that of most other operating systems. So if you have 4 GB of memory installed on your AS/400 or iSeries system today, that number may feel like 4 MB in three to five years. While IBM still can't figure out how to put a GUI on its most advanced operating system, you can rest assured it will continue to support the latest technologies.
In traditional RPG, when a field is declared, it is assigned enough storage to provide space for its data. This does not occur, however, for a based variable. The definition specification keyword BASED identifies a variable as being a based variable. A based variable is not assigned storage by the compiler. Based variables are considered to have no storage.
Based variables are based on a pointer, meaning their storage is directly related to the address stored in the based pointer. So if a pointer contains an address of something like 12, the based variable has access to the memory starting at address 12. So a based variable acts like a view port into the memory stored in its based pointer.
This concept can be confusing. Here's an analogy that may help. Think of a database physical file. A physical file contains the physical data stored in the file. Now think of a database logical file. A logical file is simply a view of the physical data; it has no data of its own. That is essentially what a pointer and a based variable represent. The physical storage is identified by the pointer variable; the based variable is simply a view of the data at the memory address stored in the pointer variable.
When an RPG program is called, the storage for the based variable is not allocated; hence, it makes no difference in performance or resource usage if a based variable is declared as 10 positions or 32767 positions. It is simply a view port to memory assigned somewhere else in the program.
Declaring a Pointer
Declaring a pointer variable in RPG IV is pretty easy; you simply declare a field and assign it a data type of "*" (asterisk). To assign a value to a pointer, use the %ADDR built-in function. The %ADDR built-in function returns the memory address of a regular field. For example, %ADDR(CUSTNAME) returns the address of the CUSTNAME field. To assign the address of CUSTNAME to a pointer field named PTR, use the RPG IV calculation specification shown in Figure 1.
Figure 1: Assigning an Address to a Pointer Field
At this point, the PTR field contains the address of the CUSTNAME field. So any field based on the PTR field can be used to access the memory of the CUSTNAME field. To declare a based variable over the PTR field, use the Definition specification shown in Figure 2.
Figure 2: Declaring a Pointer and Based Variables
Line 2 declares the pointer field named PTR. Line 3 declares a based-on variable named VIEWER. Note the length of VIEWER is only 10. This means that only 10 bytes are "viewable" at whatever address is stored in the PTR variable, when you view that data through the VIEWER variable.
Line 1 declares a regular field named CUSTNAME that contains 'Cozzi Consulting'. On line 2, the address of CUSTNAME is used as the initial value for the PTR field. As soon as the pointer is assigned, the data at the memory address stored in the pointer is accessible through the VIEWER variable. Hence, when the program begins, the content of each variable would be as follows:
CUSTNAME = 'Cozzi Consulting'
PTR = the address of the CUSTNAME field
VIEWER = 'Cozzi Cons'
Since VIEWER is only 10 positions in length, it can only "view" 10 positions of memory. Therefore, only the first 10 positions of CUSTNAME (in our example) are accessible through VIEWER.
Based variables are read/write variables. That is, they can be used just like other fields in RPG. You can move data to them and read data from them. Additionally, you may have more than one variable based over the same pointer.
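For readers who also know C, the CUSTNAME, PTR, and VIEWER relationship can be sketched there as well. This is only an analogy under stated assumptions, not the RPG code from the figures: the helper names viewer_equals and viewer_fill are hypothetical, and the 10-byte window is modeled as a pointer to a 10-character array.

```c
#include <assert.h>
#include <string.h>

/* CUSTNAME: a regular field that owns its own storage. */
static char custname[] = "Cozzi Consulting";

/* PTR: holds the address of CUSTNAME, much like PTR = %ADDR(CUSTNAME). */
static char *ptr = custname;

/* VIEWER: a 10-byte window over whatever PTR addresses (no storage of its own). */
typedef char viewer_t[10];

/* Returns 1 when the 10-byte window at ptr holds exactly the given text. */
static int viewer_equals(const char *text)
{
    assert(text != NULL);
    viewer_t *viewer = (viewer_t *)ptr;
    return strlen(text) == sizeof *viewer &&
           memcmp(*viewer, text, sizeof *viewer) == 0;
}

/* Writes through the window; the change lands in custname itself. */
static void viewer_fill(char c)
{
    viewer_t *viewer = (viewer_t *)ptr;
    memset(*viewer, c, sizeof *viewer);
}
```

Reading through the window sees only the first 10 bytes of the name, and writing through it changes only those 10 bytes of the underlying field.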
Memory Allocation in RPG IV
To access dynamic memory in RPG IV, you need a pointer variable and an optional based variable. To allocate memory dynamically, an API is typically called; however, in RPG, the ALLOC operation code can be used to easily allocate memory. For example, to allocate 100 bytes of memory at runtime, use the calculation specification shown in Figure 3.
Figure 3: Allocating Memory Dynamically
Factor 2 of the ALLOC operation code indicates the number of bytes of memory to dynamically allocate. The address of the new memory is copied to the Result field. The Result field must contain a pointer variable; Factor 2 can contain a field, a named constant, or a literal value.
Once memory has been allocated, it can be manipulated. Frequently, programmers initialize the new memory before using it to store data. This ensures that no unexpected data is stored in the allocated memory space.
While OS/400 often returns memory as a series of blank characters, it is not guaranteed to do so. Therefore, you should initialize any new memory that is allocated. By simply moving *BLANKS to the based variables, you can accomplish initialization with minimal effort. However, if the size of the allocated memory does not match the declared length of the based variable, you may have a learning experience.
For example, if a based variable is declared with a length of 15 and 5 bytes of memory are allocated, the based variable will provide access to an extra 10 bytes that are not "owned" by your program. So using the MOVE or EVAL operations to move blanks to a based variable can cause problems.
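One way to guard against that mismatch, sketched here in C as an analogy, is to clear no more bytes than were actually allocated, regardless of how long the view is declared. The function name blank_view and its parameters are hypothetical, and the ASCII blank used here stands in for the X'40' blank of an EBCDIC system.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Blank at most 'allocated' bytes, even if the view is declared longer.
   Returns the number of bytes actually written. */
static size_t blank_view(char *mem, size_t allocated, size_t declared_len)
{
    assert(mem != NULL);
    size_t n = declared_len < allocated ? declared_len : allocated;
    memset(mem, ' ', n);  /* never touches bytes beyond the allocation */
    return n;
}
```

With a declared length of 15 and only 5 bytes allocated, this writes 5 bytes and leaves the following 10 alone.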
In the last issue, I illustrated the memset() function. memset is a C runtime function that allows you to repeatedly copy a single character to a memory address. The prototype for memset is shown in Figure 4.
Figure 4: Prototype for the memset() Function
The memset function can be used on both conventional fields and dynamically allocated memory. Once dynamic memory has been allocated, use memset and specify the character to be moved into the new memory location along with the size of the memory (byte count). This will copy the specified character repeatedly to the memory location. Figure 5 shows an example.
Figure 5: Allocating and Initializing Dynamic Memory
The ALLOC operation (line 1) allocates 100 bytes of memory. The address of those 100 bytes is returned to the PTR field. Then the memset function (line 2) is used to copy hex 40s (i.e., blanks) to the new memory. The number 100 in the third parameter of the memset operation indicates the number of times the X'40' character is repeatedly moved. After the two operations in Figure 5 are performed, there are 100 bytes of memory allocated, and that memory is initialized to blanks.
Note: Although not a standard, using hexadecimal values for the second parameter of memset is common practice.
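The same allocate-then-initialize pair looks like this when written directly in C, where malloc plays the role of ALLOC. Treat it as a rough sketch: alloc_blank and EBCDIC_BLANK are names made up for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define EBCDIC_BLANK 0x40  /* X'40', the blank character on an EBCDIC system */

/* Allocate 'size' bytes and fill every byte with X'40'.
   Returns NULL if the allocation fails; the caller must free the result. */
static unsigned char *alloc_blank(size_t size)
{
    assert(size > 0);
    unsigned char *ptr = malloc(size);     /* counterpart of ALLOC */
    if (ptr != NULL)
        memset(ptr, EBCDIC_BLANK, size);   /* repeat X'40' 'size' times */
    return ptr;
}
```

Calling alloc_blank(100) mirrors the two operations described above: 100 bytes are allocated and every one of them is set to a blank.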
Now that you know how to allocate and initialize memory dynamically, apply that knowledge to dynamically allocating array elements.
In RPG IV, arrays are declared the same as any other field. The only difference is the addition of the DIM (dimension) keyword to indicate (a) that this field is actually an array, and (b) the number of elements for the array.
Normally, when an array of, say, 1,000 elements is declared, the program automatically allocates enough storage for all 1,000 elements. If each element is 10 positions in length, a 1,000-element array would require 10,000 bytes of memory, as shown in Figure 6.
Figure 6: Traditional Fixed-Element Array Declaration
To make this array have a variable number of elements, you must cause the memory to be allocated dynamically. To do that, you have to change something in the declaration so that the MYARRAY field is a based variable. That way, no storage is assigned to the field/array when the program is started, as the example in Figure 7 shows.
Figure 7: Array Declaration as a Based Variable
The storage for the array is no longer automatically allocated by the program at start-up time. Instead, the array becomes a view port to the memory at the address stored in the pointer field named pArray. Therefore, if you increase the number of elements available to the array, no additional overhead is incurred. Figure 8 shows an example.
Figure 8: Large Element Count Array as a Based Variable
An array with 32,000 elements of 10 bytes each would occupy 320,000 bytes, or about 1/3 of a megabyte. But since the array is based on the pArray pointer, no storage is allocated; hence, its size has no adverse impact on the program's resources or overhead.
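The arithmetic behind those figures can be checked with a small C sketch. It is an analogy only: a pointer to the elements stands in for the based array, and the names element_t, fixed_array_bytes, and based_array_bytes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef char element_t[10];  /* one 10-byte array element */

/* Bytes an ordinary (non-based) array of nElems elements would occupy. */
static size_t fixed_array_bytes(size_t nElems)
{
    assert(nElems > 0);
    return nElems * sizeof(element_t);
}

/* Bytes declared when the program holds only a pointer to the elements,
   the C counterpart of a based array: the same for any element count. */
static size_t based_array_bytes(void)
{
    return sizeof(element_t *);
}
```

For 1,000 elements the fixed array needs 10,000 bytes and for 32,000 elements it needs 320,000 bytes, while the pointer-only declaration stays the size of one pointer.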
So how do you put the dynamic in dynamic array elements? You simply put together all the components I've talked about in this issue:
- Based variables
- Allocate operation
- Reallocate operation
Actually, it's pretty easy to create an array that supports a dynamic number of elements once you have all the pieces. The final step is in allocating and monitoring the number of elements desired. The easiest way to do that is with a numeric variable that contains the current element count.
Figure 9 contains a short excerpt of a program that utilizes a dynamic number of elements for an array. First, 100 elements are allocated, and then 7,500 elements are allocated. Field nELEMS maintains the number of elements available, and field nMEMSIZE controls the amount of memory required to allocate the nELEMS element count.
Figure 9: Dynamic Array Elements
Line 1 in Figure 9 declares a pointer variable. Interestingly, RPG does not require an explicit declaration of the pointer variable when the pointer is specified in the BASED keyword. But I would declare it just for completeness and to avoid confusion. Line 2 declares an array that is also a based variable; hence, no memory is allocated to this variable. It contains a DIM statement that allows RPG to address up to 32,000 array elements.
The rest of the variables are used to control the number of bytes of memory allocated and the number of array elements being used. You should never access memory outside of that which has been allocated. So use these types of fields to ensure that memory access violations don't occur in your code.
Line 7 performs a calculation that determines the number of bytes of memory needed to accommodate the desired number of array elements. This same calculation is performed again on line 12, after it is determined that the number of elements needs to change.
Lines 8 and 9 allocate enough memory for the number of elements needed and then initialize that memory. Then, on lines 12 through 14, a reallocation is performed. What would be the point of having a dynamic number of elements if you don't change the element count at some point during the program?
Line 12 recalculates the number of bytes of memory needed to accommodate the new element count. Then, on line 13, the REALLOC operation code allocates the new memory size.
The REALLOC operation works like the ALLOC operation except it allows an existing memory address to be given and virtually expanded or shrunk. That is, the memory size can grow or shrink using REALLOC by providing a pointer variable in the Result field. The new memory is allocated, and a new address is returned. To your program, it appears as though the memory allocation has grown or shrunk as requested.
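In C, realloc behaves in much the same way, so the grow-from-100-to-7,500 scenario can be sketched as follows. This is an illustration under assumptions rather than a translation of Figure 9: resize_elems is a hypothetical helper, nMemSize borrows the article's field name, and the ASCII blank stands in for X'40'.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define ELEM_LEN 10   /* bytes per array element */
#define BLANK    ' '  /* would be X'40' on an EBCDIC system */

/* Resize the block at *pArray from old_elems to new_elems elements and
   blank any newly added elements. Returns 0 on success, or -1 on failure,
   in which case *pArray is left untouched. */
static int resize_elems(char **pArray, size_t old_elems, size_t new_elems)
{
    assert(pArray != NULL && new_elems > 0);
    size_t nMemSize = new_elems * ELEM_LEN;     /* bytes now required */
    char *moved = realloc(*pArray, nMemSize);   /* counterpart of REALLOC */
    if (moved == NULL)
        return -1;
    if (new_elems > old_elems)                  /* initialize only the new tail */
        memset(moved + old_elems * ELEM_LEN, BLANK,
               (new_elems - old_elems) * ELEM_LEN);
    *pArray = moved;                            /* the address may have changed */
    return 0;
}
```

In C, the existing elements keep their contents across the call, and the caller must always use the pointer handed back because the block may have moved.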
Finally, the DEALLOC operation (line 16) returns the previously allocated memory back to the operating system. This is a critical operation. All allocated memory must be returned to the operating system using DEALLOC. Failure to deallocate previously allocated memory will result in memory leaks. And creating memory leaks qualifies you to write code for Microsoft Windows. :) Fortunately, when a job on the AS/400 ends, any memory leaks are returned to the operating system. Likewise, when the activation group in which the program was running ends, memory is also released to the operating system.
In conclusion, to dynamically allocate memory, you need to do the following:
- Declare a based variable that references a pointer field.
- Determine the number of bytes of memory needed.
- Use ALLOC to initially allocate the memory needed.
- Keep track of the memory.
- Use REALLOC to change the memory allocation size.
- Clear the memory using memset or another function.
- Always use DEALLOC to return the memory to the operating system.
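The checklist above can be pulled together in one small C sketch, with malloc/realloc/free standing in for ALLOC/REALLOC/DEALLOC. Everything here is hypothetical scaffolding for illustration (the dyn_array struct and the dyn_resize, dyn_put, and dyn_free helpers); it assumes 10-byte elements and zero-based indexes, and it uses the ASCII blank in place of X'40'.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define ELEM_LEN 10  /* bytes per element */

/* Tracker: the pointer plus the number of elements it currently backs. */
struct dyn_array {
    char  *pArray;
    size_t nElems;
};

/* Allocate or reallocate to nElems elements and blank any new ones. */
static int dyn_resize(struct dyn_array *a, size_t nElems)
{
    assert(a != NULL && nElems > 0);
    char *p = realloc(a->pArray, nElems * ELEM_LEN);
    if (p == NULL)
        return -1;                       /* old block is still valid */
    if (nElems > a->nElems)
        memset(p + a->nElems * ELEM_LEN, ' ',
               (nElems - a->nElems) * ELEM_LEN);
    a->pArray = p;
    a->nElems = nElems;                  /* keep track of the memory */
    return 0;
}

/* Store text in one element, blank padded; refuse out-of-range indexes. */
static int dyn_put(struct dyn_array *a, size_t index, const char *text)
{
    assert(a != NULL && text != NULL);
    if (index >= a->nElems)
        return -1;                       /* outside the allocated memory */
    char *elem = a->pArray + index * ELEM_LEN;
    size_t len = strlen(text);
    if (len > ELEM_LEN)
        len = ELEM_LEN;
    memset(elem, ' ', ELEM_LEN);
    memcpy(elem, text, len);
    return 0;
}

/* Return the memory and reset the tracker (counterpart of DEALLOC). */
static void dyn_free(struct dyn_array *a)
{
    assert(a != NULL);
    free(a->pArray);
    a->pArray = NULL;
    a->nElems = 0;
}
```

Keeping the element count next to the pointer is what lets dyn_put reject an index that falls outside the memory currently allocated.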
Bob Cozzi has been programming in RPG since 1978. Since then, he has written many articles and several books, including The Modern RPG Language--the most widely used RPG reference manual in the world. Bob is also a very popular speaker at industry events such as RPG World and is the author of his own Web site and of the RPG ToolKit, an add-on library for RPG IV programmers.