The Copy Commands
by Craig Pelkie
A great deal of programming involves selecting and copying data from one file to another. Most operating systems provide a copy command that lets you make a copy of the entire file. Some systems, such as the System/36, allow you to perform rudimentary selection during the copy. Few systems feature a copy command as powerful as the AS/400 does.
If you are familiar with the S/36, you know that COPYDATA is very limited in its function. There are two major drawbacks - you can specify only one selection criterion, and you cannot use COPYDATA to add to an existing file.
When COPYDATA does not provide the functions we need, we can use either the S/36 sort or write short programs to solve the problem. I have used both techniques extensively, but more often I use sorts rather than programs. That way, I can make changes in OCL, rather than having to recompile RPG programs. Either of these techniques lets us select records and add to existing files, but I have always preferred using a single procedure, rather than the COPYDATA/sort/program triumvirate.
The CPYF Command
The AS/400 offers the CPYF (Copy File) command. This command provides many more options than COPYDATA, and in most cases can be used instead of a sort or custom program. In addition to copying from one disk file to another, you can use CPYF to copy to the printer (to perform "dumps"), and you can use CPYF or one of its variants (discussed below) to copy to and from diskettes and tape, and to copy source files.
If you have access to the CL Reference manuals, you should turn to the section on the CPYF command and study the command syntax diagram. Although this may look like a very complicated command, you will see that most of the parameters are in functional sets. You select the parameters based upon the function you need.
As the diagram shows, the simplest CPYF that you can issue requires only two parameters: the FROMFILE and the TOFILE. However, if you try to use CPYF with just those two parameters when copying disk files, you encounter the first exception right away. Because CPYF lets you add to an existing file, you must specify whether you want to add records or replace records already in the file. So for a disk copy, you need to specify the MBROPT (Member Option) parameter as *ADD or *REPLACE.
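As a minimal sketch of the two choices just described, the commands below copy one disk file to another, first adding to the target and then replacing its contents. The library and file names (MYLIB, CUSTMAST, CUSTHIST) are hypothetical; substitute your own.

```cl
/* Hypothetical names: MYLIB, CUSTMAST, CUSTHIST */

/* Add the copied records after those already in CUSTHIST */
CPYF FROMFILE(MYLIB/CUSTMAST) TOFILE(MYLIB/CUSTHIST) MBROPT(*ADD)

/* Clear CUSTHIST first, then copy the records into it */
CPYF FROMFILE(MYLIB/CUSTMAST) TOFILE(MYLIB/CUSTHIST) MBROPT(*REPLACE)
```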
Another option you can use with disk copies involves specifying that a new file is to be created with the CRTFILE (Create File) parameter. This parameter has *NO or *YES selections, with *NO as the default. If you specify CRTFILE(*YES) then you don't have to specify a MBROPT, since *ADD is assumed. However, if you do not specify a CRTFILE option, or use *NO, then you must specify a MBROPT selection. Confusing? You should probably run CPYF a few times with some test files, and try different combinations of these parameters.
One point you must keep in mind with the TOFILE parameter is that the file you copy to must be a device file (printer, diskette or tape) or a physical file. You cannot copy to a logical file. However, to get the effect of a sort operation, you can copy into a keyed physical file, or you can simply create a new logical file over the physical file when CPYF is done.
I mention "create a new logical file" because you will probably find that this operation goes more quickly if you copy into a physical file that has no logical files over it and then create the logicals, rather than copy into a physical file that has logicals over it. You will probably notice a difference in performance only when copying great numbers of records, perhaps 100,000 or more. Such extensive copying is (presumably) being done as a batch job, so it may be feasible to delete any existing logicals and then recreate them when the copy is done. This may be complicated if you have many logical files over the physical file into which you copy, so you must study your situation quite carefully before deciding which approach to use.
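The "copy first, build the logical afterward" approach might look something like the sketch below. All object names are hypothetical, and the CRTLF step assumes that DDS for the logical file already exists in a source file named QDDSSRC.

```cl
/* Hypothetical names: MYLIB, CUSTMAST, CUSTNEW, CUSTNEWL1, QDDSSRC */

/* CUSTNEW does not exist yet, so let CPYF create it */
CPYF FROMFILE(MYLIB/CUSTMAST) TOFILE(MYLIB/CUSTNEW) CRTFILE(*YES)

/* Build the logical file only after the copy has finished.       */
/* Assumes member CUSTNEWL1 in MYLIB/QDDSSRC holds the DDS for it */
CRTLF FILE(MYLIB/CUSTNEWL1) SRCFILE(MYLIB/QDDSSRC)
```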
In any case, if you are copying from or to a multiple member file, then you must pay attention to FROMMBR and TOMBR (From Member and To Member). The default is to copy from the first member to the first member. The "first" member is not necessarily the first in the collating sequence, but rather, it is the first member that was added to the file. If you are using multiple member names, then you will usually want to specify the member name explicitly.
Another option with FROMMBR/TOMBR is to copy many members in one operation. On the FROMMBR parameter, you can specify either *ALL (to copy all members) or enter a generic name. For the TOMBR parameter, you can specify TOMBR(*FROMMBR), meaning that the copy is from a member to a corresponding member in the TOFILE. If you copy from a member and the TOFILE does not already contain a member of the same name, the system adds the member to the TOFILE. Finally, you can use the FROMMBR/TOMBR parameter to copy from many members into one member in the TOFILE. To do this, you specify FROMMBR(*ALL), or a generic name, and simply name a specific TOMBR.
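To make the member options concrete, here is a sketch of the two multiple-member cases. The file and member names are hypothetical, and the trailing plus sign is simply the CL continuation character you would use in a CL program.

```cl
/* Hypothetical names: MYLIB, ORDERS, ORDHIST, ORDALL, JANUARY */

/* Copy every member to a like-named member in ORDHIST */
CPYF FROMFILE(MYLIB/ORDERS) TOFILE(MYLIB/ORDHIST) +
     FROMMBR(*ALL) TOMBR(*FROMMBR) MBROPT(*REPLACE)

/* Combine all members whose names begin with JAN into one member */
CPYF FROMFILE(MYLIB/ORDERS) TOFILE(MYLIB/ORDALL) +
     FROMMBR(JAN*) TOMBR(JANUARY) MBROPT(*ADD)
```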
Selecting By Record
One of the simplest CPYF selections you can use is to select the from and through records with the FROMRCD (From Record) and TORCD (To Record) parameters. Closely allied with FROMRCD and TORCD is NBRRCDS (Number of Records). The defaults on these parameters are FROMRCD(*START), to start at the beginning of the file; TORCD(*END), to copy through to the end; and NBRRCDS(*END), to copy all records. I have used these parameters when I simply wanted to dump some records in the file, and did not particularly care about any other selection criteria. For example, to dump one hundred records from the file, I might specify FROMRCD(101) NBRRCDS(100), or alternatively FROMRCD(101) TORCD(200). This is a quick, simple way to select a subset of your file when you need some records for testing or examination.
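Written out as complete commands, the two equivalent selections from the example above might look like this. The file names are hypothetical; CUSTTEST stands in for whatever test file you copy into.

```cl
/* Hypothetical names: MYLIB, CUSTMAST, CUSTTEST */

/* Records 101 through 200, given as a start and a count */
CPYF FROMFILE(MYLIB/CUSTMAST) TOFILE(MYLIB/CUSTTEST) MBROPT(*REPLACE) +
     FROMRCD(101) NBRRCDS(100)

/* The same 100 records, given as a start and an end */
CPYF FROMFILE(MYLIB/CUSTMAST) TOFILE(MYLIB/CUSTTEST) MBROPT(*REPLACE) +
     FROMRCD(101) TORCD(200)
```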
Like a S/36
The CPYF command includes a parameter that is similar to the S/36 COPYDATA. With INCCHAR (Include character), you can select records based upon a character string. This parameter works the same as the S/36 parameter: specify the starting position and the character string that you want to include by. The character string can be up to 256 characters long.
Because of externally defined files, you can also specify that the character comparison is to take place within a specific field. If you specify a field name, the comparison start position is the offset within that field.
I do not know why, but you can use only one INCCHAR selection per CPYF. So without using another selection technique, this is still similar to the S/36 limitation.
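A sketch of both forms of INCCHAR follows. The file names, the field name STATE, the start position and the comparison value are all made up for illustration. The *RCD keyword (compare by position within the record) and the *EQ operator are how I understand the syntax; check the CL Reference syntax diagram before relying on them.

```cl
/* Hypothetical names: MYLIB, CUSTMAST, CUSTWORK, field STATE */

/* Compare by record position: include records with CA */
/* starting in position 41 of the record               */
CPYF FROMFILE(MYLIB/CUSTMAST) TOFILE(MYLIB/CUSTWORK) MBROPT(*REPLACE) +
     INCCHAR(*RCD 41 *EQ 'CA')

/* Compare within a field: position 1 is the first byte of STATE */
CPYF FROMFILE(MYLIB/CUSTMAST) TOFILE(MYLIB/CUSTWORK) MBROPT(*REPLACE) +
     INCCHAR(STATE 1 *EQ 'CA')
```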
If you are using externally defined files, you can make use of the INCREL (Include Relationships) selection parameter. With this parameter, you can specify up to fifty conditions for your selections. You can specify AND/OR logic, and you use the relationship operators (equal, not equal, and so on) to specify each condition.
To use INCREL, you specify the first "include set" with the *IF prefix. For example, specify INCREL((*IF DELETE *NE 'D')). In that example, we specified that records are to be included if the field named DELETE is not equal to the letter "D". The double parentheses are used, since the INCREL parameter is a "list within a list" parameter.
You specify additional conditions with the logical conjunctives. For example, we can add another selection to our test, like this: INCREL((*IF DELETE *NE 'D') (*AND CURBAL *GE 500.00)). You can continue on, up to the maximum number of conditions.
INCREL can become quite complicated, especially if you have many conditions. I virtually always use command prompting when entering an INCREL set. When you use the prompter, you can use Command 14 to see the "command string." The command string shows you what the command looks like so far, just as if you had typed it without prompting. I like to use Command 14 when entering a long list of conditions so that I can review what I have entered so far, checking that I haven't missed any.
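Here is the INCREL example from above as a complete command, extended with an *OR condition. The file names and the STATE field are hypothetical. As I understand the rules, *AND conditions are grouped together and each *OR starts a new group, so this selects records that are either (not deleted and with a balance of at least 500.00) or in state CA.

```cl
/* Hypothetical names: MYLIB, CUSTMAST, CUSTWORK, field STATE */
/* DELETE and CURBAL are the field names used in the text     */

CPYF FROMFILE(MYLIB/CUSTMAST) TOFILE(MYLIB/CUSTWORK) MBROPT(*REPLACE) +
     INCREL((*IF DELETE *NE 'D') (*AND CURBAL *GE 500.00) +
            (*OR STATE *EQ 'CA'))
```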
Copying By Key
One of the most complicated but potentially most powerful selections that you can use is the FROMKEY/TOKEY set. This lets you specify a starting and ending key. In many cases, you can get the same results with the simpler INCREL selection, but there may be times when you need to select by key. To use this parameter, the FROMFILE must be a keyed file. Also, you cannot use the FROMRCD/TORCD parameter set with FROMKEY/TOKEY.
There are two methods that you can use to specify From and To keys, neither of them particularly intuitive. In the first method, you specify first the number of key fields, then a character string representing the key with which you want to begin. This technique is not so bad if your key fields are alphanumeric or unpacked, but if you use packed or binary fields in your key, then you may have to specify the key in hexadecimal format. Remember that AS/400 numeric fields default to packed, so it is probable that any numeric keys you have are packed. (Since there are very few reasons to do so, you should not override this default when describing your external files. The problems associated with this CPYF parameter are certainly not a reason to change the default.)
For example, to select on a single alphanumeric key field, you might specify FROMKEY(1 'MIDRANGE '), to start at the keys beginning with the character string "MIDRANGE". However, to select by key for customer number 028179 in a packed numeric field, you would have to do something like this: FROMKEY(1 X'0028179F'). That is hexadecimal format; the leading "X" indicates hex and the trailing "F" gives the sign. A six-digit packed field occupies four bytes, which is why the value is padded with a leading zero.
The other way of specifying a key selection is with the *BLDKEY (Build Key) option. This lets you specify up to fifty key fields, and tells CPYF to take care of any necessary formatting. For example, to select our customer as in the example above, we could specify FROMKEY(*BLDKEY 028179).
When you specify a FROMKEY, you can also specify a TOKEY. The TOKEY parameter follows the same rules as FROMKEY.
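Putting FROMKEY and TOKEY together, a range copy might be sketched as follows. The file names and the ending customer number are hypothetical, and the hex form assumes the key is a single six-digit packed field, as in the example above. The *BLDKEY form follows the syntax shown earlier in this article.

```cl
/* Hypothetical names: MYLIB, CUSTMAST, CUSTWORK              */
/* Assumes CUSTMAST is keyed on one packed field, 6 digits    */

/* Customers 028179 through 028999, keys given in hexadecimal */
CPYF FROMFILE(MYLIB/CUSTMAST) TOFILE(MYLIB/CUSTWORK) MBROPT(*REPLACE) +
     FROMKEY(1 X'0028179F') TOKEY(1 X'0028999F')

/* The same range, letting CPYF format the key values */
CPYF FROMFILE(MYLIB/CUSTMAST) TOFILE(MYLIB/CUSTWORK) MBROPT(*REPLACE) +
     FROMKEY(*BLDKEY 028179) TOKEY(*BLDKEY 028999)
```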
I think that I have used FROMKEY/TOKEY only a handful of times over ten years with the CPYF command. I have always been able to use INCREL to select the records I want. However, you may want to keep this selection set in mind, since it works more quickly than INCREL, especially if the file is large. That is because the system can read directly by key, rather than having to read every record and determine whether it meets the INCREL selections.
I make frequent use of CPYF for dumping files to the printer. I realize that this may be a rather unseemly technique, in this day of screen browsers and queries, but sometimes I like to be able to work with several pages of hex dump. I have tracked down more than my share of bugs with hex dumps, and usually get the dump first, rather than as a last resort.
To copy to the printer, you specify TOFILE(*PRINT). There is an additional parameter, OUTFMT (Output Format), that you can use. This is specified as either *CHAR or *HEX, for character output or hexadecimal output. The default, *CHAR, is actually useful only when the file is all or mostly text data. For purposes of examining data, I always specify the *HEX format, which prints each byte of the file on three lines: the first line is the EBCDIC character (if there is any), and immediately below are two lines for the first and second hexadecimal characters.
You should be careful when using the TOFILE(*PRINT) selection, especially if the file is sizeable. You might end up generating a great deal of waste paper. For big files, I usually also use INCREL, or just specify a FROMRCD/TORCD set.
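A hex dump limited to a manageable number of records might look like the sketch below. The file name and the record range are hypothetical. No MBROPT is needed here, since the copy goes to the printer rather than to a disk file.

```cl
/* Hypothetical names: MYLIB, CUSTMAST */

/* Print the first 50 records in hexadecimal format */
CPYF FROMFILE(MYLIB/CUSTMAST) TOFILE(*PRINT) OUTFMT(*HEX) +
     FROMRCD(1) TORCD(50)
```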
One of the greatest features of the CPYF command is the FMTOPT (Format Option) parameter. To get the maximum benefit from this, you must be working with external files. If this feature doesn't motivate you to convert to external, I don't know what will. The big attraction with FMTOPT is that you can copy from an old version of a file to a newer version, preserving data in the correct fields. So what? Well, you will use this when you need to add or rearrange or remove fields from your file definition, or even change the fields (anybody ever had to expand their customer number?). In days of old, doing this type of thing meant writing a "one-timer."
With FMTOPT, you can specify that fields are to be "mapped" or "dropped," with the *MAP and *DROP selections. *MAP means that like-named fields are to be copied to each other. The fields do not have to be in the same absolute or relative positions in each file; the system simply looks for the same names. If you have added one or more fields to the TOFILE, those fields are properly initialized. Specifically, numeric fields are set to valid numeric values, not to blanks. If you have removed one or more fields in the TOFILE, you specify *DROP. That tells the CPYF command to accept the copy, and not to stop because the same fields are not present in both files.
Another FMTOPT selection is *NOCHK. This specifies that data is simply copied from one file to another, by position. I have frequently used this to convert a program-described file to an externally described file. To do that, you must be certain that you have accounted for all positions in both files, and that your data is "clean," meaning that all numeric fields are properly set. You can use this in stages to create an external file: make your first pass with *NOCHK to copy into a "work" externally described file, then make another pass from the work file to your final file, with *MAP and *DROP.
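The following sketch shows a straight old-to-new conversion and then the two-stage approach just described. All file names are hypothetical: CUSTOLD and CUSTNEW are two versions of an externally described file, CUSTFLAT is a program-described file, and CUSTWRK is the work file whose layout matches CUSTFLAT position for position.

```cl
/* Hypothetical names: MYLIB, CUSTOLD, CUSTNEW, CUSTFLAT, CUSTWRK */

/* Old layout to new layout: match fields by name, and */
/* accept that some FROMFILE fields no longer exist    */
CPYF FROMFILE(MYLIB/CUSTOLD) TOFILE(MYLIB/CUSTNEW) MBROPT(*REPLACE) +
     FMTOPT(*MAP *DROP)

/* Stage 1: copy by position into the work file */
CPYF FROMFILE(MYLIB/CUSTFLAT) TOFILE(MYLIB/CUSTWRK) MBROPT(*REPLACE) +
     FMTOPT(*NOCHK)

/* Stage 2: copy by field name into the final file */
CPYF FROMFILE(MYLIB/CUSTWRK) TOFILE(MYLIB/CUSTNEW) MBROPT(*REPLACE) +
     FMTOPT(*MAP *DROP)
```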
Finally, you can use FMTOPT to replace the S/36 library commands, FROMLIBR and TOLIBR. The selection is FMTOPT(*CVTSRC) to convert from source files to data base files, and back. When you use *CVTSRC, you may also use the SRCOPT (Source Option) and SRCSEQ (Source Sequence) parameters: those specify that you want to renumber or reset the date on source records.
Other CPYF Parameters
There are a few other CPYF parameters that I will review here. I have rarely used these, but you should be aware of them. PRINT lets you print copied or excluded records. For example, you might be copying disk to disk, and want to keep a listing of records that were not included.
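As a sketch of the listing described above, the command below copies the selected records and prints the ones that were left out. The file names are hypothetical, and *EXCLD is the PRINT value as I recall it from the CL Reference; confirm it against the syntax diagram.

```cl
/* Hypothetical names: MYLIB, CUSTMAST, CUSTWORK */

/* Copy the records that pass the test, and print a */
/* listing of the records that were excluded        */
CPYF FROMFILE(MYLIB/CUSTMAST) TOFILE(MYLIB/CUSTWORK) MBROPT(*REPLACE) +
     INCREL((*IF DELETE *NE 'D')) PRINT(*EXCLD)
```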
If you are copying from a multiple format logical file, you can use RCDFMT (Record Format) to specify which record format to use. This can be *ALL or the name of a specific format. I believe that if I were to copy only a specific format, I would probably copy from the underlying physical file instead.
I have rarely used ERRLVL (Error Level). The book says that this "specifies the maximum number of recoverable read or write errors...tolerated." This can be used to drop records with duplicate keys when copying from a file that allows duplicate keys to a file that requires unique keys. If you use ERRLVL(*NOMAX), only the first record with a duplicate key will be copied to the TOFILE.
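The duplicate-key technique might be sketched like this. The file names are hypothetical; CUSTUNIQ is assumed to be a keyed physical file that requires unique keys.

```cl
/* Hypothetical names: MYLIB, CUSTDUPS, CUSTUNIQ              */
/* Assumes CUSTUNIQ is defined to require unique keys         */

/* Tolerate any number of duplicate-key errors, so only the */
/* first record for each key value lands in CUSTUNIQ        */
CPYF FROMFILE(MYLIB/CUSTDUPS) TOFILE(MYLIB/CUSTUNIQ) MBROPT(*REPLACE) +
     ERRLVL(*NOMAX)
```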
If you are using sequential or direct files, and it is important to you that the records remain in the same sequence, then you will want to pay attention to the COMPRESS parameter. The default for this is COMPRESS(*YES), meaning the deleted records are not included in the copy, so the records that follow them move up to lower relative record numbers. Specify COMPRESS(*NO) to copy the deleted records as well, which keeps every record at its original position. By "deleted records," we mean records that the operating system has deleted, not those in which you put a delete code. In RPG, you can delete records with the DELET opcode, or with the DEL operation on the O-specs. If you simply want to remove deleted records from a file without copying, use the RGZPFM (Reorganize Physical File Member) command. You may want to pay attention to your highly volatile files, since the space taken up by the deleted records is not released until the file is reorganized.
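A sketch of the two operations discussed above follows. The file names are hypothetical. COMPRESS(*NO) is the alternative to the default: it carries the deleted records along in the copy so that relative record numbers do not shift.

```cl
/* Hypothetical names: MYLIB, ORDDIRECT, ORDCOPY */

/* Copy deleted records too, so each record keeps */
/* its relative record number in the new file     */
CPYF FROMFILE(MYLIB/ORDDIRECT) TOFILE(MYLIB/ORDCOPY) MBROPT(*REPLACE) +
     COMPRESS(*NO)

/* Remove deleted records in place and release their space */
RGZPFM FILE(MYLIB/ORDDIRECT)
```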
There are other forms of the CPYF command that you may have occasion to use. For the most part, these commands provide the same function as plain CPYF, but they are specific to the situation.
For example, there are CPYFRMDKT (Copy from Diskette) and CPYTODKT (Copy to Diskette) commands; also CPYFRMTAP (Copy from Tape) and CPYTOTAP (Copy to Tape). Compared to CPYF, these have far fewer parameters. What you give up are the selection sets used in CPYF, so if you need to selectively copy, you will have to use CPYF itself, or CPYF first to a work file. The problem with using plain CPYF to copy to a diskette or tape is that you may have to use an OVRDKTF (Override Diskette File) or OVRTAPF (Override Tape File) command prior to the CPYF. The device copy commands let you specify most of the relevant device parameters as part of the copy command.
The tape copy commands include all of the options used with tape files, such as label type, record and block length, and so on. I won't go into that here, since I must figure out each parameter every time I use these types of copies. Bear in mind that if your object is to make backups of files, it is simpler to use the SAVLIB (Save Library) or SAVOBJ (Save Object) commands.
The CPYSPLF (Copy Spooled File) command is similar to the S/36 version. You can copy to a data base file, which you would then presumably manipulate in some program.
Finally, there is CPYSRCF (Copy Source File), which takes the place of the S/36 LIBRLIBR (at least as far as copying source is concerned). This command includes the FROMMBR/TOMBR options described above, so it can be used when you need to manipulate more than one source member. To copy compiled programs, you use the CRTDUPOBJ (Create Duplicate Object) command.
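For illustration, moving a group of source members and one compiled program between libraries might look like the sketch below. The library names (DEVLIB, PRODLIB), the AR* member prefix and the program name AR100 are all hypothetical.

```cl
/* Hypothetical names: DEVLIB, PRODLIB, AR* members, program AR100 */

/* Copy every source member whose name begins with AR */
/* to a like-named member in the production library   */
CPYSRCF FROMFILE(DEVLIB/QRPGSRC) TOFILE(PRODLIB/QRPGSRC) +
        FROMMBR(AR*) TOMBR(*FROMMBR) MBROPT(*REPLACE)

/* Duplicate the compiled program object */
CRTDUPOBJ OBJ(AR100) FROMLIB(DEVLIB) OBJTYPE(*PGM) TOLIB(PRODLIB)
```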
By now, you have gotten the idea that CPYF and its variants cover quite a bit more than their S/36 equivalents. Copying data files is easier, since you can perform practically all of the S/36 equivalents with one command. The various library manipulation commands are also included: rather than FROMLIBR, TOLIBR, and LIBRLIBR, you have options of CPYF. CPYF can also be used in place of LISTDATA and LISTLIBR. So get out the CL Reference manual and read through the section on CPYF. There are several examples given for the command, and you should work through many examples on your own data.