Find characters that don't belong!
I recently received the following from Wayne R.: "I use RPG's %CHECK function to check if any characters are not valid. Is there something similar in CL?" RPG's %Check built-in, if you're not familiar with it, returns the first position in a variable that contains a character that is not in a list of valid characters. If, for instance, you had a variable with a value of 'cabbage', then checking it against a list of valid characters such as 'abc' would return a position value of 6, because the sixth character (the "g") is not in the list of valid characters ('abc'). If all characters in the variable are found in the valid character list, %Check returns a position value of 0. While CL currently does not have a built-in such as RPG's %Check, it's not that difficult to create a CL command that provides this function on IBM i, and that's what we'll do in today's article.
The CHKCHR Command
The command we will create is Check Characters (CHKCHR), and this is the source for the command:
Cmd Prompt('Check Characters')
Parm Kwd(Comparator) Type(*Char) Len(192) Min(1)
Parm Kwd(Base) Type(*Char) Len(1024) Min(1) +
Prompt('String to be checked')
Parm Kwd(Pos) Type(*UInt4) RtnVal(*Yes) Min(1) +
Prompt('Pos of first invalid character')
The CHKCHR command is loosely patterned after the RPG %Check built-in. The first parameter, Comparator, is a character variable set to the characters that represent "valid" characters; it maps to the first parameter of the %Check built-in. I rather arbitrarily defined this parameter with a maximum length of 192 bytes as that is the largest number of displayable characters available with a single-byte EBCDIC CCSID. You can certainly increase or decrease this size if you so desire.
The second parameter, Base, is the character variable that is to be tested for any characters that are not found in the Comparator value; it maps to the second parameter of the %Check built-in. As with Comparator, the declared length of 1024 is totally arbitrary. You can increase or decrease this size to whatever length is appropriate for your use of the command.
The third parameter, Pos, is a return variable and will identify the first character found in Base that does not exist in Comparator. As with the %Check built-in, if all characters of Base are found in Comparator, then Pos will be set to 0. Otherwise, Pos will identify the first position of Base not in Comparator.
The %Check built-in does provide one parameter that is not being implemented today—namely, a starting position within Base. The %Check built-in defaults to a starting position of 1, which is where our CHKCHR command as currently written will always start, but you can specify other values with the RPG built-in. The ability to specify a user-selected starting position with the CHKCHR command is not difficult to do and can be demonstrated in a future article if there is interest in such a capability. For now, I'm ignoring this starting position capability in order to keep the discussion of the CPP straightforward.
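To give a feel for what a starting position would involve, here is a minimal C sketch built on the same strspn and strlen run-time functions that the CPP below relies on. It is not part of the CHKCHR command. The function name check_chars_from and the choice to return -1 for an out-of-range start are assumptions made purely for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: like %Check, but begins scanning at a 1-based
 * starting position. Returns 0 when every character from start onward
 * is valid, otherwise the 1-based position of the first invalid one.
 * Returning -1 for an out-of-range start is an arbitrary choice made
 * for this sketch. */
long check_chars_from(const char *valid, const char *base, size_t start)
{
    size_t len = strlen(base);

    if (start < 1 || start > len)
        return -1;

    /* Span of valid characters, measured from the starting offset. */
    size_t span = strspn(base + (start - 1), valid);

    /* Zero-based index of the first invalid character, if any. */
    size_t idx = (start - 1) + span;

    return (idx == len) ? 0 : (long)(idx + 1);
}
```

With a start of 1 this behaves like the command described in this article; a larger start simply skips the leading characters before the scan begins.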
Assuming that the previous command source is stored in member CHKCHR of source file QCMDSRC, you can create CHKCHR with the following command:
CrtCmd Cmd(ChkChr) Pgm(ChkChrCPP) Allow(*BPgm *IPgm *BMod *IMod)
This CRTCMD indicates that the command processing program (CPP) of CHKCHR is program CHKCHRCPP and that the command can be run only from within a program or a module. Because Pos is a return value, you cannot run the CHKCHR command interactively from a command line.
The CHKCHRCPP Command Processing Program
Below is the source for our CPP, ChkChrCPP.
Pgm Parm(&Comparator &Base &Pos)
Dcl Var(&Comparator) Type(*Char) Len(192)
Dcl Var(&Base) Type(*Char) Len(1024)
Dcl Var(&Pos) Type(*UInt)
Dcl Var(&Null_Comp) Type(*Char) Len(193)
Dcl Var(&Null_Base) Type(*Char) Len(1025)
Dcl Var(&Null_Char) Type(*Char) Len(1) Value(x'00')
Dcl Var(&Len) Type(*UInt)
ChgVar Var(&Null_Comp) Value(&Comparator *TCat +
&Null_Char)
ChgVar Var(&Null_Base) Value(&Base *TCat &Null_Char)
CallPrc Prc('strspn') Parm((&Null_Base) (&Null_Comp)) +
RtnVal(&Pos)
CallPrc Prc('strlen') Parm((&Null_Base)) RtnVal(&Len)
If Cond(&Pos = &Len) Then( +
ChgVar Var(&Pos) Value(0))
Else Cmd(ChgVar Var(&Pos) Value(&Pos + 1))
EndPgm
Assuming that the preceding source is stored in member CHKCHRCPP of source file QCLSRC, you can, if your system is V6R1 or higher, use the following command to create the CHKCHRCPP program:
CrtBndCL Pgm(ChkChrCPP)
If your system is V5R4 or earlier, you will need to use the following two commands to create the CHKCHRCPP program:
CrtCLMod Module(ChkChrCPP)
CrtPgm Pgm(ChkChrCPP) BndDir(QC2LE)
If you're not interested in how CHKCHRCPP works, you can now jump to the section titled "Testing the CHKCHR Command." Otherwise, just continue reading.
The CHKCHRCPP program declares three parameters (&Comparator, &Base, and &Pos), which correspond, respectively, to the three keywords of the CHKCHR command. The program also declares four internal variables (&Null_Comp, &Null_Base, &Null_Char, and &Len), which are used in conjunction with two system APIs that are called as part of CHKCHRCPP processing.
These APIs, which are standard in the i operating system and considered part of the C language run-time, are strspn and strlen. The strspn API returns the length of the initial portion (or span) of the Base string that consists only of characters found in Comparator. The API is documented in IBM's ILE C/C++ Runtime Library Functions manual and, like most C run-time APIs, does not provide an explicit parameter indicating the length of the Base variable. Instead, the API determines the length of each parameter by finding a null byte (a byte with a value of x'00') within the variable. The null byte indicates the end of the character string.
To add this null byte to the Base and Comparator values, the program sets the variables &Null_Comp and &Null_Base to the values of &Comparator and &Base, respectively, with trailing blanks truncated and &Null_Char concatenated to the end. The key point about &Null_Comp and &Null_Base is that they are declared 1 byte larger than the corresponding non-null-terminated variables &Comparator and &Base. This additional byte ensures that there is always room for the ending null byte.
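As a rough illustration of what the *TCat step accomplishes, the following C sketch copies a fixed-length, blank-padded field into a buffer that is 1 byte larger, drops the trailing blanks, and appends the null byte. The function name tcat_null is hypothetical and exists only for this example; the CL program does the equivalent work with ChgVar and *TCat.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper mimicking "field *TCat x'00'".
 * field     : fixed-length, blank-padded data (not null-terminated)
 * field_len : declared length of the field
 * dst       : must have room for field_len + 1 bytes
 * Returns the length of the data once trailing blanks are removed. */
size_t tcat_null(char *dst, const char *field, size_t field_len)
{
    size_t n = field_len;

    /* Back up over trailing blanks, as *TCat does. */
    while (n > 0 && field[n - 1] == ' ')
        n--;

    memcpy(dst, field, n);
    dst[n] = '\0';   /* the extra byte guarantees this always fits */
    return n;
}
```

Even when the field is completely full (no trailing blanks at all), the extra byte in dst leaves room for the terminator, which is the reason the CL variables are declared with lengths of 193 and 1025.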
The strspn API is then called, passing these null-terminated values, and the API returns &Pos. At this point, &Pos represents the number of leading character positions in the base argument whose characters are all found in the comparator argument. So a &Pos value of 0 indicates that the very first character of the base string was not in the comparator; a value of 3 indicates that the first three characters of the base string were in the comparator but the fourth was not (assuming the base argument is more than 3 bytes long); and a value equal to the length of the base argument means that all characters in the base string were found in the comparator. These returned values are not the same as the position values returned by the RPG built-in %Check, but we'll be making the necessary adjustments shortly.
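To make the returned span concrete, here is a small hand-written C equivalent of what strspn computes. The name span_len is an assumption for this sketch; in practice you would simply call strspn, as the CPP does.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative stand-in for strspn: counts how many leading characters
 * of base are found in valid, stopping at the first one that is not
 * (or at the terminating null byte). */
size_t span_len(const char *base, const char *valid)
{
    size_t n = 0;

    while (base[n] != '\0' && strchr(valid, base[n]) != NULL)
        n++;

    return n;
}
```

For a base of 'cabbage' and a comparator of 'abc', the span is 5 (the characters c, a, b, b, a); for a base that starts with an invalid character, the span is 0; and for a base made up entirely of valid characters, the span equals the length of the base.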
To determine whether or not invalid characters were encountered by the strspn API, we now need to determine the length of the base argument. To do this, the strlen API is called. The strlen API, also documented in IBM's ILE C/C++ Runtime Library Functions manual, returns the length of a character string, excluding the ending null byte (&Null_Char). The strlen API is called, passing the null-terminated variable &Null_Base, and the API returns &Len. &Len is now the length of the base argument, not including the null byte terminator.
CHKCHRCPP then compares the values of &Pos and &Len. If they are equal, then no "invalid" characters were encountered and the program returns the value of 0, compatible with the %Check built-in. If &Pos and &Len are not equal, then the program adds 1 to &Pos and returns this value. As &Pos prior to the addition represents the number of initial characters in &Base that were "valid," adding 1 identifies the first character that was not "valid."
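The adjustment just described is small enough to express as a single C function. This is only a sketch of the If/Else logic in the CPP; the name span_to_pos is hypothetical.

```c
#include <stddef.h>

/* Converts a strspn-style span into a %Check-style position.
 * span : number of leading valid characters (what strspn returned)
 * len  : length of the base string (what strlen returned)
 * Returns 0 when the whole string was valid, otherwise the 1-based
 * position of the first invalid character. */
unsigned span_to_pos(size_t span, size_t len)
{
    if (span == len)
        return 0u;               /* every character was valid */

    return (unsigned)(span + 1); /* first character past the valid span */
}
```

For the 'cabbage' example, strspn reports a span of 5 and strlen a length of 7, so the returned position is 6.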
Testing the CHKCHR Command
As the CHKCHR command cannot be run from the command line (due to the return parameter &Pos), we will write a simple CL program (UseChkChr) that can be called from the command line. The program will accept two parameters—the comparator string and the base string—and display a message related to the results of the CHKCHR command that is run. This is the source for USECHKCHR:
Pgm Parm(&Comp_In &Base_In)
Dcl Var(&Comp_In) Type(*Char) Len(32)
Dcl Var(&Base_In) Type(*Char) Len(32)
Dcl Var(&Pos) Type(*UInt)
Dcl Var(&Pos_Char) Type(*Char) Len(5)
ChkChr Comparator(&Comp_In) Base(&Base_In) Pos(&Pos)
If Cond(&Pos *NE 0) Then(Do)
ChgVar Var(&Pos_Char) Value(&Pos)
SndPgmMsg Msg('Invalid character ' *Cat +
%sst(&Base_In &Pos 1) *Cat +
' at ' *Cat +
&Pos_Char)
EndDo
Else Cmd(SndPgmMsg Msg('Everything is OK'))
EndPgm
Assuming that the preceding source is stored in member USECHKCHR of source file QCLSRC, you can use the following command to create the USECHKCHR program:
CrtBndCL Pgm(UseChkChr)
From the command line, we can now test a few scenarios.
Entering the command Call UseChkChr ('abc' 'cabbage') will result in the message "Invalid character g at 00006" as the sixth character of 'cabbage' (the "g") is not in the list of valid values ('abc').
Entering the command Call UseChkChr ('abc' 'a cabbage') will result in the message "Invalid character   at 00002" (the invalid character shown in the message is itself a blank) as the second character of 'a cabbage' (the blank space) is not in the list of valid values ('abc').
Entering the command Call UseChkChr (' abc' 'a cabbage') will result in the message "Invalid character g at 00008" as the eighth character of 'a cabbage' (the "g") is not in the list of valid values (' abc') where a blank precedes the 'a'. Note that, due to the use of *TCAT when concatenating &Null_Char to &Null_Comp, we do not want the blank character to be the last character of valid characters (as it will be removed during the concatenation operation).
Entering the command Call UseChkChr (' abcge' 'a cabbage') will result in the message "Everything is OK" as all characters in the string 'a cabbage' are now found in the comparator string.
Entering the command Call UseChkChr ('0123456789' '1,234.50') will result in the message "Invalid character , at 00002" as the second character of '1,234.50' (the comma) is not in the list of valid values ('0123456789').
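The scenarios above, including the trailing-blank caveat, can be reproduced off the system with a short C sketch that follows the same steps as the CPP. The names chkchr and trim_copy are assumptions for this example, and trimming trailing blanks stands in for the effect of *TCat on blank-padded CL variables; this is an approximation of the CL program, not a replacement for it.

```c
#include <stddef.h>
#include <string.h>

#define COMP_MAX 192   /* declared length of Comparator */
#define BASE_MAX 1024  /* declared length of Base */

/* Copies src into dst without trailing blanks, truncating to max bytes.
 * dst must have room for max + 1 bytes. */
static void trim_copy(char *dst, const char *src, size_t max)
{
    size_t n = strlen(src);

    if (n > max)
        n = max;
    while (n > 0 && src[n - 1] == ' ')
        n--;

    memcpy(dst, src, n);
    dst[n] = '\0';
}

/* Sketch of the CPP flow: null-terminate, strspn, strlen, adjust. */
unsigned chkchr(const char *comparator, const char *base)
{
    char null_comp[COMP_MAX + 1];
    char null_base[BASE_MAX + 1];

    trim_copy(null_comp, comparator, COMP_MAX);
    trim_copy(null_base, base, BASE_MAX);

    size_t pos = strspn(null_base, null_comp);
    size_t len = strlen(null_base);

    return (pos == len) ? 0u : (unsigned)(pos + 1);
}
```

Note that a comparator of 'abc ' (with the blank last) behaves exactly like 'abc' in this sketch, because the trailing blank is trimmed before the comparison. That is why the blank needs to appear somewhere other than the end of the list of valid characters.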
More CL Questions?
Wondering how to accomplish a function in CL? Send your CL-related questions to me at