New Dog with an Old Trick?



Guest.Visitor
01-01-1995, 02:00 AM
Maybe simplicity isn't the best thing, but there is a function that I need and miss terribly from the S/36 OCL days. Does anyone remember the IF ACTIVE- command? It was a nice little feature that told you whether a particular procedure was still active (running), giving you the option to run or skip certain job steps based on the completion of the active job. I believe this feature is needed on the AS/400, but the debate has been over how to implement it.
Creating a data area in the CL that calls a procedure, then testing for that data area in subsequent procedure calls, is a nice approach, but what if the job terminates abnormally? There's no way to find out whether that's a TRUE lockout or not. What if you don't have access to the calling CL, just the object code, but still have to 'WAIT' on the job to complete before you perform another job? Schedule the job, you say? That one answers itself: you still have no clue as to whether the job completed. A message could be waiting, a system resource conflict could delay completion, etc.
Why couldn't the CHKOBJ command be expanded to incorporate this? The object type could be *ACTJOB, with a return message ID that indicates whether the job is ACTIVE or not. Maybe there's more to it than just thinking simply, but sometimes it would be nice to teach a NEW DOG an OLD TRICK.
Rick
Any comments are welcome, but a solution that would help me bypass this missing command would be worth the Kudos!!

Guest.Visitor
10-04-2000, 08:12 AM
I have the source code for a command to do this on my web page. Go to ... http://www.geocities.com/SiliconValley/Pines/9037 Take the "AS/400 Utilities" link and look for the ZRTVJOBSTS command.

flensburg@novasol.dk
10-04-2000, 08:19 AM
Hello Rick, Here's one way to do it: http://archive.midrange.com/midrange-l/199911/msg01701.html Best regards, Carsten Flensburg

Guest.Visitor
10-04-2000, 09:53 AM
One thing you can do with the data area: rather than checking for its existence or its contents, have the running program lock the object (the data area) exclusively. Any other program can then try to allocate the same object; if the allocate fails, the program is still running. If the program fails or is terminated abnormally, the lock is released, which shows that the program is no longer running.
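
The idea in the post above can be sketched outside of CL. The following is only an analogy, not AS/400 code: it uses a POSIX advisory file lock (Python's `fcntl.flock`, available on Unix-like systems only) as a stand-in for an exclusive lock on a data area. The helper names `try_exclusive` and `release` and the lock file name are hypothetical. The point it illustrates is that the lock, unlike a flag value, disappears when the holder goes away.

```python
import fcntl
import os
import tempfile


def try_exclusive(path):
    """Try to take an exclusive, non-blocking lock on the file at path.

    Returns the open file descriptor on success, or None if some other
    holder already has the lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def release(fd):
    """Closing the descriptor drops the lock, as process exit would."""
    os.close(fd)


# Hypothetical lock object standing in for the data area.
lock_path = os.path.join(tempfile.mkdtemp(), "nightly_job.lock")

holder = try_exclusive(lock_path)   # the "running program" takes the lock
probe = try_exclusive(lock_path)    # a second caller cannot get it
print(probe is None)                # the job is still running

release(holder)                     # the job ends, normally or not
probe = try_exclusive(lock_path)
print(probe is not None)            # the lock is free again
release(probe)
```

Because the operating system drops the lock when the holding process ends, no cleanup step is needed after a crash, which is the property the data-area flag lacks.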

Guest.Visitor
10-05-2000, 05:57 AM
Thanks for the help, folks. I still like the CHKOBJ idea with objtype(*ACTJOB). Wonder if that's possible? Definitely would make things much simpler... Oh well, I can dream... can't I?? Rick

D.Handy
10-05-2000, 08:53 AM
Rick,
"I still like the CHKOBJ idea with objtype(*ACTJOB). Wonder if that's possible?"
With all due respect, it is not possible to do that reliably. It would work *most* of the time, but would have an inherent window of exposure.
Especially on large systems, scanning the Work Control Block Table(s) is not necessarily a fast operation (in computer terms). This is especially true for systems which have needed to spawn additional WCBTs or when the job tables have not been compressed recently. Ever notice how, over time, the initiation of WRKACTJOB or similar displays seems to get slower? This is the result of the WCBTs not being compressed. On V4Rx something, the commands DSPJOBTBL and CPRJOBTBL were added to let you monitor and correct this situation. On V3Rx, you could create a data area QWCBTCMPTB with a value of '1', and then at IPL time the system would compress the job tables. (This may have been added by PTFs.) Compressing the job tables makes WRKACTJOB initiate much faster again. On V3Rx, I recommend just leaving the data area set to '1' (setting it to '0' or deleting the data area makes it skip the compress during IPL). On large systems with many hundreds of active jobs, scanning is inherently relatively slow anyway, like a smaller system with uncompressed WCBTs.
Having something like CHKOBJ or IF ACTIVE which then conditions other code does not guarantee that the result is even accurate by the time the command finishes, let alone by the time you use it to condition other code. I suspect this is why IBM never supplied the direct equivalent to IF ACTIVE -- it may work 99+% of the time, but what about the exceptions?
The good news is that it's not hard to do what you want anyway, in a 100% reliable fashion. Look closer at what Greg and Carsten suggested. By using ALCOBJ instead of CHKOBJ you can guarantee only one job will be possible. The basic idea is to attempt an *EXCL lock. If it fails, another job is running. If it doesn't fail, you have the lock and there is no window of exposure where another job could concurrently reach the same condition.
That leaves just a couple of issues. Which object to lock? There is no reason to use a data area since you don't need to store anything in it. So instead of creating an extra object (e.g. a *DTAARA), just lock a program or display file used by the process.
The second issue, which Carsten addresses, is more subtle: if you use save while active, an *EXCL lock will prohibit the object from being backed up. Yet an *EXCL lock is the only one you can use to test for concurrent use. Look at Carsten's code. It works around this by attempting an *EXCL lock. If it does get it, it reduces it to a *SHRRD lock. This will let a concurrent save while active work, while still prohibiting another concurrent job, since that job would fail the initial *EXCL test.
If you don't care about holding the *EXCL lock, all you need is the first ALCOBJ and the final DLCOBJ. This is barely harder than your proposed CHKOBJ, and has the virtue of simultaneously setting the lock to avoid the window of exposure I talked about above.
Doug
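
The "test with *EXCL, then hold only *SHRRD" sequence described above can be traced with a toy model. This is a sketch in Python, not Carsten's code and not a model of the real OS/400 lock manager: the `LockTable` class, the job names, and the object name are all hypothetical, only the *EXCL and *SHRRD states are modelled, and it assumes (as the post implies) that a job's own locks never conflict with each other and that a save only needs a shared read lock.

```python
class LockTable:
    """Toy model of object locks; only *EXCL and *SHRRD are modelled."""

    def __init__(self):
        self.held = []  # list of (job, obj, state) tuples

    def allocate(self, job, obj, state):
        """Grant the lock unless ANOTHER job holds a conflicting one."""
        for other_job, other_obj, other_state in self.held:
            if other_obj != obj or other_job == job:
                continue  # different object, or our own lock
            # *EXCL conflicts with everything; *SHRRD only with *EXCL.
            if state == "*EXCL" or other_state == "*EXCL":
                return False
        self.held.append((job, obj, state))
        return True

    def deallocate(self, job, obj, state):
        self.held.remove((job, obj, state))


def start_single_instance(locks, job, obj):
    """Test with *EXCL, then keep only *SHRRD for the rest of the run."""
    if not locks.allocate(job, obj, "*EXCL"):
        return False                     # another job holds some lock
    locks.allocate(job, obj, "*SHRRD")   # same job, so never conflicts
    locks.deallocate(job, obj, "*EXCL")  # let shared readers back in
    return True


locks = LockTable()
print(start_single_instance(locks, "JOB_A", "MYPGM"))  # True
print(start_single_instance(locks, "JOB_B", "MYPGM"))  # False
print(locks.allocate("SAVE", "MYPGM", "*SHRRD"))       # True
```

The second job is refused because its *EXCL request conflicts with the *SHRRD lock the first job kept, while the shared request from the save still succeeds.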

Guest.Visitor
10-05-2000, 11:13 AM
Doug, I always use the data area precisely because of the issues with the backups: the data area holds no data, so it does not need to be backed up, plus my initial program would create it if it did not exist (if for some reason it got deleted). I tried locking other objects but ran into the same issues you described, plus others as well.

Guest.Visitor
10-05-2000, 05:32 PM
Why would you use these methods rather than the retrieve job APIs? Aren't you really checking to see if a PROGRAM is being used, not whether or not a particular JOB is active?

D.Handy
10-05-2000, 09:12 PM
Herb,
"Why would you use these methods rather than the retrieve job APIs? Aren't you really checking to see if a PROGRAM is being used, not whether or not a particular JOB is active?"
I presumed the business need was to prevent some process from running more than one instance concurrently. I can think of three basic ways of trying to accomplish this end:
1. Set a flag condition in a data area or physical file
2. Use an object lock to control concurrent instances
3. Use the job APIs to see if another instance is active
Each has its relative advantages and disadvantages. The first is the only way which will survive an abnormal termination. That could be an advantage, and when needed for restart considerations it is the only way to go. Note that it can be used in conjunction with either of the other methods to determine whether you are in a recovery or a normal initiation mode, and take the proper action.
When previous abnormal terminations do not need to be detected, I'd prefer either of the other two options listed above. The reasons I prefer the object lock method to the job API method are:
- Guaranteed accuracy. An *EXCL lock can only be held by one job at a time, regardless of the size or performance of the system.
- Works equally well for batch or interactive jobs
- Very fast to process
The potential problems with the job API method, as I see it, are:
- Possible to not be accurate 100% of the time
- Works better for testing a batch process than an interactive one
- Can be slower than simply checking for an object lock (and no, I don't have benchmarks)
The possible inaccuracy arises from two users trying to start the job at nearly the same time. As the number of users grows, you have the double jeopardy of increasing the chance two people will try it at the same time, plus the fact that the test itself can take longer on systems with larger work control block tables. What seemingly works well in testing may, on rare occasion, not work in practice, especially if you have hundreds of users.
For a batch process with a unique jobname, it is relatively easy for the system to check the WCBTs for a match, since it keeps an index by jobname (and another by user). But what if it is an interactive process, or the job does not have a unique name to test? It is still possible with the job APIs, but is more effort than a simple object lock mechanism.
The object lock method has the virtue of automatically releasing on abnormal terminations (a potential disadvantage), plus it is very easy to implement. If you don't need save while active, it is very nearly as easy as IF ACTIVE was in OCL.
But the job API method is not without advantages too. It can return info on the job if desired, such as who started it or when. And it can detect a job waiting on a job queue, without the need to set a DTAARA or PF flag.
So it all depends on what you are trying to do. To tell you the truth, I most often use the first method because it lasts across abnormal terminations. I have a set of routines with command wrappers I use to store and retrieve the status. It lets me automate recovery steps, if any, and also lets me track more than a simple active/inactive status. I actually use a physical file instead of a data area, and log stuff like user-id, device, timestamp, completion codes, etc. (I log both the start and the stop in the file.) For multi-step things, it lets me easily control the sequence and resume at the right point after handling any recovery steps needed. For each process, it also automatically keeps an archive of the last x number of times the job was run, by keeping a rolling history of the physical files used to store the job status data. This makes an easy way for me to verify claims by users of what they did when, just in case they don't remember correctly. <g>
But to simply prohibit concurrent execution of some task without recovery considerations, I think using an object lock is the way to go. Isn't that what object locks are for? They are designed for controlling the allocation of a resource, in this case some task or job. The question arises as to which object to lock -- and Greg and Carsten offered some considerations on that.
Doug
PS - Just went back to check Rick's original post to see if he mentioned whether it was a batch or interactive process. Rick's need is apparently testing for the active status of a CL program for which he doesn't have the source, and presumably can't use RTVCLSRC. So in this case, it does appear using the job APIs may have some merit, unless he wants to wrap the execution of the CLP with another CLP using ALCOBJ/DLCOBJ logic.
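
The first method, a persistent start/stop log that survives abnormal termination, might look roughly like the following. This is a minimal sketch in Python rather than Doug's actual routines, which are not shown in the thread: a JSON-lines file stands in for the physical file, and the function names, field names, process name, and user name are all hypothetical.

```python
import json
import os
import tempfile
import time


def log_event(path, process, event, user):
    """Append one status record (START or STOP) to the log file."""
    record = {"process": process, "event": event,
              "user": user, "ts": time.time()}
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def last_event(path, process):
    """Return the most recent event logged for process, or None."""
    if not os.path.exists(path):
        return None
    last = None
    with open(path, encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            if record["process"] == process:
                last = record["event"]
    return last


def startup_mode(path, process):
    """'recovery' if the previous run logged START but never STOP."""
    return "recovery" if last_event(path, process) == "START" else "normal"


status_file = os.path.join(tempfile.mkdtemp(), "jobstatus.log")
print(startup_mode(status_file, "MONTHEND"))   # normal (no history yet)
log_event(status_file, "MONTHEND", "START", "RICK")
# ... the job dies here without ever logging STOP ...
print(startup_mode(status_file, "MONTHEND"))   # recovery
```

On its own, a START with no STOP cannot distinguish "crashed" from "still running", which is why the post suggests combining the flag with one of the other two methods.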

Guest.Visitor
10-06-2000, 07:54 AM
Good points, Doug. If I remember correctly (it's been a while), IF-ACTIVE checks whether or not a PROC is active. This equates more to a job than a program, and that's usually what I'm checking for in my world. I have to challenge the "guaranteed accuracy" statement about the object lock method. Just because there's a lock on the program, it does not mean the program is ACTIVE. Granted, it's LIKELY that's what's going on, but there could be other reasons.

D.Handy
10-06-2000, 08:46 AM
Herb,
"I have to challenge the "guaranteed accuracy" statement about the object lock method. Just because there's a lock on the program it does not mean the program is ACTIVE. Granted, it's LIKELY that's what's going on put there could be other reasons."
The program object lock was in the context of explicitly placing your own lock on it for the desired duration, as the means to accomplish the pseudo-IF ACTIVE checking we were discussing. OS/400 does not actually even lock a *PGM object while it is in use (however, it does lock a DSPF with *SHRRD). That is why a *REPLACE can move the old program to QRPLOBJ even while it is active. So in terms of the lock, it is only there because *I* put it there intentionally.
The "guaranteed accuracy" was in the context of how *EXCL object locks work. It is inherently impossible for two jobs to both get one. It is theoretically possible for the timing to be such that two jobs both test for a job and find it not active using the other method. It may not be likely, but it is possible. Plenty of complications in coding are there to deal with exceptions. In this case, though, it doesn't seem that much harder to just allocate whatever object, be it a data area or the program itself.
Granted, the inability to get an *EXCL lock does not necessarily mean the program is running. For example, a save while active may have a temporary *SHRRD lock on the object. But the technique does guarantee two copies can't be started concurrently.
Doug
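
The timing window Doug describes can be shown with a deterministic interleaving. This is an illustrative Python sketch, not a model of the job APIs: the list `running`, the function `looks_idle`, and the job names are hypothetical, and a `threading.Lock` stands in for the *EXCL object lock.

```python
import threading

# Check-then-act: the test and the start are two separate steps.
running = []                  # jobs that believe they own the process


def looks_idle():
    """Stand-in for 'scan the job table and find no active instance'."""
    return len(running) == 0


a_ok = looks_idle()           # job A checks: nothing active
b_ok = looks_idle()           # job B checks before A has started
if a_ok:
    running.append("A")
if b_ok:
    running.append("B")
print(running)                # ['A', 'B'] -> two concurrent instances

# Test-and-set: acquiring the lock IS the test, so there is no gap.
gate = threading.Lock()
winners = [job for job in ("A", "B") if gate.acquire(blocking=False)]
print(winners)                # ['A'] -> only one job gets through
```

In the first half both jobs pass the check because neither has registered itself yet; in the second half the check and the claim are one indivisible operation, so the second attempt fails no matter how close together the two arrive.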