What happens when multiple jobs compete for resources? It's all about the routing.
Things don't always go according to plan on IBM i. Or do they? The instinct for administrators and IT managers is to hunt down a culprit (a rogue job, an inactive journal receiver, or something else), but sometimes the trouble lies in the very building blocks of a common process, or rather in the definitions that control how that process runs.
In such situations, these core definitions are often the last place anyone looks for the source of an issue, because they are not problematic in themselves but can become so in certain circumstances. The class (CLS) object type is a good example of a small definition that packs a big punch when it comes to the cause and effect of problem jobs in an IBM i environment.
What is CLS? Essentially, a CLS contains the parameters that control the running of a routing step. These parameters dictate the "what happens next" process for jobs. Breaking that process down step by step, we can see the cause and effect in motion: When you submit a job via SBMJOB > the RTGDTA parameter supplies the routing data > which is matched against a subsystem routing entry > which in turn identifies the program that the job will run and the class it will run under.
Many IBM-supplied subsystems ship with a "catch all" *ANY routing entry. For this example, we'll use the QBATCH subsystem and its "catch all" entry.
In the example, if a SBMJOB is run with any routing data other than the three compare values shown (QIGC, QS36EVOKE, and QCMD38), then the job will be matched against routing entry 9999, and the entry will be used to supply various run-time values when the job executes.
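The matching described above can be sketched in a few lines of Python. This is only an illustrative model, not how the operating system implements it: it assumes entries are checked in ascending sequence-number order and that a compare value matches when the routing data starts with it. Only entry 9999, its *ANY compare value, and class QBATCH come from the example; the other sequence numbers and the class names CLS_A, CLS_B, and CLS_C are hypothetical placeholders, as is the routing data QCMDB used in the usage note.

```python
from typing import List, NamedTuple, Optional


class RoutingEntry(NamedTuple):
    seq_nbr: int        # entries are checked in ascending sequence-number order
    compare_value: str  # "*ANY" matches any routing data
    cls: str            # class named on the routing entry


# Only 9999 / *ANY / QBATCH is taken from the article. The other sequence
# numbers and class names are hypothetical placeholders for illustration.
QBATCH_ENTRIES = [
    RoutingEntry(9999, "*ANY", "QBATCH"),
    RoutingEntry(10, "QIGC", "CLS_A"),
    RoutingEntry(20, "QS36EVOKE", "CLS_B"),
    RoutingEntry(30, "QCMD38", "CLS_C"),
]


def select_routing_entry(
    routing_data: str, entries: List[RoutingEntry]
) -> Optional[RoutingEntry]:
    """Return the first entry, by sequence number, that matches the routing data."""
    for entry in sorted(entries, key=lambda e: e.seq_nbr):
        # Assumption: a compare value matches when the routing data starts with it.
        if entry.compare_value == "*ANY" or routing_data.startswith(entry.compare_value):
            return entry
    return None  # no entry matched (cannot happen when a *ANY entry exists)
```

Calling `select_routing_entry("QCMDB", QBATCH_ENTRIES)` falls through the three specific compare values and returns entry 9999, so the job would pick up class QBATCH.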
So, going back to our SBMJOB command: The job queue directs the job to run in a particular subsystem, as shown below:
As the next screenshot shows, the routing data directs the job to the required subsystem routing entry. But why is this important?
The reason is class: The subsystem routing entry contains the class that will be used when this entry is selected. In our example, the routing entry of 9999 will select class QBATCH.
Class Values: In our examples, we've been looking at class QBATCH. The class object contains a number of values that can in themselves determine what happens next and, further, what happens when multiple jobs compete for resources. A closer examination of the most important class values demonstrates why.
Run priority: This is the run priority that jobs will use when running under this class. (Now you know why most batch jobs run at priority 50!) The lower the number, the higher the priority: a job with a run priority of 20 will be treated with more importance than one with 50 if both are competing for resources at the same time.
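As a rough illustration of the ordering, the Python sketch below ranks competing jobs by run priority, with the lower number winning. It is a simplification under stated assumptions: the real dispatcher also weighs time slices, storage, and activity levels, and the job names here are hypothetical.

```python
from typing import List, NamedTuple


class Job(NamedTuple):
    name: str          # hypothetical job name, for illustration only
    run_priority: int  # lower number means the job is treated as more important


def dispatch_order(jobs: List[Job]) -> List[str]:
    """Return job names ordered from most to least important."""
    # sorted() is stable, so jobs with equal priority keep their original order.
    return [job.name for job in sorted(jobs, key=lambda j: j.run_priority)]
```

With one job at priority 20 and two at priority 50, the priority 20 job is listed first and the two batch-priority jobs follow in the order they were supplied.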
Time slice in milliseconds: A job with a large time slice will stay in memory longer and hence use more CPU. Obviously, if there's only a single job running, then changing the time slice has very little effect. However, if there are multiple jobs with the same run priority and one job has a large time slice, the other jobs can become ineligible while they wait, assuming there is insufficient memory available. This explains why you may experience high A/I (Active to Ineligible) and W/I (Wait to Ineligible) figures from time to time.
This may also help to explain why users sometimes run CHGJOB before executing an interactive query. Generally, changing the time slice for an interactive job has little effect, particularly on systems where response time is fast. An increased time slice, however, will have a big effect on system resources when an interactive job performs a lot of processing after Enter is pressed, as in the case of a query.
Eligible for purge: Indicates whether or not this job is eligible to be moved out of main storage (memory) and put in auxiliary storage at the end of a time slice.
If your end of day is suddenly taking longer, it might be worth checking to see if there are other jobs running with higher run priorities. If so, investigate the CLS being used for these jobs, as they may be taking resources away from your end-of-day jobs and making them run longer. You could also check whether the time slice is high, or whether "eligible for purge" is set to No.
Where issues pertain to temporary storage or CPU limits, Class parameters can also help. Maximum CPU time (in milliseconds) controls how much CPU a job can use before the system ends it, and maximum temporary storage (in megabytes) controls how much temporary storage a job can use before the system ends it.
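The effect of these two limits can be modelled with a small Python sketch. It is a hedged illustration only: the names `ClassLimits` and `exceeded_limit` are hypothetical, `None` stands in for "no maximum", and the real system performs this check itself rather than exposing it in this form.

```python
from typing import NamedTuple, Optional


class ClassLimits(NamedTuple):
    max_cpu_ms: Optional[int]      # maximum CPU time in milliseconds, None = no maximum
    max_tmp_stg_mb: Optional[int]  # maximum temporary storage in megabytes, None = no maximum


def exceeded_limit(
    limits: ClassLimits, cpu_used_ms: int, tmp_stg_used_mb: int
) -> Optional[str]:
    """Return which limit the job has exceeded, or None if it is within both."""
    if limits.max_cpu_ms is not None and cpu_used_ms > limits.max_cpu_ms:
        return "cpu"
    if limits.max_tmp_stg_mb is not None and tmp_stg_used_mb > limits.max_tmp_stg_mb:
        return "temporary storage"
    return None
```

A job that has used 90,000 ms of CPU under a class limited to 60,000 ms would be flagged for the CPU limit, which in this model is the point at which the system would end it.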