Non-DB Faults/Pages are pretty high, concern?


  • Non-DB Faults/Pages are pretty high, concern?

    Mike Strozier wrote: I've run a conversion program that is basically a simple map/drop from one file to another. Out of 180,366 records, the first run gave a 1-to-1 correlation. Cleared the file and ran again: short 2k records. Cleared and ran again: same situation, but a little less than 2k short. Ran again and was back at 180,366.

    This is troubling and confusing. Does your conversion process close all files? Are you running in the default activation group? When you check your results, are all processes finished? Perhaps you can post some code or explain further what is taking place. Dave
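
    A minimal sketch of what the close/activation-group questions are getting at (the control-spec keywords are standard ILE RPG, but the activation-group name is invented for illustration): run in a named activation group rather than the default one, and set on LR at the end so every file is closed and its buffers are flushed before the job is treated as finished.

     H DFTACTGRP(*NO) ACTGRP('CONVERT')
      /free
        // ... conversion logic ...

        // Set on LR at end of job: all files are then closed and any
        // buffered output is flushed before the program ends.
        *inlr = *on;
      /end-free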

  • #2
    Non-DB Faults/Pages are pretty high, concern?

    Yes, the process has successfully ended when I check my results. I am not using any read statements to access the files; rather, I have my main file set as IP and all the others as IF (except the one I am writing to, which is UF A), and I chain to them based on their corresponding keys.

    We are confused because it is occurring for some of the other programmers as well, including one who says that when she runs an interactive debug, as she steps through the code it skips parts. She recompiled the program, got the message that it completed successfully, tried again, and got the same results. It has become quite concerning, as we work with patient data. It's been suggested that the DB faults are potentially throwing us off; however, I do not know enough about that area to know whether that is the case. Thanks
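
    For reference, the file setup described above might look roughly like the sketch below (file, record-format, and field names are all invented; treat it as an outline of the structure, not the actual program). The primary file drives the cycle, the lookup file is chained by key, and the target file is declared UF A:

     FPATMAST   IP   E           K DISK
     FREFCODES  IF   E           K DISK
     FPATCONV   UF A E           K DISK
      /free
        // Detail calcs run once per cycle read of PATMAST: chain to the
        // lookup file by key, map the fields, write the output record.
        chain refKey refcodes;      // refKey: assumed key field from PATMAST
        if %found(refcodes);
          // ... map fields into the output record format ...
        endif;
        write patconvr;             // patconvr: assumed output record format
      /end-free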

    • #3
      Non-DB Faults/Pages are pretty high, concern?

      Mike Strozier wrote: We are confused because it is occurring for some of the other programmers as well, including one who says that when she runs an interactive debug, as she steps through the code it skips parts. She recompiled the program, got the message that it completed successfully, tried again, and got the same results. It has become quite concerning, as we work with patient data. It's been suggested that the DB faults are potentially throwing us off; however, I do not know enough about that area to know whether that is the case.

      I am by no means an OS specialist; maybe Hans could shed some light. But unless the code was compiled at a high optimization level, you shouldn't ever see the code skip lines, unless there was a mismatch between object and source. One other slight possibility would be a small injected /copy segment, but that'd be about it. If this is all truly happening without a source/object mismatch problem, I think you need to get IBM SW support on the line. Bill
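
      On the debugging side, two control-spec keywords (shown below as a sketch, not a prescription) help keep what the debugger shows lined up with the source: *SRCSTMT makes the statement numbers used in debug match the source member's sequence numbers, and *NODEBUGIO keeps breakpoints from being generated for the input/output specifications. Compiling with a source debug view and without high optimization, as noted above, is the other half of it.

      * Debug-friendly control spec (adjust to shop standards):
      *   *SRCSTMT    - debug statement numbers match source sequence
      *   *NODEBUGIO  - no breakpoints on generated I/O specifications
      *   DEBUG(*YES) - allow DUMP output without the (A) extender
     H OPTION(*SRCSTMT:*NODEBUGIO) DEBUG(*YES)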

      • #4
        Non-DB Faults/Pages are pretty high, concern?

        Put the programs in debug mode to determine what is happening. Run the batch programs specifying *SECLVL to produce a detailed job log. If this does not point to a simple culprit, call IBM support. Dave

        • #5
          Non-DB Faults/Pages are pretty high, concern?

          I'm not a system guy; in fact, this is the first time I have ever read anything on DB faults since I've been on the system, and I had to look it up. But your problem is interesting. First, the word "fault" is misleading. It implies error, but it's just a term for going to disk to get what is being sought versus it already residing in memory. For example, records are brought up in blocks, so the first record is a fault, but additional records in the block are already in memory, so there is no DB fault to access one of them while it's still in memory. Bottom line: there is no error at any point; it merely indicates the number of accesses to memory versus disk. Lots of disk access would imply more memory needs to be allocated for the work being done. That's enough system stuff to start with.

          Now, first is integrity, second is performance. Concerning integrity, what's implied is data loss of around 2k records or so out of 180,000 in a simple run-through of read and write, plus repeatably flaky behavior in debugging sessions of a program (and repeatable and flaky are mutually exclusive, by the way). The F specs given were UF A for output, but the description didn't sound like an update actually takes place. That makes it a de facto O spec, which also means buffering. What is critical is which 2k records are missing: interspersed ones, or the last 2k?

          For probably the same reason the fault rate is high, the system is probably slow in batch at finalizing the buffer into the output file, while running interactively at a higher priority gets the time needed to finish in a timely manner. Who knows, an abnormally low batch priority may also be contributing to the high turnover in memory of the batch jobs. On the other hand, based on what is being measured and for how long, it may just be accurately reflecting new program and data reads into memory, as required, when it isn't already there. You can easily determine all that.

          Do a couple of things. One is to make your F spec a real UF A (if it isn't, which I'm betting it isn't) by putting an UPDATE statement in an uncalled subroutine. This will force blocking off, and your file will be appended more in real time. Of course the whole program will run slightly slower as well, but you will be able to see whether the 2k records were the last block, slow in being finalized to the output file. That is quick and painless; but should it not be that simple, create a two-field file, add a count variable at the top below the IP, and add a line to the program that writes the current count and RRN of each record read, placed right after the write to your output file. This will show whether any records are skipped, which ones, etc. (A rough sketch of both suggestions follows at the end of this post.)

          For the debugging thing, it is absolutely impossible to have a repeatable glitch that, for example, skips lines in a debugging session again and again because of flaky system behavior. Anyone who has ever tried to reproduce a glitch knows that only too well. As suggested, the most likely cause is a mismatch between the source code and the object being debugged. Assuming there is an area that was involved in both skips, put something exotic like a DUMP(A) in that area and recompile. First, you'll quickly find out whether you're debugging the version you just compiled, and second, should it skip after the DUMP, you'll have information about what's going on, which should explain the path the program is taking.

          These two things will probably assure integrity, and in the process you'll probably figure out what the reason was for the performance issues. But if not, priority, the current memory situation, and repeated testing of the same process to see its effect on the fault rates would be the first few things I'd check in chasing the cause of the high fault rates. Hope that helps. rd
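
          A hedged sketch of both suggestions (all names are invented; treat this as an outline, not working shop code). The UPDATE sits in a subroutine that is never executed, which is enough for the compiler to treat the UF A file as a genuine update file and turn output blocking off, and the two-field audit file gets one row per input record so skipped or trailing records can be identified afterwards:

     FPATMAST   IP   E           K DISK
     FPATCONV   UF A E           K DISK
     FAUDITLOG  O    E             DISK
      /free
        // Detail calcs run once per cycle read of PATMAST.
        // ... map fields and write the converted record as before ...
        write patconvr;             // patconvr: assumed output record format

        // Two-field audit row written right after each output write:
        // a running count plus the input key.  (The post suggests the
        // RRN; the key is used here to keep the sketch self-contained.)
        alCount = alCount + 1;      // alCount, alKey: assumed audit fields
        alKey = patKey;             // patKey: assumed key field of PATMAST
        write auditr;               // auditr: assumed audit record format

        // Never-executed subroutine: the UPDATE is here only so the
        // compiler treats PATCONV as a real update file, which turns
        // record blocking off and appends each record in near real time.
        begsr neverRun;
          update patconvr;
        endsr;
      /end-free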

          • #6
            Non-DB Faults/Pages are pretty high, concern?

            On our development box (only 5 programmers access this box), our non-DB faults/pages for *machine are 40/41.4; it varies, but that's about the average at present. *base (I assume; it's the 2nd pool) has Max Active 170, DB faults/pages of 23/72, and non-DB faults/pages of 98.6/217. This morning the non-DB faults were 126/316.

            Some problems we have seen lately: when running several conversion programs in a batch process, it appears the programs are skipping lines of code and, in some cases, skipping records. I've run a conversion program that is basically a simple map/drop from one file to another. Out of 180,366 records, the first run gave a 1-to-1 correlation. Cleared the file and ran again: short 2k records. Cleared and ran again: same situation, but a little less than 2k short. Ran again and was back at 180,366. If I run the program interactively it works just fine, though slowly.

            Could the non-DB faults/pages be the problem? And if that is what's wrong, what are some suggestions for correcting the DB faults, and how? Thanks
