
Getting Started with Performance Analysis

The task of doing performance analysis can be overwhelming. If you don't ask yourself some key questions, you can waste a lot of valuable time. This article will assist you in determining the possible contributors to performance problems so you will know where to place your focus.

Understanding the Components of Performance

The time needed to service a transaction, or any other unit of work, is made up of several components: input/output, CPU, disk, memory, activity level, seizes/locks, network, and errors. Think of each of these components as a server. Requests for service flow through the system from server to server, and some servers can handle multiple requests at one time. One or several of these components can be a bottleneck and the source of the problem for some transactions or certain units of work. The key is to eliminate as suspects those components you know are not causing the problem and then isolate the specific component(s) contributing to the performance problem.
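The elimination approach described above can be illustrated with a small calculation. This is only a sketch: the function name `rank_components` and the timing figures are hypothetical and not taken from any real system. It assumes you already have an estimate of how much of a transaction's elapsed time is spent at each component.

```python
def rank_components(time_ms: dict[str, float]) -> list[tuple[str, float]]:
    """Return (component, share of total time) pairs, largest share first."""
    total = sum(time_ms.values())
    if total == 0:
        return []
    shares = [(name, spent / total) for name, spent in time_ms.items()]
    return sorted(shares, key=lambda pair: pair[1], reverse=True)


# Hypothetical breakdown of one slow transaction, in milliseconds.
sample_breakdown = {
    "CPU": 120.0,
    "disk": 600.0,
    "memory": 80.0,
    "seizes/locks": 150.0,
    "network": 50.0,
}

for component, share in rank_components(sample_breakdown):
    print(f"{component:<14}{share:6.1%}")
```

Components with a small share can be set aside as suspects, which leaves the largest contributors (disk, in this made-up breakdown) as the place to focus.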

Focusing on the Problem

Sometimes you hear "the system is slow" or "my job is taking longer to run than it used to" or "there's a slowdown on system xyz." Before digging into the problem, it's a good idea to consider what might be happening.

  • Does the problem appear to be affecting only batch, only interactive, a specific subset of jobs, or all jobs on the system?
  • Does the problem happen only at certain times of the day, week, month?
  • Does it happen all the time or intermittently?
  • Did the problem just start happening? If not, when did it start? How long has it been occurring?
  • What, if anything, has changed on the system since the problem(s) started? Do the changes coincide with the beginning of poor performance?
  • Is the system partitioned? If yes, there are different questions to ask: How much Commercial Processing Workload (CPW) is allocated to this partition? How much Interactive CPW is allocated to this partition? Are the processors shared or dedicated? There is some overhead associated with shared processors.

System Evaluation and Areas of Consideration

Once you have some answers, there are a few things to check and consider:

  • What is the system model? What is the processor feature? What is the interactive feature? This will determine total CPW and interactive CPW available for this model. If interactive response time is a problem, the system might be hitting its interactive threshold. This is indicated by message CPI1479 in the history log.
  • What is the OS/400 release? This may determine whether the problem(s) might be release-related.
  • What is the database group PTF level? This may determine whether the system is back-level or current with database PTFs.
  • What is the cumulative PTF package level? This may determine whether the system is back-level or current in OS/400 and Licensed Internal Code (LIC) PTFs.
  • Adjusting system values such as QMCHPOOL, QMAXACTLVL, QPFRADJ, QACTJOB, and QTOTJOB can improve performance and resolve certain performance problems.

QMCHPOOL represents the size of the machine storage pool. If the value is set too small, system performance can be severely inhibited. For new systems, the recommended initial size of QMCHPOOL is at least two to three times the reserved size of the pool, depending on faulting in the pool.
QMAXACTLVL represents the number of jobs/threads that can simultaneously compete for memory and CPU. QMAXACTLVL should be set to *NOMAX. Then, use the activity levels within the Work with Shared Storage Pools (WRKSHRPOOL) command to control the activity level of each system pool.
QPFRADJ dynamically adjusts (at a rate of approximately every 20 seconds) memory and activity levels for all shared pools on the system. QPFRADJ features configuration parameters via WRKSHRPOOL with F11, which allow administrators to establish priorities for adjusting shared pools, set minimum and maximum pool sizes, and determine pool-faulting and job/thread levels. While setting these parameters effectively overrides QPFRADJ's ability to adjust memory of shared pools, QPFRADJ is still worthwhile because it allows the system to benefit from expert cache.
QACTJOB represents the number of active jobs for which auxiliary storage is allocated during an IPL. The value should be set about 10% greater than the number of active jobs shown by the Work with Active Jobs (WRKACTJOB) command during the busiest time of the day.
QTOTJOB controls the total number of jobs for which auxiliary storage is allocated during an IPL. All jobs are included in this value (i.e., active jobs, jobs on job queues, and jobs having spooled files associated with them). A recommended setting is 10-20% greater than the total number of jobs shown on the WRKSYSSTS display.
QADLACTJ and QADLTOTJ correspond to QACTJOB and QTOTJOB, respectively. They control the additional number of jobs for which auxiliary storage is allocated once the initial value of the corresponding system value has been reached. Allocation is performed as soon as the storage is needed, so how these values are set can significantly impact performance. Keep these values set at a reasonable number. For example, say QACTJOB is set at 100 and QADLACTJ is set at 10, and 99 jobs are active on the system. If two more jobs are started, bringing the total active jobs to 101, the system allocates additional auxiliary storage to handle 10 more active jobs.
QDYNPTYADJ controls whether the priority of interactive jobs is dynamically adjusted to maintain high performance of batch job processing on AS/400e server model hardware. This adjustment capability is effective only on systems that are rated for both interactive and noninteractive throughput and have Dynamic Priority Scheduling enabled.
QDYNPTYSCD allows you to turn on/off the dynamic priority scheduler. The task scheduler uses this value to determine the algorithm for scheduling jobs running on the system.
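The sizing rule for QACTJOB and the QADLACTJ example above come down to simple arithmetic. The following is a minimal sketch, not an IBM-supplied tool: the function names are hypothetical, and it assumes the allocation grows in whole QADLACTJ-sized increments each time the current allocation is exceeded, as the example in the text describes.

```python
import math


def suggested_qactjob(peak_active_jobs: int, headroom_pct: int = 10) -> int:
    """Peak active jobs (from WRKACTJOB at the busiest time) plus headroom."""
    extra = (peak_active_jobs * headroom_pct + 99) // 100  # integer ceiling
    return peak_active_jobs + extra


def allocated_job_slots(initial: int, additional: int, jobs: int) -> int:
    """Number of jobs that have storage allocated once `jobs` jobs exist.

    Assumption: storage is extended by `additional` slots every time
    the current allocation is exceeded.
    """
    if jobs <= initial:
        return initial
    increments = math.ceil((jobs - initial) / additional)
    return initial + increments * additional


print(suggested_qactjob(500))             # 550
print(allocated_job_slots(100, 10, 99))   # 100
print(allocated_job_slots(100, 10, 101))  # 110
```

A very small increment means the system extends the allocation often while work is running, and a very large one allocates storage that may never be used, which is why the text recommends keeping these values reasonable.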

  • Check the QEZDEBUG output queue. Are there any dumps there? If yes, do they have the same error?
  • Check the QEZJOBLOG output queue. Are there a number of large QPJOBLOG files? Are there a lot of error messages in them? Is CL logging turned on?
  • Is CPU running too high with poor response times? If yes, this will help focus on CPU as a possible bottleneck.
  • Is CPU barely running at all with poor response times? If yes, then eliminate CPU as a bottleneck and think about other possible resource constraints, such as disk, IOP, seizes/locks, and/or network.
  • Is disk utilization above the 40% guideline? Are there enough disk arms? As higher-capacity disk devices for iSeries systems become available, fewer arms are needed to satisfy capacity requirements. This can lead to configuring too few disk arms to meet the workload demands placed on them. A lack of disk arms can bottleneck the processor's performance. To avoid such a bottleneck, a minimum number of disk arms is needed for optimum performance on each processor. This number is independent of the quantity of drives needed to meet the desired storage capacity.
  • Is disk response time too high? The suggested guideline is that disk response times average below 10 ms (0.010 seconds).
  • Is machine pool faulting too high? The guideline for the machine pool is < 10 faults per second. If QPFRADJ is turned on, pool adjustment is automatic. However, you can use the WRKSHRPOOL command with F11 to set the minimum and maximum sizes of each pool. (Save a copy of the original screen for future reference before making changes.) Use this guideline for the minimum amount of memory in the machine pool: (2 * the reserved size of the machine pool) / Total amount of memory on the system. This will give a percentage to be used as the minimum size. Give a maximum size of 100% to the machine pool. The other pools should have the maximum set according to a reasonable understanding of need. Be careful setting the minimum size for Interactive. If you make it too low (and if the priorities are equal), memory will oscillate between batch and interactive.
  • Where is the majority of the workload running? (For instance, are all jobs running out of *BASE, or are different subsystems using different pools? Are the pools private or shared?) If the majority of the workload is running out of *BASE, there is the possibility of many jobs competing for the same resources on a busy system. Consider separating batch work from other work on the system.
  • File sizes: How big are the files used most often? (Use DSPFD to find out.)
  • Number of deleted records: Large files (gigabytes) with large numbers of deleted records (almost half or over half are deleted records) could be a problem. If an application is doing a full table scan over these files, over half the records being read aren't even useful. These files would be good candidates for reorganizing (RGZPFM).
  • Is expert cache turned on? Expert cache works by minimizing the effect of synchronous DASD I/Os on a job. Best candidates for performance improvement are jobs that are most affected by synchronous DASD I/Os. Once started, expert cache monitors the DASD I/O activity and logical reference pattern for each database file that is accessed within a shared storage pool. Then, it dynamically adjusts the size and type of I/Os for these files to maximize the use of main storage and minimize the number of DASD I/Os. Reducing the number of DASD I/Os, particularly synchronous I/Os, can result in quicker processing. For interactive jobs, this generally means better response time. For batch jobs, it can mean completing current batch work in less time or doing additional work within an existing batch window.
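Several items in the checklist above are numeric guidelines, so they can be checked mechanically once you have the figures in hand. The sketch below is illustrative only: the function names and sample numbers are hypothetical, the thresholds are the ones quoted in the text, and the 0.5 cutoff for deleted records is an assumption standing in for "almost half or over half."

```python
# Guideline values quoted in the checklist above.
DISK_UTILIZATION_LIMIT_PCT = 40.0
DISK_RESPONSE_LIMIT_MS = 10.0
MACHINE_POOL_FAULT_LIMIT = 10.0  # faults per second
DELETED_RATIO_LIMIT = 0.5        # assumption: "over half" of all records


def machine_pool_min_pct(reserved_size: float, total_memory: float) -> float:
    """(2 * reserved size) / total memory, expressed as a percentage.

    Both arguments must use the same unit (for example, MB).
    """
    return 200 * reserved_size / total_memory


def deleted_ratio(active_records: int, deleted_records: int) -> float:
    """Fraction of a file's records that are deleted records."""
    total = active_records + deleted_records
    return deleted_records / total if total else 0.0


def guideline_warnings(disk_util_pct: float,
                       disk_response_ms: float,
                       machine_faults_per_sec: float) -> list[str]:
    """Return one message per guideline that the sample exceeds."""
    warnings = []
    if disk_util_pct > DISK_UTILIZATION_LIMIT_PCT:
        warnings.append("disk utilization above 40%")
    if disk_response_ms >= DISK_RESPONSE_LIMIT_MS:
        warnings.append("average disk response time not below 10 ms")
    if machine_faults_per_sec >= MACHINE_POOL_FAULT_LIMIT:
        warnings.append("machine pool faulting not below 10 per second")
    return warnings


# Hypothetical sample: 400 MB reserved in the machine pool, 8000 MB total.
print(machine_pool_min_pct(400, 8000))                       # 10.0
print(deleted_ratio(400_000, 600_000) > DELETED_RATIO_LIMIT)  # True
print(guideline_warnings(55.0, 12.5, 4.0))
```

These are screening checks only. A value inside a guideline does not prove that the component is healthy, and a value outside it is a reason to look closer rather than a diagnosis.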

Performance Data Collection

Once you have some answers and know some key things about the system, the next step is to consider the collection of performance data. Keep the following in mind when considering data collection:

  • What type of data needs to be collected? System level data includes pools, disk, CPU, and communication lines. Application level data includes specific jobs, programs, procedures, and subsystems.
  • What level of data needs to be collected? General, specific, Performance Explorer (PEX), sample/trace data, other.

A number of tools allow interactive review of system performance:

  • WRKSYSACT is the quickest way to analyze a problem situation. It shows only the jobs that have been active during the last observation interval, and it uses fewer system resources than the other commands discussed here. (Note: Performance Tools Licensed Program Product - 5722PT1 is required.)
  • WRKSYSACT's View 4 lists allocated and deallocated storage assigned to a job/task. If a runaway job is long-running, you may be able to identify it here. The Storage field shows storage usage information that can be sorted by allocated storage, deallocated storage, and net storage to help detect jobs that are using large amounts of storage. Jobs where allocated storage is increasing dramatically are candidates for further investigation.
  • WRKSYSSTS shows the number of jobs in the system, disk usage in the system ASP, and the number of addresses used. All memory pools, database and non-database faults, and activity level changes can be monitored at a glance. If only one pool has a high non-database faulting rate, find out which subsystem uses that pool and monitor that subsystem with the WRKACTJOB command to find out what jobs are active.
  • WRKACTJOB is used to examine the CPU used and the disk I/O operations done by each currently active job. Sort this display by CPU % to find the jobs using the most CPU. Information about response time, run priority, and the pool in which the job runs is also displayed. The I/O figure shown is the average amount of I/O during the observation period.
  • WRKDSKSTS shows performance and status information about disk units on the system. Pay attention to the "% busy" column. Use it as an indicator to look at the System or Component report. Do not use these values for capacity planning.
  • DSPPFRDTA can be used to analyze either real-time data or data previously collected. (Note: Performance Tools Licensed Program Product - 5722PT1 is required.)
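The sorting that these displays do interactively can also be applied to data you have captured, for example WRKSYSACT output written to an outfile. The sketch below works on hypothetical, hand-entered records. The job names, figures, and field names are invented for illustration and are not the layout of any real outfile, and net storage is assumed to be allocated storage minus deallocated storage.

```python
from dataclasses import dataclass


@dataclass
class JobSample:
    name: str
    cpu_pct: float
    allocated_mb: float
    deallocated_mb: float

    @property
    def net_storage_mb(self) -> float:
        # Assumption: net storage = allocated - deallocated.
        return self.allocated_mb - self.deallocated_mb


def top_by_cpu(jobs: list[JobSample], n: int = 3) -> list[JobSample]:
    """Jobs using the most CPU, highest first."""
    return sorted(jobs, key=lambda job: job.cpu_pct, reverse=True)[:n]


def top_by_net_storage(jobs: list[JobSample], n: int = 3) -> list[JobSample]:
    """Jobs holding the most net storage, highest first."""
    return sorted(jobs, key=lambda job: job.net_storage_mb, reverse=True)[:n]


# Hypothetical observation interval.
SAMPLE_JOBS = [
    JobSample("NIGHTLY", cpu_pct=62.5, allocated_mb=300.0, deallocated_mb=280.0),
    JobSample("ORDERENT", cpu_pct=8.0, allocated_mb=50.0, deallocated_mb=45.0),
    JobSample("LEAKYJOB", cpu_pct=12.0, allocated_mb=4200.0, deallocated_mb=150.0),
]

print([job.name for job in top_by_cpu(SAMPLE_JOBS)])
print([job.name for job in top_by_net_storage(SAMPLE_JOBS)])
```

In this made-up interval the job using the most CPU and the job holding the most storage are different jobs, which is why it helps to look at both orderings before deciding where to dig.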

Performance Tools Available

With the variety of applications that can run on the iSeries, system performance problems don't always yield easy solutions. Of the many tools available, it's sometimes difficult to determine which to use. Here is a list of tools and their recommended order of usage. Use this list as a guideline to help you get started.

1. Performance monitor (STRPFRMON prior to V4R4) or Collection Services (STRPFRTRC V4R5 or higher). Use Performance Tools LPP (5722PT1) to run reports over collected performance data.

  • System Report
  • Component Report
  • Job Summary Report
  • Transaction and Transition Report (from trace data only)

2. WRKSYSACT--Display the data or put it into an outfile.
3. WRKACTJOB
4. WRKJOB
5. iDoctor--Within the Job Watcher under the iDoctor tool set, you can watch a specific job or set of jobs and/or do a system-wide watch to gather statistics over all the jobs on the system. Within the PEX Analyzer under the iDoctor tool set, you can collect various types of information:

  • PEX Stats Flat is one of the best tools to get a system-wide view of the most active programs, so it's good to use if you don't know where to start. It shows what programs and/or MI instructions are using the most CPU, the call count for each program, and disk I/O activity. Based on this information, you can identify which programs should be investigated further. Try to run it during a heavy/peak workload.
  • PEX Stats Hier is one of the best tools to see program activity in a particular job. It shows call/return flow of programs within a job, call count for each program, CPU usage, and disk I/O activity for each program, as well as CPU used by each job in the call.
  • PEX Profile identifies high-level language statement hot spots (high CPU consumption) in programs or service programs. It gathers CPU usage information over a selected set of programs or service programs.
  • PEX Task Switch Trace identifies a number of run-time situations, regardless of what job or task they occur in. It answers questions such as why is a job waiting, who/what is it waiting for, who/what woke it up, and what were they doing up until now? Note: Because of the large amount of data collected, this should be run only for very short periods of time on larger systems and not over all jobs.
  • PEX Stats Hierarchical's usefulness depends on what you find.

The Tools to Get Started

You should now have a better indication of what the problem might be and where to place your focus. You can feel confident knowing that many performance tools are available to assist in your analysis, from a high-level overview to a very low-level, detailed view.

Sandi Chromey is a Senior IT/Architect Specialist with IBM Global Services. She provides performance support to both internal and external customers within IBM Global Services. Sandi has been with IBM for 22 years, of which 11 years have been in IT. She also has experience in iSeries development and component testing.
