The High Cost of Low Availability

We rely so heavily on information systems that a computer outage might leave thousands of employees with little or nothing to do. With competitors only a click away on the Internet, the potential loss of customer loyalty and revenue is possibly worse. Exactly how much money we're talking about depends on the situation, but there are ways to calculate the basic cost of downtime as you pitch high availability (HA) solutions to upper management.

Information availability continues to be a concern of IT managers, and as business shifts further to e-business, the consequences of system outages become increasingly ominous. AS/400 users may have the comfort of the world's most reliable hardware, but even an AS/400 is useless without data and applications.

Information systems have evolved significantly in the last few years. In the past, isolated applications filled purely operational functions in a single business area, their use restricted to a select group of employees. Today, information systems are broader in scope and more integrated. Often, an enterprise resource planning (ERP) system delivers most corporate functionality, but customers and suppliers can now use e-commerce to interact directly with their business partners' systems. The user is therefore not the only one who can be stymied by system outages.

In many sectors, such as the automotive industry, industry leaders have told their suppliers that, to continue to be their suppliers, they must conduct business electronically. IBM similarly encourages its Business Partners to use its PartnerWorld e-commerce sites to sell AS/400s. In either case, companies' systems are electronically linked to those of their supply chain partners or else they are out of business. In addition, applications are no longer purely operational. Corporate systems now feed information to data warehouses that serve the needs of executives and other strategists and tacticians.

Around the Clock, Around the Globe

E-business runs 24 hours a day, and people in different time zones expect equal treatment. Businesses running multiple shifts expect to be able to interact with your system whenever their workers are on the job, and consumers expect to be able to visit your Web site at their convenience, not yours. Automated, Web-based service allows a skeleton staff to provide around-the-clock support, but a 24-hour business cycle eliminates windows that used to exist for system maintenance and archiving. In this environment, if your systems stop, business stops.


Obviously, losing access to critical information for even short periods of time is no longer acceptable. You can't wait a day or a week to fix a crashed system and reload lost data; business information must be available whenever and wherever you need it.

Productivity

Productivity has always been an important management issue. In modern organizations, most, if not all, jobs depend on information systems to some extent. That dependence can be direct, as with customer service representatives who can reach customer and product data only through their computers, or indirect, as with an employee working from a paper report generated by a computer. However strong the dependence, if a system outage lasts long enough, it will eventually idle all employees who rely on that system. Therefore, system availability is one of the most important productivity issues.

Value

Hardware and software prices have declined rapidly in the past decade, but neither the cost of an HA solution nor the revenue it protects matters most on its own. What matters most is the difference between the two. A solution that costs $100,000 but prevents a million-dollar loss of revenue has good value; a million-dollar solution to a $100,000 problem does not. MIS managers need HA solutions that meet both top-line and bottom-line needs.

Business and IT Requirements

Business and technical environments and requirements determine the economically appropriate level of availability. Some organizations are still mostly "brick and mortar" businesses that operate Monday through Friday, 9 to 5. For these businesses, ensuring information availability during core business hours might be sufficient. For other companies, such as Web-based retailers, no downtime—planned or unplanned—is tolerable.

Regardless of business needs, IT departments cannot keep all systems running all the time. They must regularly take them out of service to, among other activities, reorganize underlying databases and upgrade hardware, software, operating systems, networks, or database management systems. Availability management accommodates both the IT need to maintain systems and the business need for continuous access to systems and information.

The Downtime Penalty

It is impossible to predict the cost of downtime accurately. Salaries of idle employees may be substantial, but they do not represent the largest liability. The true cost is lost opportunity. Consider a company that averages a gross profit of $100,000 per hour from Web and telemarketing sales. If its order processing systems crash for an hour, making it impossible to take orders, what is the cost of the outage? The easy answer would be $100,000, but that's wrong. Some customers will persist and either call or click back another time, so those sales are not lost; cash flow is simply delayed. However, other prospects and customers will give up and go to a competitor. Even so, the value of the purchases these customers would have made during the outage underestimates the true loss to the company. Satisfied customers become loyal customers, but dissatisfied customers or prospects who never become customers do not.

Consider a prospect who would have made an immediate $100 purchase and gone on to repeat that purchase once a year. Using a 15 percent discount rate, the present value of 20 years of those purchases (the first made today and 19 annual repeats) would be $719.82. In this example, the company's loss is more than seven times the value of the first lost sale. (A lower discount rate produces an even higher value.)
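The present-value figure can be checked with a few lines of Python. This is a minimal sketch, assuming the first purchase happens today and the remaining 19 follow at one-year intervals, which is the timing that yields $719.82. The function and parameter names are illustrative and do not come from the article.

```python
def present_value_of_purchases(amount, discount_rate, purchases):
    """Present value of equal purchases made today and then once a year.

    Assumption: the first purchase is at t = 0, so the last one is
    discounted by (purchases - 1) years.
    """
    return sum(amount / (1 + discount_rate) ** t for t in range(purchases))


pv = present_value_of_purchases(amount=100, discount_rate=0.15, purchases=20)
print(f"Present value at 15 percent: ${pv:,.2f}")

# A lower discount rate raises the value of the lost customer.
pv_low = present_value_of_purchases(amount=100, discount_rate=0.10, purchases=20)
print(f"Present value at 10 percent: ${pv_low:,.2f}")
```

Dividing the result by the $100 first sale gives the "more than seven times" multiple quoted in the text.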



Reliability

Before you can calculate annual downtime costs for a particular system, you must first know its reliability (i.e., what percentage of the time it will be up). System reliability is less than you think. At a minimum, all of the following must work to use a network-based system:

• Power supplies (i.e., the municipal grid or UPS feeding all relevant servers and the client computer, as well as the power supply component inside all participating machines)

• CPU (in all relevant servers and client computers)

• Operating systems running all participating systems

• Server disk drives

• Database management system on the servers

• Application software

• Network switching and routing devices

• Network connections

A quote of 99 percent CPU reliability, for example, does not mean that your system will experience 99 percent uptime. If a system depends on 10 components, each of which is 99 percent reliable, reliability for the whole system is 0.99^10 (0.99 raised to the 10th power), or about 90.44 percent. You would therefore expect this system to be unavailable about 9.56 percent of the time. In a 24-hours-a-day, 365-days-a-year environment, this translates to almost 838 hours of downtime each year.
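The arithmetic behind this example is easy to reproduce. The sketch below assumes that every component must be up for the system to work and that component failures are independent, which is what multiplying the individual reliabilities implies. The function name is illustrative.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8,760 hours in a round-the-clock year


def series_reliability(component_reliabilities):
    """Reliability of a system that needs every component to be up.

    Assumption: failures are independent, so reliabilities multiply.
    """
    return math.prod(component_reliabilities)


system = series_reliability([0.99] * 10)
downtime_hours = (1 - system) * HOURS_PER_YEAR

print(f"System reliability: {system:.2%}")
print(f"Expected downtime: {downtime_hours:.0f} hours per year")
```

Replacing the list with the actual reliability of each component in the chain (power, CPU, operating system, disks, database, application, network) gives a system-specific figure.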

Calculating Hourly Costs

While it is impossible to predict the actual loss from an outage, it is important to derive reasonable estimates. Only then is it possible to evaluate the economically appropriate level of investment in an availability solution. Losses in labor, revenue, and service all contribute to the total cost of an outage. A good starting point for evaluating these factors is to collect statistics on both the duration and the associated costs of past outages as recorded by your accounting team.

Labor

Employees generally continue to receive full pay even if a hobbled system cripples their productivity. Begin by examining how many employees past outages affected, which ones, and to what extent. Since businesses want a profit, the value employees contribute is usually greater than the cost of employing them. Valuing lost productivity at labor cost is therefore a conservative approach to evaluating the cost-effectiveness of availability management.

To calculate the average labor cost of an outage, use this equation: Labor cost = P * E * R * H, where P = number of people affected, E = average percentage by which their work is affected, R = average employee cost per hour, and H = number of hours of outage. Since labor costs and the extent of the effect of an outage vary, you should repeat this equation for each department and employee classification. However, remember that this is an estimate. Any shortcut that groups similar employees into a single class is sufficient.
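The labor equation, repeated per department and summed, might be sketched as follows. The department names, head counts, and hourly costs are hypothetical values chosen only to show the calculation.

```python
from dataclasses import dataclass


@dataclass
class AffectedGroup:
    name: str
    people: int               # P: number of people affected
    fraction_affected: float  # E: 0.0 (unaffected) to 1.0 (fully idle)
    hourly_cost: float        # R: average employee cost per hour


def labor_cost(groups, outage_hours):
    """Sum P * E * R * H over every department or employee class."""
    return sum(
        g.people * g.fraction_affected * g.hourly_cost * outage_hours
        for g in groups
    )


# Hypothetical departments for a two-hour outage
groups = [
    AffectedGroup("customer service", people=40, fraction_affected=1.0, hourly_cost=25.0),
    AffectedGroup("warehouse", people=60, fraction_affected=0.5, hourly_cost=20.0),
]
cost = labor_cost(groups, outage_hours=2)
print(f"Labor cost of the outage: ${cost:,.2f}")
```

Grouping similar employees into one class, as the text suggests, simply means fewer entries in the list.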

Revenue


The simplest way to calculate potential revenue losses during an outage is to use this equation: Lost revenue = (GR / TH) * I * H, where GR = gross yearly revenue, TH = number of total yearly business hours, I = percentage impact, and H = number of hours of outage. The first two elements of this equation (GR and TH) provide an estimate of the revenue generated in an hour. Percentage impact (I) is an adjustment that allows you to scale hourly revenue based on your best estimate of both your ability to recover business lost during an outage and the lifetime value of customers permanently lost to competition. Consider, for example, a wholesale distributor whose customers place orders through a toll-free number. If customers receive the message to call back later, because of a computer outage, and no one calls back, the impact is 100 percent. If half call back, the impact is 50 percent, etc.
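As a rough illustration, the revenue equation translates directly into code. The distributor's revenue and business hours below are hypothetical; only the formula comes from the text.

```python
def lost_revenue(gross_yearly_revenue, yearly_business_hours, impact, outage_hours):
    """Lost revenue = (GR / TH) * I * H, with impact given as a fraction."""
    hourly_revenue = gross_yearly_revenue / yearly_business_hours
    return hourly_revenue * impact * outage_hours


# Hypothetical distributor: $20.8 million a year over 2,080 business hours.
# Half of the customers call back after a three-hour outage.
loss = lost_revenue(
    gross_yearly_revenue=20_800_000,
    yearly_business_hours=2_080,
    impact=0.5,
    outage_hours=3,
)
print(f"Estimated lost revenue: ${loss:,.2f}")
```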

The table in Figure 1, based on a disaster recovery study done in the late 1980s at the University of Texas, Arlington, provides typical impact percentages by industry. The longer and more frequent the outages, the more customers become frustrated. Therefore, impact depends on the total duration of all outages over a year.

So much has changed since this study that you should use these numbers only as rules of thumb. For a more precise assessment of the impact on your business, begin with an impact factor of 100 percent. Then, look at what happened to order activity immediately after an outage. The amount by which orders exceeded the normal hourly rate estimates how much business was retained. For a more accurate impact factor, calculate this as a percentage of the orders you would normally have taken during the outage and subtract it from 100 percent.
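One way to turn this procedure into a calculation is sketched below. It assumes you can count orders over a fixed recovery window after the outage and compare them with the normal rate; the window length and all order counts are hypothetical.

```python
def impact_factor(normal_orders_per_hour, outage_hours,
                  orders_after_outage, recovery_hours):
    """Estimate the impact factor from the order surge after an outage.

    Assumption: orders above the normal rate during the recovery window
    are business retained from the outage period.
    """
    expected_during_outage = normal_orders_per_hour * outage_hours
    expected_after = normal_orders_per_hour * recovery_hours
    recovered = max(0.0, orders_after_outage - expected_after)
    retained = min(1.0, recovered / expected_during_outage)
    return 1.0 - retained


# Hypothetical: 200 orders per hour normally, a one-hour outage, and
# 880 orders in the four hours that followed (800 would be normal).
impact = impact_factor(normal_orders_per_hour=200, outage_hours=1,
                       orders_after_outage=880, recovery_hours=4)
print(f"Estimated impact factor: {impact:.0%}")
```

The result still excludes the lifetime value of customers who never return, so it is a floor rather than a full estimate.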

As mentioned earlier, sales per hour do not reflect the value of customer loyalty. To reflect total lost sales more accurately, increase impact percentage to reflect the lifetime value of customers who permanently defect to competitors. Since determining this number requires a long history of data and assumes (often inaccurately) that the future will reflect the past, an educated guess must suffice.

Service

The service category covers more intangible outage costs. Did you incur late-delivery surcharges? Did you pay overtime to make up for lost productivity? Did missing a critical financial filing deadline result in penalties and an adverse effect on your stock price? Did lost customer goodwill erode your ongoing revenue stream? Did you need to plan and execute campaigns to explain and apologize for the outage?

Service costs are rarely zero. Downtime usually leads to a cascade of related costs. Work with your financial team to identify all such service costs incurred during or after a previous outage. Then, divide these total costs by the total number of hours the systems were down to determine the cost per hour.

Total Costs

The total of these costs is a reasonable forecast of the loss you can expect from an hour of outage for a particular system. (Costs will vary depending on the nature of the application, so you must perform this calculation for each system.) To calculate expected annual cost, multiply this number by the number of expected annual hours of outage. When considering all factors, the potential loss from critical system failure may be a shock the first time you calculate it.
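Putting the pieces together for one system might look like the sketch below. It assumes a round-the-clock operation and takes the hourly labor, revenue, and service figures as inputs; every number shown is hypothetical.

```python
HOURS_PER_YEAR = 24 * 365  # assumes a round-the-clock operation


def service_cost_per_hour(total_service_costs, total_outage_hours):
    """Past service costs spread over the hours the systems were down."""
    return total_service_costs / total_outage_hours


def expected_annual_cost(labor_per_hour, revenue_per_hour, service_per_hour,
                         availability):
    """Hourly outage cost multiplied by the expected yearly hours of outage."""
    hourly_cost = labor_per_hour + revenue_per_hour + service_per_hour
    expected_outage_hours = (1 - availability) * HOURS_PER_YEAR
    return hourly_cost * expected_outage_hours


# Hypothetical inputs for one system that is up 99.9 percent of the time
service = service_cost_per_hour(total_service_costs=8_000, total_outage_hours=20)
annual = expected_annual_cost(labor_per_hour=1_600, revenue_per_hour=5_000,
                              service_per_hour=service, availability=0.999)
print(f"Expected annual outage cost: ${annual:,.0f}")
```

Running the same calculation for each system shows where an outage hurts most.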

The purpose of this exercise is to determine an appropriate level of availability management investment. Loss from past outages, while a good predictor, is not what matters. The relevant number is the cost you can avoid in the future by implementing availability management today.
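The comparison this implies can be expressed as a simple net-benefit calculation. The sketch assumes you have an hourly outage cost and estimates of yearly outage hours with and without the solution; the figures are hypothetical.

```python
def net_annual_benefit(hourly_outage_cost, outage_hours_now,
                       outage_hours_with_solution, annual_solution_cost):
    """Avoided outage cost minus what the availability solution costs per year."""
    avoided = hourly_outage_cost * (outage_hours_now - outage_hours_with_solution)
    return avoided - annual_solution_cost


# Hypothetical: $7,000 per hour of outage, 90 hours a year cut to 10,
# with a solution that costs $100,000 a year.
benefit = net_annual_benefit(hourly_outage_cost=7_000, outage_hours_now=90,
                             outage_hours_with_solution=10,
                             annual_solution_cost=100_000)
print(f"Net annual benefit: ${benefit:,.0f}")
```

A negative result means the solution costs more than the downtime it prevents.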

The Solution

Some solutions on the market are designed to assist with managing the planned and unplanned downtime businesses experience. Availability management solutions on the AS/400 market can ensure that businesses stay online as business demands. Availability management solutions enable organizations to deliver consistent, predictable access to applications and data across an enterprise.

Data availability ensures that critical data is available consistently throughout the enterprise so all users have the data resources and data integrity they need to perform business functions. This is a difficult challenge often impacted by disparate databases, multiple server environments, remote users, and the various "islands of information" in most business environments.

Application availability focuses not only on data availability but also on business-critical application availability, 24 hours a day, 365 days a year. It is impacted frequently by unplanned events such as server failure, fire, and flood and impacted more frequently by planned events such as maintenance and application upgrades.

More businesses are implementing availability management solutions to ensure that their users are protected from planned and unplanned downtime. HA is one component of an availability management solution. HA enables a clustering environment, which ensures data integrity and can provide a disaster recovery solution. Businesses invest in HA solutions to protect their enterprises from unplanned downtime.

The other side of an availability management solution is the possibility of continuous operations, which ensure that applications and data are always available to users, even when IT has to take a server/system down to work on it. Continuous-operations solutions also allow for transparent database reorganization, file updates, workload balancing, application performance management, and transparent hardware and operating system maintenance.

How Important Is Availability?

Consider these facts. In a 1999 International Data Corporation (IDC) survey, 98 percent of respondents stated that unscheduled system downtime significantly affected their business. Some 59 percent said that they could not conduct business, or could not afford the lost revenue, during unscheduled downtime.

According to IBM Global Services Business Recovery Services, less than 7 percent of the more than 600,000 AS/400 business computing systems installed worldwide are protected from disaster with an effective backup/recovery plan.


                 Downtime
Company Type     5 Days    10 Days   20 Days   30 Days
Service          11%       18%       26%       33.3%
Insurance        28%       33.5%     40%       45.5%
Financial        38%       42%       46%       49%
Manufacturing    18%       25%       35%       40%
Utilities        35.5%     44%       50%       54%
Hospitals        31.9%     40%       46%       52%

Figure 1: This table of impact percentages shows that the longer and more frequently system outages occur, the more frustrated customers become.

