Protection is the central purpose of any technology that facilitates computing system resiliency, and nothing offers more comfort in times of desperation than a mirrored system sitting in a bunker, live screen at the ready, prepared to take on production operations.
Over the course of two decades, HA has made a significant imprint on IT. Its role, however, has been rather narrowly defined in the minds of most people. Interestingly, over time, HA technology has been pressed into service for "outside the box" tasks, not just business continuation. This flexibility is largely made possible by the convergence of the many technologies that HA depends upon, including hardware, software, and network bandwidth.
In some cases, the architects of commercially available HA packages have created new solutions specifically to handle these ancillary tasks, but all of these spin-off systems use the tenets of HA at their core.
Evolving business requirements have pushed the need for uninterrupted system availability to levels that previously seemed unapproachable, given the complexity of computing technologies. System users—who now include customers as well as upstream and downstream supply chain partners—are notoriously impatient and unforgiving. And, in the age of SOA, where computers interact directly with each other, Web services need to be online 100 percent of the time.
Given this fact, one needs to consider all of the circumstances that can render a system inaccessible. Statistics show that users are more often inconvenienced by planned system downtime than by hardware, software, or network failures. Other uses for HA resources have evolved to address these factors.
Despite experience and sound planning, the process of upgrading to a new software release level can be an inexact exercise. Unexpected problems can hurt short-term productivity and profitability. While the likelihood of encountering a catastrophic problem when rolling your system up to a new release level is relatively slim, all of the small nagging problems can add up. In some situations, when core business applications don't work, and returning to a previous software level is not possible because some data structures have been changed, users can languish for hours.
HA technology facilitates a smooth transition here by allowing a software upgrade to be evaluated in a "dry run" on the mirrored, backup environment before it is rolled out to the production environment. Any problems or conflicts can then be resolved without the risk of downtime or data loss in a live business environment. In fact, many businesses that use HA find that the ability to evaluate upgrades and test all manner of changes and modifications before rolling them out to the production environment is one of the key benefits of this technology.
Testing new software is a central function of software development that regrettably gets neglected. New code is rarely defect-free, and all too frequently, it is the user who finds problems. Time-to-market pressures coupled with the difficulty of containing the cost of new development are two factors that are placing greater pressure on developers to get new software off of the test bench and into production.
In days past, third-party software vendors issued new releases of their code on a relatively infrequent basis. After 10 years, some products had only advanced to version 3.x. Now, vendors release upgrades much more frequently because of market pressure and interoperability issues. Since very few IT shops run major third-party applications that are entirely unmolested, most of us have work to do on our own modifications before we can load a third-party upgrade. Once initial programming is complete, these modifications need to be integrated with the vendor's package and tested.
System i HA tools can be used to supply excellent data for testing. First, test data must contain a representative sample of production data. A test database is typically much smaller than your production database but must be identical with respect to tables, indexes, and constraints. When test data accurately represents production data, you can be confident that a report will run the same way in both environments.
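The structural-parity requirement can be checked mechanically. The sketch below uses SQLite purely as a stand-in database (the table, column, and index names are made up); the idea of comparing catalog metadata between the production and test schemas is generic, not a feature of any particular HA product:

```python
import sqlite3

def schema_signature(conn):
    """Collect (table, column, type, constraint) tuples plus index names
    so two databases can be compared structurally, not by content."""
    sig = set()
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table'")]
    for t in tables:
        for cid, name, ctype, notnull, dflt, pk in conn.execute(f"PRAGMA table_info({t})"):
            sig.add((t, name, ctype, notnull, pk))
        for row in conn.execute(f"PRAGMA index_list({t})"):
            sig.add((t, "index", row[1]))  # row[1] is the index name
    return sig

# Hypothetical production and test schemas
prod = sqlite3.connect(":memory:")
prod.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL NOT NULL)")
prod.execute("CREATE INDEX idx_amount ON orders(amount)")

test = sqlite3.connect(":memory:")
test.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL NOT NULL)")
test.execute("CREATE INDEX idx_amount ON orders(amount)")

# Same tables, columns, constraints, and indexes: structurally identical
assert schema_signature(prod) == schema_signature(test)
```

The test database can hold a fraction of the rows and still pass this check, which is exactly the point: structure identical, volume reduced.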
One of the more tedious and time-consuming aspects of maintaining the viability of test data is ensuring its freshness. Software testing changes the data content. As structural and data changes are made, test data loses its usefulness.
When HA technology is used to supply data for testing, data is always fresh, abundant, and relevant. It eliminates the need to continually extract test data, use stale data for testing, or, heaven forbid, test against a live production database.
This type of environment incorporates the use of one System i machine to serve as the primary production machine and another partitioned system to serve as the backup and test machine. A third, smaller system can be used as a dedicated development and test box if such resources are available.
Once the hardware, application software, and communications environment are stable, HA can be used to replicate for business resiliency between the production machine and the backup, and, for the purposes of testing, between the production machine and the development/test system. To do this, your HA solution must support "one-to-many" replication.
To minimize impact on the network, you can replicate to the development/test machine only data that is necessary to test the applications under development. A smaller volume of data speeds up queries and allows testers to move through their scripts more quickly.
Once an adequate amount and variety of data resides on the development/test machine, you can suspend replication and commence with your manual or automated QA/QC protocol.
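The selective replication described above is configured inside the HA product itself; as a language-neutral sketch, the filtering idea looks something like this (the table names, record layout, and predicate are all hypothetical):

```python
def make_filter(tables, predicate):
    """Return a function that decides whether a journaled change
    should be replicated to the development/test machine."""
    def should_replicate(change):
        # change is a dict like {"table": ..., "row": {...}}
        if change["table"] not in tables:
            return False
        return predicate(change["row"])
    return should_replicate

# Replicate only recent ORDERS activity to keep the test data set small.
dev_filter = make_filter(
    tables={"ORDERS", "ORDER_LINES"},
    predicate=lambda row: row.get("year", 0) >= 2006,
)

changes = [
    {"table": "ORDERS",  "row": {"id": 1, "year": 2006}},
    {"table": "ORDERS",  "row": {"id": 2, "year": 1999}},
    {"table": "PAYROLL", "row": {"id": 3, "year": 2006}},
]
replicated = [c for c in changes if dev_filter(c)]
assert [c["row"]["id"] for c in replicated] == [1]
```

Only the first change passes: the second is too old, and the third touches a table excluded from the dev/test stream, which is how the network impact stays small.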
Normally, you'll find your test data to be in bad shape once testing is completed. In ordinary cases, these data files would be deleted and replaced with fresh ones, but in instances where high availability is used, a resync can restore the integrity and freshness of the test data sets. If the HA product involved has self-healing capabilities, a resync is not necessary because this type of technology will compare production data files against those on the test machine and repair them automatically.
In either case, data files on the test machine will resync with the last transactions committed on the production machine, and more importantly, there is absolutely no downtime exposure because the replication to the backup machine is never hindered.
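The self-healing behavior can be sketched generically: checksum each file on the two machines and recopy only what differs. The file names and record layout below are illustrative, not the mechanism of any specific HA product:

```python
import hashlib

def digest(records):
    """Checksum a file's records so two copies can be compared cheaply."""
    h = hashlib.sha256()
    for rec in records:
        h.update(repr(rec).encode())
    return h.hexdigest()

def heal(production, test_copy):
    """Repair test files that drifted during testing, file by file."""
    repaired = []
    for name, prod_records in production.items():
        if digest(test_copy.get(name, [])) != digest(prod_records):
            test_copy[name] = list(prod_records)  # recopy the whole file
            repaired.append(name)
    return repaired

production = {"CUSTMAST": [(1, "ACME"), (2, "GLOBEX")], "ORDHIST": [(10, 99.5)]}
test_copy  = {"CUSTMAST": [(1, "ACME"), (2, "ZZZZ")],  "ORDHIST": [(10, 99.5)]}

assert heal(production, test_copy) == ["CUSTMAST"]  # only the drifted file is recopied
assert test_copy == production
```

Because only mismatched files are touched, the repair is cheap relative to a full extract, and the production-to-backup replication stream is never involved.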
While HA systems can greatly simplify the extraction and maintenance of test data, they cannot make data safe for testing. Test data can contain sensitive information like social security numbers or credit card and bank account numbers. It can also directly reference a specific person's health status. On a daily basis, there are dozens of accounts in the news of how this type of information was mishandled or stolen. Sensitive data that is used for testing must be scrambled to protect the identities of people and companies yet still maintain referential integrity. Many tools on the market today can scramble account numbers while maintaining the data's viability in a test environment.
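One common scrambling approach is deterministic pseudonymization: each sensitive value is replaced with a keyed hash, so the same input always maps to the same output and foreign-key relationships still line up across tables. A minimal sketch, with a made-up key and field names:

```python
import hmac, hashlib

SECRET_KEY = b"test-env-only"  # hypothetical key; never reuse a production secret

def scramble(value, width=9):
    """Deterministically map a sensitive value to a fake numeric ID.
    Identical inputs yield identical outputs, so joins still work."""
    mac = hmac.new(SECRET_KEY, str(value).encode(), hashlib.sha256)
    return int(mac.hexdigest(), 16) % (10 ** width)

customers = [{"ssn": "123-45-6789", "name": "J. Doe"}]
orders    = [{"ssn": "123-45-6789", "total": 42.0}]

for row in customers + orders:
    row["ssn"] = scramble(row["ssn"])

# The scrambled SSN still joins customers to orders,
# but the real number is gone from the test environment.
assert customers[0]["ssn"] == orders[0]["ssn"]
```

Commercial scrambling tools add format preservation (valid-looking card numbers, checksums), but the referential-integrity property rests on the same determinism shown here.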
Upgrade, Migrate, and Consolidate
Despite rational man's tendency to lean toward denial, every System i shop must periodically upgrade its hardware. By any measure, this type of system maintenance takes time. Typically, a full system swap takes up to 24 hours even when everything goes as planned, and things rarely do, mainly because the specific knowledge needed to plan and execute a complicated migration project—and foretell its pitfalls—rarely exists.
HA can be used to substantially shrink the window of time needed for system upgrades and migrations and lessen the potential for disruption by synchronizing parallel systems during hardware upgrades. Users can work on one system while IT administrators perform tests, audits, and day-end processes on the new machine to verify system integrity prior to going live.
For this to work, you must first have a mirrored image of the production environment on another system. This second system can be the backup machine for your HA environment, or it can be one supplied on a temporary basis by a hardware or HA vendor. Production system users are switched from the primary production box to the backup system prior to taking the primary system offline. Obviously, replication must be suspended until the new machine comes online.
Once the primary production hardware swap has taken place and the application environment has been reestablished and tested, the database can be updated by using the HA tool to resync all changes to the new system. In this scenario users experience downtime only when they are switched from one machine to the other.
It's worth noting that all of these processes can take place during the day while people are banging away at their desktops and customer requests are being satisfied. When maintenance tasks are done during the day instead of at night, a full cadre of technicians is available to help troubleshoot problems, not to mention the business benefits of keeping revenue-producing operations proceeding at full pace during the upgrade.
Here is an example of how one large hospital in California recently used HA for a hardware upgrade. Florencio Alcocer, Senior System Engineer at Adventist Health in Roseville, California, calculated the amount of downtime users would face as they migrated from an AS/400 Model 730 to a System i Model 570 and realized that users could be offline for up to 20 hours. Because Adventist Health receives patients 24x7 and cannot withstand 20 hours of system downtime, Alcocer used high availability as described above to facilitate the migration. "We cut our downtime to a fraction of what it would have otherwise been and eliminated some staff scheduling problems," he says.
Server Load Balancing
Server Load Balancing (SLB) is a way to share work between several computers and computer resources. The goal of SLB is to achieve optimal resource utilization and faster response times. Since load balancing increases the use of interconnected resources, it places an additional burden on the network.
High availability tools can be used to balance a heavy system load between two or more System i servers to accommodate SLB. In fact, HA and SLB go hand in hand because, by definition, they both offer redundant resources to satisfy varying computing requirements.
If an HA software solution is used to accommodate SLB, no dynamic network-based switching device is needed to move user loads from one server to another. Instead, user loads are statically balanced, meaning that the backup machine is used to handle specific tasks in a static manner. For example, you can assign interactive processes to the primary production machine while end-of-day batch processing, reports, or development tasks can be tasked to the backup machine. To accomplish this, replication must be bi-directional.
Boost Business Productivity, Profitability
A high availability solution can also uncover and exploit unused value in your business processes and throughout your organization. Because a high availability solution minimizes or completely eliminates planned downtime, it immediately boosts productivity, raises efficiency, and makes an organization more resilient and responsive.
Many organizations have unused opportunities to maximize uptime across the full range of the business and IT infrastructure, including front and back offices, go-to-market processes, partner and channel operations, product development, and information exchange and collaboration.
Start by identifying opportunities where currently unused time can be leveraged for higher productivity, revenue growth, profitability, competitive advantage, and information sharing. Ask IT staff and line managers to identify where planned downtime can interrupt or prevent operations and other business processes from achieving their goals.
It's also likely that planned downtime may be adding hidden costs to productivity and the bottom line across a variety of functions including go-to-market strategies, supply chain management, ERP applications, logistics, collaboration, after-market sales and service, channels, customer analysis, service-level agreements, and off-shoring and outsourcing operations.
More Than You Thought It Was
High availability solutions should be viewed as part of the vital infrastructure that supports all your operations as well as providing a safety net against unexpected IT downtime. In other words, rather than just helping you get out of trouble when things go wrong, a high availability solution can streamline and accelerate your technical, operational, and business processes in the best of times.
Bill Hammond directs Vision Solutions' worldwide product marketing efforts and is responsible for marketing strategy, product branding and messaging, and marketplace and competitive intelligence. He has over 15 years of experience in product marketing, product management, and product development roles in the software industry.