High-availability solutions for the AS/400 are more than just disaster recovery solutions: They also provide the necessary tools and interfaces to accommodate planned downtime. Planned downtime represents more than 90 percent of all outages in organizations and is an integral part of most businesses. Planned outages occur during activities that require businesses to lock the primary AS/400 to perform database reorganizations, execute tape backups, generate reports, and perform hardware or software upgrades. Typically, these day-to-day activities require that the primary system be unavailable for hours or even days at a time. Due to the effects of globalization, consolidation of businesses, and the Internet, firms simply can no longer afford extended periods of downtime and must therefore become highly available to accommodate the new 24/7 business economy.
Historically, businesses have carefully planned their downtime around windows of low systems activity. While this may have achieved acceptably low user impact in the past, the Internet economy has had the effect of shrinking downtime windows to a point where previous levels are now unacceptable. To further complicate the matter, mergers and acquisitions have increased the sheer magnitude of data that companies must process, resulting in even longer periods of planned downtime.
Thankfully, there is a solution. You can employ various advanced techniques with AS/400 high-availability software to help facilitate crucial business activities using quiesce point user-defined programs, also known in some high-availability software solutions as checkpoint user exits.
What Is a Quiesce Point?
In order to facilitate processing, reporting, and other business functions on the recovery system, most businesses must strategically specify the exact instant when a process is to occur. For example, to calculate invoices for a billing run at the end of every month, a wireless phone service company may require report generation precisely after the last transaction that occurred before the stroke of midnight. The specification of when to execute the billing run is made on the primary system. The actual process, however, occurs on the recovery system to minimize resource utilization on the primary system. This requires the recovery system to execute its corresponding process at exactly the same point in time, from a business perspective, as the primary system. This synchronization point is known as a quiesce point. A quiesce point can have many names, but the concept is the same regardless of what it is called.
What Are Quiesce Point User Exits?
Quiesce point user exits are employed in one form or another in all AS/400 high-availability offerings. These user exits allow user-defined programs and routines to execute at quiesce points so the administrator can execute user-defined functions at specific points in the transaction life cycle of the database. Quiesce point user exits ensure that all required transactions are present on the recovery system before an application such as a billing run is executed. In essence, quiesce point user exits can enable the recovery system to execute the billing run when it reaches the quiesce point. These checkpoints can therefore be used in a variety of situations when you must execute programs at precise points in business time.
How Do Quiesce Point User Exits Work?
Quiesce point user exits are created on the source system using a specialized command provided by your AS/400 high-availability solution. Depending on the solution you've chosen, the actual implementation may be different, but the results should be the same.
Basically, commands to create a quiesce point immediately place checkpoint entries in journals so users can define synchronization points when journal entries are scraped, received, and applied in the source and target environments. You can then invoke a user exit program when group jobs for a target arrive at each checkpoint entry.
Synchronization is achieved at a quiesce point once a checkpoint entry is defined in the source environment and reaches the target environment. You can then call a user exit program to perform some operation with the user-defined data.
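The idea of a checkpoint entry traveling through the journal can be illustrated with a minimal sketch. It is not the interface of any particular high-availability product: the `JournalEntry` type, the `apply_journal` function, and the sample data are all hypothetical names chosen for illustration, and the journal is modeled as a simple in-memory list.

```python
from dataclasses import dataclass


@dataclass
class JournalEntry:
    kind: str     # "TXN" for a database change, "CHECKPOINT" for a quiesce point
    payload: str  # transaction data, or the user-defined data of a checkpoint


def apply_journal(entries, database, user_exit):
    """Apply entries in order and call user_exit at each checkpoint entry."""
    for entry in entries:
        if entry.kind == "CHECKPOINT":
            # Everything journaled before the checkpoint is already applied,
            # so the exit program sees the database as of the quiesce point.
            user_exit(entry.payload, list(database))
        else:
            database.append(entry.payload)


# Hypothetical journal: the checkpoint was placed on the source system
# after the last transaction of the month.
journal = [
    JournalEntry("TXN", "call-1001"),
    JournalEntry("TXN", "call-1002"),
    JournalEntry("CHECKPOINT", "month-end-billing"),
    JournalEntry("TXN", "call-1003"),
]

recovery_db = []
billing_runs = []


def billing_exit(user_data, snapshot):
    # Stand-in for a user exit program such as a billing run.
    billing_runs.append((user_data, snapshot))


apply_journal(journal, recovery_db, billing_exit)
print(billing_runs)
```

In this sketch the billing exit sees only the two transactions that preceded the checkpoint, even though a third transaction follows it in the journal and is applied afterward.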
It is important to note that a group job for a target will process transactions until it reaches the checkpoint entry that is generated by this command. At this point, the job waits for other specified group jobs to reach this checkpoint entry. Synchronization is achieved when all active group jobs reach the checkpoint entry. This means that one or more jobs will wait at the checkpoint entry until all group jobs reach the quiesce point.
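The waiting behavior described above resembles a barrier. The following sketch models each group job as a thread and uses Python's standard `threading.Barrier` to hold jobs at the checkpoint entry until all of them arrive. The job names, journal contents, and the `run_group_jobs` helper are assumptions made for illustration; a real product implements this inside its own apply jobs.

```python
import threading

CHECKPOINT = "CHECKPOINT"


def run_group_jobs(journals, user_exit):
    """Run one apply thread per group job; call user_exit once all reach the checkpoint."""
    applied = {name: [] for name in journals}

    # The barrier action runs exactly once, in a single thread, after every
    # group job has arrived and before any of them is released.
    barrier = threading.Barrier(len(journals), action=lambda: user_exit(applied))

    def group_job(name, entries):
        for entry in entries:
            if entry == CHECKPOINT:
                barrier.wait()  # wait here for the other group jobs
            else:
                applied[name].append(entry)

    threads = [
        threading.Thread(target=group_job, args=(name, entries))
        for name, entries in journals.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return applied


snapshots = []


def record_state(state):
    # Stand-in for a user exit program: copy what each job had applied so far.
    snapshots.append({name: list(items) for name, items in state.items()})


final = run_group_jobs(
    {
        "orders": ["o1", "o2", CHECKPOINT, "o3"],
        "payments": ["p1", CHECKPOINT, "p2", "p3"],
    },
    record_state,
)
print(snapshots)
```

Whichever job reaches its checkpoint entry first simply blocks, so the exit program always observes both jobs exactly at the quiesce point, regardless of how the threads are scheduled.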
By using quiesce point user exits, AS/400 high-availability solution customers can reduce or eliminate planned downtime because some operations that were previously executed on the production system can now be administered by the recovery system. In doing so, more of the production system's resources are free to service customers or employees while the recovery system can perform back-office tasks including active database reorganizations, remote tape backups, and workload distribution.
Active Database Reorganizations
Most organizations without a high-availability solution will periodically have to perform database reorganizations to remove deleted records and reclaim storage space. Other organizations have batch processing programs that actually update the databases on the recovery system and therefore need to promote the database back onto the primary system. This usually entails a significant amount of downtime that you must schedule during periods of low systems activity. Again, shrinking downtime windows and 24/7 access requirements make planned downtime activities such as database reorganizations much more difficult in today's business environment.
Using quiesce point user exit programs, some AS/400 high-availability offerings can achieve database reorganizations while maintaining a live production database. This technique involves several steps using a high-availability product:
1. As shown in Figure 1, a virtual copy of the database is created at a quiesce point on both systems using a user exit.
2. Database reorganization is performed on the virtual database copy on the primary AS/400 and recovery systems. During this time, the primary system is live and continues mirroring data and objects to the recovery system.
3. As Figure 2 shows, after the reorganization is complete, the transactions journaled during the reorganization are applied to the newly reorganized virtual databases using local loopback replication.
4. When the virtual databases are fully applied, the production database must be locked temporarily while the virtual and production databases' file names are switched.
5. As Figure 3 shows, mirroring resumes on the newly reorganized database that was previously the virtual copy.
During the active reorganization, the production database remains fully operational and accessible by users. The database must be locked out only for a brief moment during the switching or renaming sequence. This saves hours, if not days, of downtime and opportunity costs.
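The five steps can be walked through with a toy model. In this sketch a file is a Python list in which `None` marks a deleted record, and the file names, the `_V` suffix for the virtual copy, and the `active_reorg` function are all hypothetical. It illustrates only the ordering of the steps under those assumptions, not how a given product performs them.

```python
def apply_change(records, op, value):
    """Apply one journaled change; a delete leaves an empty slot (None)."""
    if op == "insert":
        records.append(value)
    elif op == "delete":
        records[records.index(value)] = None
    else:
        raise ValueError(f"unknown operation: {op}")


def active_reorg(files, name, changes_during_reorg):
    virtual = name + "_V"

    # Step 1: create a virtual copy of the file at the quiesce point.
    files[virtual] = list(files[name])

    # Step 2: reorganize the copy by dropping deleted-record slots.
    files[virtual] = [r for r in files[virtual] if r is not None]

    # Production stays live during the reorganization and keeps changing.
    for op, value in changes_during_reorg:
        apply_change(files[name], op, value)

    # Step 3: replay the changes journaled during the reorganization
    # onto the reorganized copy.
    for op, value in changes_during_reorg:
        apply_change(files[virtual], op, value)

    # Step 4: the only moment that needs a lock is the swap of the names.
    files[name], files[virtual] = files[virtual], files[name]

    # Step 5: mirroring would now resume on the reorganized file.
    return files


files = {"ORDERS": ["a", None, "b", None, "c"]}
active_reorg(files, "ORDERS", [("insert", "d"), ("delete", "a")])
print(files["ORDERS"])
```

After the swap, the file named `ORDERS` holds the same live records as the old production file but in four slots instead of six.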
Remote Tape Backups
Organizations that have not adopted formal high-availability solutions most often back critical data to tape. The more mission-critical the data, the more often you must back it up. Typically, businesses that perform tape backups plan the execution of the backup process during periods of low systems activity because the production databases must be locked during the backup procedure. If the need to restore these backups should ever arise, you must lock the production system again, often for days, while the restoration process is occurring.
Most businesses that employ an AS/400 high-availability solution still perform tape backups for an extra level of protection and redundancy of data. You can perform these tape backups on the recovery system, thereby freeing resources on the primary system for production tasks and avoiding the expense of transporting the tapes to secure geographical areas.
In most businesses, tape backup periods occur at specific business quiesce points. A brokerage house may need to back up its accounts after the last trade at exactly 4:00 p.m., but also before after-hours trading begins. A user exit can execute at that particular point on the backup system as soon as it reaches the same quiesce point that was set up on the primary system.
All AS/400 high-availability solutions offer a data store of some kind on the recovery system where transactions are queued before they are applied to the recovery system's database. This is usually called the staging store, and it allows transactions to accumulate while the database apply occurs. The staging store also has the unique feature of allowing the administrator to halt the apply process, while still offering full mirroring protection. When the apply process halts, the transactions sent from the primary system accumulate in the staging store, and the database can safely be locked by the tape backup processes. After the tape backup is complete, the apply process starts again, and the staging store is drained of all the transactions that accumulated during the tape backup procedure.
Depending on the performance of the high-availability solution you're using as well as the number of transactions accumulated during the tape backup, the time it takes to apply the staged transactions varies. In many business situations, millions of transactions may have accumulated in the staging store during the backup process. If the high-availability solution features high-performance cache technology, it can accelerate the staging store draining process to a point where the transactions can be applied to the recovery database in a much shorter time.
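The halt, accumulate, and drain cycle of a staging store can be sketched in a few lines. The `StagingStore` class and its method names are hypothetical, the store is an in-memory queue, and the "tape" is just a Python list. The point is only to show that mirroring keeps receiving transactions while the apply process is halted.

```python
from collections import deque


class StagingStore:
    """Toy staging store: transactions are queued, then applied to the database."""

    def __init__(self):
        self.queue = deque()
        self.database = []
        self.apply_active = True

    def receive(self, txn):
        # Mirroring never stops: every transaction is staged first.
        self.queue.append(txn)
        if self.apply_active:
            self.drain()

    def halt_apply(self):
        self.apply_active = False

    def resume_apply(self):
        self.apply_active = True
        return self.drain()

    def drain(self):
        """Apply everything staged so far; return how many entries were applied."""
        count = 0
        while self.queue:
            self.database.append(self.queue.popleft())
            count += 1
        return count


store = StagingStore()
store.receive("t1")
store.receive("t2")

store.halt_apply()           # quiesce point reached: freeze the database
tape = list(store.database)  # stand-in for the tape backup of a static database
store.receive("t3")          # these arrive while the backup is running
store.receive("t4")
staged_during_backup = len(store.queue)

drained = store.resume_apply()  # backup finished: drain the staging store
print(tape, staged_during_backup, drained, store.database)
```

The backup captures the database exactly as of the quiesce point, and the transactions that arrived in the meantime are applied as soon as the apply process resumes.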
Using a quiesce point user exit, the whole remote backup procedure can be automated, ensuring that the backup runs on the database at the exact point in business time that the user specifies on the source system: the quiesce point.
Workload Distribution
Because reporting takes up resources from the primary system, organizations without high-availability implementations typically do their billing runs and statistical analyses during off-peak times. High-availability solutions enable the recovery system to take on this activity, thereby relieving the primary system and distributing the workload between two or more systems. Additionally, you can accomplish read-only queries of the database on the recovery system, allowing the primary system to accommodate higher priority queries or transactions that require inserts, updates, or deletes to the database. If the database administrator needs to run automated reporting procedures that perform read-only queries to the database, user exits let the administrator run them at specified quiesce points. By allowing the recovery system to take away some of the workload from the primary system, the user sees no impact on primary system performance during the reporting process.
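As a rough illustration of this division of labor, the sketch below routes statements by their leading SQL verb: read-only queries go to the recovery system, and everything else goes to the primary system. The `route` function and the system labels are hypothetical, and a real deployment would typically make this decision in its connection or job configuration rather than by parsing statements.

```python
def route(statement):
    """Return which system should run a statement, based on its first keyword."""
    words = statement.split()
    if not words:
        raise ValueError("empty statement")
    # Only plain queries are treated as read-only; inserts, updates, and
    # deletes must run on the primary system so they are mirrored.
    return "recovery" if words[0].upper() == "SELECT" else "primary"


for stmt in (
    "SELECT customer, SUM(amount) FROM invoices GROUP BY customer",
    "UPDATE accounts SET balance = balance - 10 WHERE id = 7",
    "insert into calls values (1003)",
):
    print(route(stmt), "<-", stmt)
```

Because the recovery database is only as current as the last applied transaction, reports that need a precise cutoff are better triggered from a quiesce point user exit than run at an arbitrary time.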
Avoid the Time Warp
In the new economy, data is a priceless commodity. It can be created; it can be collected. It can be distributed to customers; it can be utilized to understand customers and market trends. It provides businesses with a look into the inner workings of a company. Used properly, data can help generate a competitive advantage. But like many commodities, its availability cannot always be guaranteed.
Data backup and recovery solutions are necessities for organizations wishing to survive in today's global market. Many of today's businesses employ high-availability solutions based on this understanding, yet some still seem stuck in a time warp when it comes to leveraging the full power of their high-availability software. Why accept burdensome, lengthy delays to data availability when quiesce point user exits can eliminate or significantly reduce planned downtime activities? If you are making an investment in high availability, you should ensure that your business utilizes its solutions to maximum effect.
The flexibility of high-availability solutions that use quiesce point user exits gives organizations another reason to focus less on the technology driving their mission-critical data and more on the data itself. By staying informed of current trends in high-availability solutions, businesses can avoid the pitfalls of once-cumbersome downtime, gain a competitive edge in their industry, and remain open for business regardless of the time of day.
Figure 1: Active database reorganizations using high-availability software involve a multistep process. (Diagram, virtual copy creation: databases A1 and A2 on Primary System A, B1 and B2 on Recovery System B; real-time mirroring from A1 to B1 continues.)
Figure 2: Apply transactions back to the primary system using local loopback for high availability. (Diagram, apply transactions recorded during reorganization: System A remains live and continues to mirror data/objects from A1 to B1.)
Figure 3: Mirroring begins after all other processes are completed. (Diagram, rename and start mirroring reorganized databases: mirroring of the reorganized databases resumes after the file swap is completed.)