Backup and Recovery Strategies Update for IBM i


There are many ways to back up your system and your data. What's important is that you actually choose one and use it.


The most valuable component of any computer system isn't the hardware or software that runs the computer but, rather, the data that resides on the system. If a system failure or disaster occurs, you can replace the computer hardware and software that runs your business. Your company's data, however, is irreplaceable. For this reason, it's critical to have a good backup and recovery strategy. Companies go out of business when their data can't be recovered.


What should you be backing up? The simple answer to this question is that you should back up everything. A basic rule of backup and recovery is that if you don't save it, it doesn't get restored. However, you may have some non-critical data (e.g., test data) on your system that doesn't need to be restored and can be omitted from your backup.


When and how often do you need to back up? Ideally, saving your entire system every night is the simplest and safest backup strategy. This approach also gives you the simplest and safest strategy for recovery. Realistically, though, when and how you run your backup, as well as what you back up, depend on the size of your backup window—the amount of time your system can be unavailable to users while you perform a backup. To simplify recovery, you need to back up when your system is at a known point and your data isn't changing.


When you design a backup strategy, you need to balance the time it takes to save your data with the value of the data you might lose and the amount of time it may take to recover. Always keep your recovery strategy in mind as you design your backup strategy.


If your system is so critical to your business that you don't have a manageable backup window, you probably can't afford an unscheduled outage either. If this is your situation, you should seriously evaluate the availability options for IBM i, including dual systems.

Designing and Implementing a Backup Strategy

You should design your backup strategy based on the size of your backup window. At the same time you design your backup strategy, you should also design your recovery strategy to ensure that your backup strategy meets your system recovery needs. The final step in designing a backup strategy is to test a full system recovery. This is the only way to verify that you've designed a good backup strategy that will meet your system recovery needs. Your business may depend on your ability to recover your system. You should test your recovery strategy at your recovery services provider's location.


When designing your backup and recovery strategy, think of it as a puzzle: the fewer pieces the puzzle has, the more quickly you can put it together. Likewise, the fewer pieces your backup strategy has, the more quickly you can recover.


Your backup strategy will typically be one of three types:


  • Simple—You have a large backup window, such as an 8- to 12-hour block of time available daily with no system activity.
  • Medium—You have a medium backup window, such as a 4- to 6-hour block of time available daily with no system activity.
  • Complex—You have a short backup window, with little or no time of system inactivity.


A simple way to ensure you have a good backup of your system is to use the options provided on the Save menu, which you reach with the GO SAVE command. These options make it very easy to back up your entire system or to split your entire system backup into two parts: system data and user data. In the following discussion of backup strategies, the menu options I refer to are from the Save menu.


Implementing a Simple Backup Strategy


The simplest backup strategy is to save everything daily whenever there is no system activity. You can use Save menu option 21 (Entire system) to completely back up your system. You also have the option to back up your spooled file data. Simply change the Spooled file data parameter to *ALL to save the spooled file data for the output queues that are saved. You should also consider using this option to back up the entire system after installing a new release, applying Program Temporary Fixes (PTFs), or installing a new licensed program product. As an alternative, you can use Save menu option 22 (System data only) to save just the system data after applying PTFs or installing a new licensed program product.


Option 21 offers the significant advantage that you can schedule the backup to run unattended (with no operator intervention). Keep in mind that unattended save operations require you to have a tape device capable of holding all your data.


Even if you don't have enough time or enough tape-device capability to perform an unattended save using option 21, you can still implement a simple backup strategy:


  • Daily backup—Back up only user data that changes frequently.
  • Weekly backup—Back up the entire system.


A simple backup strategy may also involve Save menu option 23 (All user data). This option saves user data that can change frequently. You can also schedule option 23 to run without operator intervention.


If your system has a long period of inactivity on weekends, your backup strategy might look like this:


Friday night:     Entire system (option 21)
Monday night:     All user data (option 23)
Tuesday night:    All user data (option 23)
Wednesday night:  All user data (option 23)
Thursday night:   All user data (option 23)
Friday night:     Entire system (option 21)


Implementing a Medium Backup Strategy


You may not have a large enough backup window to implement a simple backup strategy. For example, you may have large batch jobs that take a long time to run at night or a considerable amount of data that takes a long time to back up. If this is your situation, you'll need to implement a backup and recovery strategy of medium complexity.


When developing a medium backup strategy, keep in mind that the more often your data changes, the more often you need to back it up. You'll therefore need to evaluate in detail how often your data changes.


Several methods are available to you in developing a medium backup strategy:


  • Saving changed objects
  • Journaling objects and saving the journal receivers
  • Saving groups of user libraries, folders, or directories


You can use one or a combination of these methods.


Saving Changed Objects


Several commands let you save only the data that has changed since your last save operation or since a particular date and time.


You can use the Save Changed Objects (SAVCHGOBJ) command to save only those objects that have changed since a library or group of libraries was last saved or since a particular date and time. This approach can be useful if you have a system environment in which program objects and data files exist in the same library. Typically, data files change very frequently, while program objects change infrequently. Using the SAVCHGOBJ command, you can save just the data files that have changed.


The Save Document Library Objects (SAVDLO) command lets you save documents and folders that have changed since the last save or since a particular date and time.


You can use the Save (SAV) command to save only those objects in directories that have changed since the last save or since a particular date or time.


You can also choose to save only your changed data, using a combination of the SAVCHGOBJ, SAVDLO, and SAV commands, if the batch workload on your system is heavier on specific days of the week. For example:


Day/Time             Batch Workload   Save Operation
Friday night         Light            Entire system (option 21)
Monday night         Heavy            Changed data only*
Tuesday night        Light            All user data (option 23)
Wednesday night      Heavy            Changed data only*
Thursday night       Heavy            Changed data only*
Friday night         Light            Entire system (option 21)

* Use a combination of the SAVCHGOBJ, SAVDLO, and SAV commands.
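
As a rough sketch of the changed-data save marked with the asterisk above, the three commands could be run one after another. TAP01 is a hypothetical tape device name, and the parameter values are assumptions chosen for illustration, so adjust them to your own environment.

```cl
/* Sketch only: TAP01 is a hypothetical tape device name.       */
/* Save objects in user libraries that changed since each       */
/* library was last saved with SAVLIB.                          */
SAVCHGOBJ OBJ(*ALL) LIB(*ALLUSR) DEV(TAP01) +
          REFDATE(*SAVLIB) ENDOPT(*LEAVE)

/* Save documents and folders changed since the last complete   */
/* save.                                                        */
SAVDLO DLO(*CHG) DEV(TAP01) ENDOPT(*LEAVE)

/* Save directory objects changed since they were last saved    */
/* with UPDHST(*YES).                                           */
SAV DEV('/QSYS.LIB/TAP01.DEVD') +
    OBJ(('/*') ('/QSYS.LIB' *OMIT) ('/QDLS' *OMIT)) +
    CHGPERIOD(*LASTSAVE) UPDHST(*YES)
```

ENDOPT(*LEAVE) on the first two commands simply avoids rewinding the tape between saves. Specifying UPDHST(*YES) on the SAV command matters because CHGPERIOD(*LASTSAVE) relies on the save history being updated.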


Journaling Objects and Saving the Journal Receivers


If your save operations take too long because your files are large, saving changed objects may not help in your system environment. For instance, if you have a file member with 100,000 records and one record changes, the SAVCHGOBJ command saves the entire file member. In this environment, journaling your database files and saving the journal receivers regularly may be a better solution. However, keep in mind that this approach will make your recovery more complex.


When you journal a database file, the system writes a copy of every changed record to a journal receiver. When you save a journal receiver, you're saving only the changed records in the file, not the entire file.
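
To make this concrete, here is a minimal sketch of setting up journaling for one file. The library, journal, receiver, and file names (JRNRCVLIB, APPLIB, APPJRN, APPRCV0001, ORDERS) are all hypothetical, and the receiver is placed in its own library only as an illustration.

```cl
/* Sketch only: all object names below are hypothetical.        */
/* Create a journal receiver and a journal attached to it.      */
CRTJRNRCV JRNRCV(JRNRCVLIB/APPRCV0001)
CRTJRN JRN(APPLIB/APPJRN) JRNRCV(JRNRCVLIB/APPRCV0001)

/* Start journaling the physical file; after-images of changed  */
/* records are written to the attached receiver.                */
STRJRNPF FILE(APPLIB/ORDERS) JRN(APPLIB/APPJRN) IMAGES(*AFTER)
```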


If you journal your database files and have a batch workload that varies, your backup strategy might look like this:


Day/Time             Batch Workload   Save Operation
Friday night         Light            Entire system (option 21)
Monday night         Heavy            Journal receivers only
Tuesday night        Light            All user data (option 23)
Wednesday night      Heavy            Journal receivers only
Thursday night       Heavy            Journal receivers only
Friday night         Light            Entire system (option 21)


To take full advantage of journaling protection, you should detach and save the journal receivers regularly. The frequency with which you save the journal receivers depends on the number of journaled changes that occur on your system. Saving the journal receivers several times during the day may be appropriate for your system environment.


The way in which you save journal receivers depends on whether they reside in a library with other objects. Depending on your environment, you'll use either the Save Library (SAVLIB) command or the Save Object (SAVOBJ) command. It's best to keep your journal receivers isolated from other objects so that your save and restore functions are simpler. Be aware that you must save a new member of a database file before you can apply journal entries to the file. If your applications regularly add new file members, you should consider using the SAVCHGOBJ strategy either by itself or in combination with journaling.
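
The detach-and-save step described above might look like the following sketch. The journal and library names are hypothetical, and which save command applies depends on where your receivers actually reside.

```cl
/* Sketch only: APPLIB, APPJRN, JRNRCVLIB, and TAP01 are        */
/* hypothetical names.                                          */
/* Detach the current receiver and attach a new,                */
/* system-generated one.                                        */
CHGJRN JRN(APPLIB/APPJRN) JRNRCV(*GEN)

/* Receivers isolated in their own library: save the library.   */
SAVLIB LIB(JRNRCVLIB) DEV(TAP01)

/* Receivers mixed with other objects: save only the receivers. */
SAVOBJ OBJ(*ALL) LIB(APPLIB) OBJTYPE(*JRNRCV) DEV(TAP01)
```

In practice you would use only one of the two save commands; both are shown to illustrate the SAVLIB versus SAVOBJ choice.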


Saving Groups of User Libraries, Folders, or Directories


Many applications are set up with data files and program objects in different libraries. This design simplifies your backup and recovery procedures. Data files change frequently, and, on most systems, program objects change infrequently. If your system environment is set up like this, you may want to save only the libraries with data files on a daily basis. You can also save, on a daily basis, groups of folders and directories that change frequently.
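
A daily save of just the data libraries, folders, and directories could be sketched as follows. ORDDATA, INVDATA, SALESFLR, the directory paths, and TAP01 are hypothetical names used only for illustration.

```cl
/* Sketch only: library, folder, directory, and device names    */
/* are hypothetical.                                            */
/* Save the libraries that hold frequently changing data files. */
SAVLIB LIB(ORDDATA INVDATA) DEV(TAP01) ENDOPT(*LEAVE)

/* Save one folder and the documents it contains.               */
SAVDLO DLO(*ALL) FLR(SALESFLR) DEV(TAP01) ENDOPT(*LEAVE)

/* Save selected directories.                                   */
SAV DEV('/QSYS.LIB/TAP01.DEVD') +
    OBJ(('/appdata') ('/reports')) UPDHST(*YES)
```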


Implementing a Complex Backup Strategy


If you have a very short backup window that requires a complex strategy for backup, you can use some of the same techniques described for a medium backup strategy, but with a greater level of detail. For example, you may need to save specific critical files at specific times of the day or week.


Several other methods are available to you in developing a complex backup strategy. You can use one or a combination of these methods:


  • Save system information in a non-restricted state
  • Save data concurrently using multiple tape devices
  • Save data in parallel using multiple tape devices
  • Use the save-while-active process


Before you use any of these methods, you must have a complete backup of your entire system.


Save System Information in a Non-Restricted State


The Save System Information (SAVSYSINF) command will perform a cumulative save of a subset of system data and objects saved by the Save System (SAVSYS) command without requiring the system to be in a restricted state.


The SAVSYSINF command is not to be considered a replacement for the SAVSYS command and is not to be used for system upgrades or migrations.


After a base SAVSYS is performed, the SAVSYSINF command saves the following:


  • System objects such as job descriptions, job queues, subsystem descriptions, and changed commands
  • System reply lists, service attributes, environment variables, system values required for system recovery, and network attributes
  • Operating system PTFs that are copied into *SERVICE. Use the Change Service Attributes (CHGSRVA) command to modify your service attributes to automatically copy the PTF save files to *SERVICE when loading PTFs.


For system recovery, the Licensed Internal Code (LIC) and operating system will be recovered from your SAVSYS media. You then use your SAVSYSINF media and the Restore System Information (RSTSYSINF) command to restore the saved changes to system objects and PTFs.
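
The sequence above can be sketched in CL as shown here. TAP01 is a hypothetical device name, and CPYPTF(*YES) is my understanding of the service attribute that copies PTF save files into *SERVICE, so verify it by prompting the command on your release.

```cl
/* Sketch only: TAP01 is a hypothetical tape device name.       */
/* Copy PTF save files into *SERVICE when PTFs are loaded.      */
CHGSRVA CPYPTF(*YES)

/* After a base SAVSYS exists, save the changed system          */
/* information without a restricted state.                      */
SAVSYSINF DEV(TAP01)

/* During recovery, after restoring from the SAVSYS media:      */
RSTSYSINF DEV(TAP01)
```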


Saving Data Concurrently Using Multiple Tape Devices


You can reduce the amount of time your system is unavailable by performing save operations on more than one tape device at a time. For example, you can save libraries to one tape device, folders to another tape device, and directories to a third tape device. Or you can save different sets of libraries, objects, folders, or directories to different tape devices. Using Backup, Recovery and Media Services (BRMS), you can run multiple backup control groups concurrently to different tape devices. BRMS is IBM's comprehensive tool for managing backup, archive, and recovery operations for a single IBM i system or multiple systems at a site or across a network.
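
Without BRMS, one way to sketch concurrent saves is to submit each save as its own batch job, one per tape device. The device names (TAP01 through TAP03), library names, and job names are hypothetical, and the sketch assumes the job queue and subsystem allow more than one of these jobs to be active at the same time.

```cl
/* Sketch only: device, library, and job names are hypothetical.*/
/* Each job writes to its own tape device.                      */
SBMJOB CMD(SAVLIB LIB(ORDDATA INVDATA) DEV(TAP01)) +
       JOB(SAVLIBS)
SBMJOB CMD(SAVDLO DLO(*ALL) FLR(*ANY) DEV(TAP02)) +
       JOB(SAVDLOS)
SBMJOB CMD(SAV DEV('/QSYS.LIB/TAP03.DEVD') +
           OBJ(('/*') ('/QSYS.LIB' *OMIT) ('/QDLS' *OMIT))) +
       JOB(SAVIFS)
```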


Saving Data in Parallel Using Multiple Tape Devices


A parallel save is intended for very large objects, libraries, or directories. With this method, the system "spreads" the data in the object, library, or directory across multiple tape devices. This function is implemented with BRMS.




Using the Save-While-Active Process


The save-while-active (SWA) process can significantly reduce the amount of time your applications are unavailable and increase user access to applications and data. With SWA, users can resume activity after the save processing reaches a synchronization checkpoint.


The simplest way to use the SWA feature is to prevent user access to applications and data until the SWA checkpoint is reached. Once the SWA checkpoint is reached, any exclusive locks are released and users can resume their normal activity while the system continues to save the data. Especially with large files, the time to reach the SWA checkpoint is significantly less than the time to actually save the object. The time to reach the SWA checkpoint depends on the number of objects being checkpointed, not their size.


Starting with IBM i 6.1, the SWA function offers a single save-while-active checkpoint for multiple saves. The Start Save Synchronization (STRSAVSYNC) command ensures a single, consistent checkpoint for your library and IFS saves or even a single, consistent checkpoint for multiple concurrent library saves. If you choose to use SWA, make sure you understand the process and monitor for any synchronization checkpoints before making your objects available for use.
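
For a single library save, a save-while-active request might be sketched like this. The library names, device name, and wait time are hypothetical choices, not recommendations.

```cl
/* Sketch only: ORDDATA, INVDATA, and TAP01 are hypothetical.   */
/* *SYNCLIB requests one common checkpoint for all the          */
/* libraries named on the command. The checkpoint-complete      */
/* message is sent to the message queue named on SAVACTMSGQ.    */
SAVLIB LIB(ORDDATA INVDATA) DEV(TAP01) +
       SAVACT(*SYNCLIB) SAVACTWAIT(120) +
       SAVACTMSGQ(QSYSOPR)
```

Applications should stay quiesced until the checkpoint message arrives on the named message queue; only then is it safe to let users back in while the save continues.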

The Inner Workings of the Save Menu

The Save Menu contains many options for saving your data, but four are primary:


  • 20—Define save system and user data defaults
  • 21—Entire system
  • 22—System data only
  • 23—All user data


You can use these menu options to back up your system. Or, if your installation requires a more complex backup strategy, you can use IBM i save commands in a CL program or IBM's BRMS product to customize your backup.


To help you make your decision, as well as to provide skeleton code that you can use as a guideline for your own backup programs, let's look at some of the inner workings of these primary save options. For detailed instructions and a checklist on using these options, refer to Backing Up Your System in the IBM i and System i Information Center.


Entire System (Option 21)


Save menu Option 21 lets you perform a complete backup of all the data on your system, including an optional parameter to back up spooled files. Option 21 puts the system into a restricted state. This means no users can access your system while the backup is running. It's best to run this option overnight for a small system or during the weekend for a larger system.


Option 21 runs the QMNSAVE program. The following CL program extract represents the processing that option 21 performs:



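ENDSBS SBS(*ALL) OPTION(*IMMED)
CHGMSGQ MSGQ(QSYSOPR)                      +
        DLVRY(*BREAK or *NOTIFY)
SAVSYS
SAVLIB LIB(*NONSYS) ACCPTH(*YES)
SAVDLO DLO(*ALL) FLR(*ANY)
SAV DEV('/QSYS.LIB/tape-device-name.DEVD') +
    OBJ(('/*')                             +
        ('/QSYS.LIB' *OMIT)                +
        ('/QDLS' *OMIT))                   +
    UPDHST(*YES)
STRSBS SBSD(controlling-subsystem)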


Note: The SAV command omits the QSYS.LIB file system because the Save System (SAVSYS) command and the SAVLIB LIB(*NONSYS) command save QSYS.LIB. The SAV command also omits the QDLS file system because the SAVDLO command saves QDLS.


One of the nice enhancements with IBM i 6.1 is the capability to track Save Option 21 backup times. There is a new data area called QSRSAV21 in the QUSRSYS library. This new data area will track the last five full system Option 21 saves, showing each command step, the time completed, and the device used for the save. This will allow you to analyze where your backup time is being spent during your full system Option 21 saves.
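
To look at the recorded save times, you can display the data area directly:

```cl
/* Display the data area that tracks recent option 21 saves.    */
DSPDTAARA DTAARA(QUSRSYS/QSRSAV21)
```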


System Data Only (Option 22)


Option 22 saves only your system data. It does not save any user data. You should run this option (or option 21) after applying PTFs or installing a new licensed program product. Like option 21, option 22 puts the system into a restricted state.


Option 22 runs the QSRSAVI program. The following program extract represents the processing that option 22 performs:



ENDSBS SBS(*ALL) OPTION(*IMMED)
CHGMSGQ MSGQ(QSYSOPR)                      +
        DLVRY(*BREAK or *NOTIFY)
SAVSYS
SAVLIB LIB(*IBM) ACCPTH(*YES)
SAV DEV('/QSYS.LIB/tape-device-name.DEVD') +
    OBJ(('/QIBM/ProdData')                 +
        ('/QOpenSys/QIBM/ProdData'))       +
    UPDHST(*YES)
STRSBS SBSD(controlling-subsystem)


All User Data (Option 23)


Option 23 saves all user data, including files, user-written programs, and all other user data on the system. This option also saves user profiles, security data, and configuration data. Like options 21 and 22, option 23 places the system in a restricted state.


Option 23 runs the QSRSAVU program. The following program extract represents the processing that option 23 performs:



ENDSBS SBS(*ALL) OPTION(*IMMED)
CHGMSGQ MSGQ(QSYSOPR)                      +
        DLVRY(*BREAK or *NOTIFY)
SAVSECDTA
SAVCFG
SAVLIB LIB(*ALLUSR) ACCPTH(*YES)
SAVDLO DLO(*ALL) FLR(*ANY)
SAV DEV('/QSYS.LIB/tape-device-name.DEVD') +
    OBJ(('/*')                             +
        ('/QSYS.LIB' *OMIT)                +
        ('/QDLS' *OMIT)                    +
        ('/QIBM/ProdData' *OMIT)           +
        ('/QOpenSys/QIBM/ProdData' *OMIT)) +
    UPDHST(*YES)
STRSBS SBSD(controlling-subsystem)


Note: The SAV command omits the QSYS.LIB file system because the SAVSYS command, the Save Security Data (SAVSECDTA) command, and the Save Configuration (SAVCFG) command save QSYS.LIB. The SAV command also omits the QDLS file system because the SAVDLO command saves QDLS. In addition, the SAV command executed by option 23 omits the /QIBM/ProdData and /QOpenSys/QIBM/ProdData directories because these directories contain IBM-supplied objects.


Setting Save Option Defaults


When you save information using option 21, 22, or 23, you can specify default values for some of the commands used by the save process. You can use Save menu option 20 (Define save system and user data defaults) to change the default values displayed for menu options 21, 22, and 23. Changing the defaults simplifies the task of setting up your backups. To change the defaults, you must have *CHANGE authority to both library QUSRSYS and the QSRDFLTS data area in QUSRSYS.


When you select option 20, the system displays the default parameter values for options 21, 22, and 23. The first time you use option 20, the system displays the IBM-supplied default parameter values. You can change any or all of the parameter values to meet your needs. For example, you can specify additional tape devices or change the message queue delivery default. The system saves the new default values in data area QSRDFLTS in library QUSRSYS for future use (the system creates QSRDFLTS only after you change the IBM-supplied default values).


Once you've defined new default values, you no longer need to worry about which, if any, options to change on subsequent backups. You can simply review the new default options and then press Enter to start the backup using the new default parameters.


If you have multiple, distributed systems with the same save parameters on each system, option 20 offers an additional benefit: you can simply define your default parameters using option 20 on one system and then save data area QSRDFLTS in library QUSRSYS, distribute the saved data area to the other systems, and restore it.
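
One way to sketch that distribution step is to save the data area to a save file, move the save file to each target system by whatever transfer method you normally use, and restore it there. XFERLIB and DFLTSAVF are hypothetical names.

```cl
/* Sketch only: XFERLIB and DFLTSAVF are hypothetical names.    */
/* On the system where the defaults were defined:               */
CRTSAVF FILE(XFERLIB/DFLTSAVF)
SAVOBJ OBJ(QSRDFLTS) LIB(QUSRSYS) OBJTYPE(*DTAARA) +
       DEV(*SAVF) SAVF(XFERLIB/DFLTSAVF)

/* On each target system, after the save file has been          */
/* transferred:                                                 */
RSTOBJ OBJ(QSRDFLTS) SAVLIB(QUSRSYS) OBJTYPE(*DTAARA) +
       DEV(*SAVF) SAVF(XFERLIB/DFLTSAVF)
```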

Backing Up Integrated Servers and Linux, AIX, or IBM i Guest Logical Partitions

Save Menu Option 21 or a combination of Save Menu Options 22 and 23 can also save everything you need to back up and recover any Integrated Servers implemented with the IBM i Integrated Server support option, or any Linux, AIX, or IBM i Guest Logical partitions on IBM i. Simply select the parameter to vary off your network servers, and the storage spaces for any Integrated Servers and any Linux, AIX, or IBM i Guest partitions will also be backed up.

Backup and Recovery of Spooled Files

Starting with V5R4, you can use native save and restore commands or menu options to back up and recover your spooled files. The best part of this new function is that in addition to saving the spooled file data, the following spooled file identity and attributes are preserved:


  • Spooled File Name
  • Spooled File Number
  • Creation Date/Time
  • Fully Qualified Job Name
  • Job System Name


It is very easy to save all of your spooled files if you are using the GO SAVE menu options. These Save menu options include option 21 to save the entire system, option 22 to save system data only, and option 23 to save all user data. Simply change the Spooled file data parameter to *ALL to save the spooled file data for the output queues that are saved.


If you are using the Save Library (SAVLIB) or Save Object (SAVOBJ) commands, simply specify SPLFDTA(*ALL). For each output queue that is saved, all available spooled file data is saved. 


If you are using Backup Recovery and Media Services (BRMS) to save spooled file data and upgrade to V5R4 or higher, BRMS will automatically use the new native save function to save spooled files. This will provide you with a performance benefit over the previous method BRMS used to save spooled file data.


When you restore your spooled files using the Restore menu options, the Spooled file data parameter defaults to *NEW. If you do not wish to restore your saved spooled files, simply change the Spooled file data parameter to *NONE. Likewise, on the Restore Library (RSTLIB) and Restore Object (RSTOBJ) commands, the Spooled file data parameter defaults to *NEW.
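
The command-level equivalents can be sketched as follows. REPORTS, RPTOUTQ, and TAP01 are hypothetical library, output queue, and device names.

```cl
/* Sketch only: REPORTS, RPTOUTQ, and TAP01 are hypothetical.   */
/* Save a library together with the spooled files on its        */
/* output queues.                                               */
SAVLIB LIB(REPORTS) DEV(TAP01) SPLFDTA(*ALL)

/* Save a single output queue and its spooled files.            */
SAVOBJ OBJ(RPTOUTQ) LIB(REPORTS) OBJTYPE(*OUTQ) +
       DEV(TAP01) SPLFDTA(*ALL)

/* Restore the library; *NEW is the default and restores        */
/* spooled files that do not already exist on the system.       */
RSTLIB SAVLIB(REPORTS) DEV(TAP01) SPLFDTA(*NEW)

/* Restore the library without any spooled file data.           */
RSTLIB SAVLIB(REPORTS) DEV(TAP01) SPLFDTA(*NONE)
```

The two RSTLIB commands are alternatives; run one or the other depending on whether you want the spooled files back.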


To ensure a complete backup and recovery strategy, consider making use of these parameters to save spooled file data.   

Backing Up IBM i on Power Blade

Saving and restoring IBM i running on a Power blade is similar to doing so on other Power servers. You have the choice to use either a standalone LTO4 SAS attached tape device or a fiber channel attached tape library with LTO4 tape devices. The tape devices are recognized on the IBM i LPAR and use the same naming conventions as with any other IBM i environment. This allows you to use either the native IBM i save commands or BRMS to implement your backup and recovery strategy.


For either backup option, the physical adapters connecting the tape devices are owned by the Virtual I/O Server (VIOS). A different type of I/O virtualization is used in each case, but both result in the tape device being available on IBM i as if the device were physically attached to the IBM i LPAR. The standalone LTO4 SAS attached tape device uses virtual SCSI: the tape device first becomes available in VIOS and is then assigned to IBM i using the Integrated Virtualization Manager (IVM). The fiber channel attached tape library uses N-Port ID Virtualization (NPIV): the tape device from the tape library is mapped directly to IBM i on the Storage Area Network (SAN), with VIOS managing the physical fiber channel adapter in a passthrough method.

Integrated Virtual Tape Support

Integrated Virtual Tape support with IBM i is technology that is entirely unique to IBM i. It is completely integrated into the operating system and makes use of the disk on the system. This technology is built upon the image catalog support provided for optical "virtual devices." The virtual tape environment behaves just as if there were real tape drives with physical tape media. It is a simulated tape environment consisting of the following:


  • Virtual tape drives (up to 35)
  • Virtual tape media (256 maximum per image catalog)
  • One or more image catalogs to hold virtual tapes (known as image catalog entries)


Virtual tape is used just as you use real physical tape media. The save and restore commands, along with the Display Tape (DSPTAP) and Check Tape (CHKTAP) commands, behave in the same manner with virtual tape as with physical tape. Because virtual tape resides on your system, you still need to duplicate your virtual tape saves to physical tape media for disaster recovery.
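
A minimal sketch of setting up a virtual tape environment, saving to it, and duplicating the result to physical media is shown below. The device names (TAPVRT01, TAP01), image catalog name, directory, volume name, and library name are all hypothetical.

```cl
/* Sketch only: all object, directory, and volume names are     */
/* hypothetical.                                                */
/* Create and vary on a virtual tape device.                    */
CRTDEVTAP DEVD(TAPVRT01) RSRCNAME(*VRT)
VRYCFG CFGOBJ(TAPVRT01) CFGTYPE(*DEV) STATUS(*ON)

/* Create a tape image catalog and add one new virtual volume.  */
CRTIMGCLG IMGCLG(VTAPCLG) DIR('/vtapclg') TYPE(*TAP) +
          CRTDIR(*YES)
ADDIMGCLGE IMGCLG(VTAPCLG) FROMFILE(*NEW) TOFILE(VOL001) +
           VOLNAM(VOL001)

/* Load the catalog on the virtual device and save to it.       */
LODIMGCLG IMGCLG(VTAPCLG) DEV(TAPVRT01)
SAVLIB LIB(ORDDATA) DEV(TAPVRT01)

/* Later, duplicate the virtual volume to physical tape for     */
/* disaster recovery.                                           */
DUPTAP FROMDEV(TAPVRT01) TODEV(TAP01)
```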


There are some key advantages to using virtual tape:


May be faster than saving to physical tape

  • Similar performance as save files
  • Best performance in separate ASP


Eliminates save file restrictions

  • One library per save file
  • SAVSYS not supported on save file
  • Parallel saves not supported on save file
  • One TB size limitation on save file


Eliminates tape/media error limitations

  • Saves will not end due to tape device or media errors
  • Once save-while-active checkpoint reached, can't restart due to tape device or media errors


Duplication to physical media

  • When tape devices available
  • At your convenience


On site and off site storage

  • Keep virtual volumes on system as needed
  • Keep duplicated volumes on site


If you are only looking for a performance advantage, you will probably not see a performance improvement compared to saving directly to TS1120/TS1130 or LTO3/LTO4/LTO5 tape technology. Along with the advantages of virtual tape, there are also some key items to take into consideration when deciding if virtual tape will be a good solution in your environment. The main consideration will be the amount of additional disk required to perform saves to virtual tape. For example, if you want to save 500 GB of data, you will need another 500 GB of additional disk plus about 3 percent more for the tape headers to perform the save to virtual tape.
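
As a quick worked version of the sizing example above, treating "about 3 percent" as exactly 3 percent (an assumption made only for the arithmetic):

```latex
\[
D_{\text{virtual}} \approx D_{\text{save}} \times (1 + 0.03)
                   = 500\ \text{GB} \times 1.03
                   = 515\ \text{GB}
\]
```

Here \(D_{\text{save}}\) is the amount of data being saved and \(D_{\text{virtual}}\) is the additional disk needed to hold the virtual tape volumes.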


Here are some additional considerations:


Disaster recovery

  • No install from SAVSYS virtual media
  • D-IPL install from physical media only 


Data compaction not supported


Data compression supported but

  • Very low compression
  • Significant impact on performance


It is a good idea to carefully test prior to implementing a virtual tape solution. 

Encrypting Your Backups

Due to regulatory compliance needs, many customers now have requirements to encrypt their backup tapes. IBM's BRMS product supports backup encryption starting with IBM i 6.1. For encryption support, you will need to install BRMS, including the Advanced feature of BRMS along with the Encrypted Backup Enablement feature, which is Option 44 of the operating system. With BRMS software encryption, you cannot encrypt the operating system, which includes the *SAVSYS, *SAVSYSINF, *SAVSECDTA, and *SAVCFG special values. You also cannot encrypt any libraries that begin with a "Q," such as the QBRM, QUSRBRM, QSYS2, QGPL, and QUSRSYS libraries. There is a performance overhead when performing software encryption. You should expect your saves to take twice as long.


The alternative to software encryption without the restrictions and without performance impacts is to use hardware encryption for your backups with a fiber attached LTO4/LTO5 or TS1120/TS1130 tape library. These tape devices, which provide library-managed encryption, must reside in a tape library.


In addition to an encryption-capable tape device, you need to have a Tivoli Key Lifecycle Manager (TKLM) server. The TKLM server runs on a variety of platforms and supports many IBM operating systems along with non-IBM operating systems.


For disaster recovery, you need to either run a TKLM server or be prepared to recover your TKLM server before recovering your IBM i encrypted backups. You also need to ensure that your disaster recovery site can either let you have your TKLM server on site or provide access to a server for you to recover your TKLM environment. Be sure to run your TKLM on a server or LPAR where none of the saves are encrypted. At your recovery site, you also need the same type of encryption-capable tape device (e.g., TS1120, TS1130, LTO4, LTO5) in a tape library.

Backing Up the Hardware Management Console

Your Hardware Management Console (HMC) needs to be backed up separately from your IBM i system. Using the HMC, you can back up all important data, such as the following:


  • User-preference files
  • User information
  • HMC platform-configuration files
  • HMC log files
  • HMC updates through Install Corrective Service


The Backup function is used to save the HMC data stored on the HMC hard disk to the following:


  •   DVD media
  •   USB Flash Memory Device
  •   Remote system mounted to the HMC file system (such as NFS)
  •   Remote site through FTP


You should back up the HMC after you have made changes to the HMC or to the information associated with your logical partitions.


Note: Before data can be saved to removable media, the media must be formatted. To format media, click HMC Management > Format Media and follow the steps.


To back up the HMC, you must be a member of one of the following roles:


  • Super administrator
  • Operator
  • Service representative


To back up the HMC critical data, perform the following steps:


  1. In the Navigation area, click HMC Management.
  2. Select Back up HMC Data.
  3. Select an archive option. You can back up to media on the local system, back up to a mounted remote system, or send backup data to a remote site.
  4. Follow the instructions on the window to back up the data.


Recovery for Physical and Logical Files

A key backup and recovery enhancement in IBM i 6.1 is one that users have long been waiting for, and it benefits any system recovery. When physical files and logical files reside in different libraries, the physical file must be restored first; otherwise, the logical file is not restored. Before 6.1, a second restore pass to pick up the missed logical files could be painful and take a significant amount of valuable recovery time. IBM i 6.1 fixes this situation. If you restore your system with Restore menu option 21, 22, or 23, with the RSTLIB command specifying *NONSYS, *IBM, or *ALLUSR, or with BRMS, no additional steps or additional time are required to restore the missed logical files.

Testing Your Backup and Recovery Strategy

To complete your backup strategy implementation, remember that the most important piece is to test your system recovery strategy. You should at a minimum test your complete recovery strategy once a year.

