Understanding System Reference Codes (SRCs), their significance and their meanings, is critical in managing AS/400 operations and discovering potential problems that may have developed on your AS/400. It is important to understand how SRCs are displayed by the system, and it is equally important to understand how to react to and handle the situations they refer to.
In this introduction to SRCs, I'll attempt to dispel the mystery of these codes and, hopefully, alleviate your fears if, or perhaps more aptly when, you encounter them!
What Is an SRC?
An SRC is a set of characters that identifies an error or condition detected by the OS/400 operating system and is displayed on the control panel (Function/Data display) of the system unit. The first four characters indicate the type of error, and the last four give additional important information. The first character or two can also help identify the group (sometimes called category) within which the code resides. For example, if you receive an SRC of 93xx xxxx (where x indicates any other letter or number), it could indicate that a disk unit failure has occurred! Figure 1 is a helpful chart to keep handy; it will help you determine in which direction to look for a problem. (This information is also provided and outlined in much more detail in Table 7 of IBM's AS/400 Basic System Operation, Administration, and Problem Handling V4R4.)
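The lookup that Figure 1 supports can be sketched in a few lines of Python. This is only an illustration, not an IBM tool: the names split_src, matches, and classify_src are made up for this example, the table holds just a handful of rows copied from Figure 1, and the sample code 9301 0002 is a hypothetical value.

```python
# Illustrative sample: a few rows copied from Figure 1.
# A lowercase 'x' in a pattern stands for any hexadecimal digit (0-9, A-F).
FIGURE_1_SAMPLE = [
    ("0000 AADD", "Manual IPL attempted with the system in secure or auto mode"),
    ("63xx xxxx", "Tape unit failed"),
    ("93xx xxxx", "A disk unit failed"),
    ("A900 2000", "The IPL completed normally"),
    ("B9xx xxxx", "OS/400 IPL failure"),
]

HEX_DIGITS = set("0123456789ABCDEF")


def split_src(src):
    """Split an SRC into its first four (type) and last four (detail) characters."""
    code = src.replace(" ", "").upper()
    if len(code) != 8:
        raise ValueError("expected eight characters, got %r" % src)
    return code[:4], code[4:]


def matches(src, pattern):
    """Return True if the SRC fits a Figure 1 style pattern such as '93xx xxxx'."""
    code = src.replace(" ", "").upper()
    pat = pattern.replace(" ", "")
    if len(code) != len(pat):
        return False
    return all(
        (c in HEX_DIGITS) if p == "x" else (c == p)
        for c, p in zip(code, pat)
    )


def classify_src(src):
    """Return the description of the first matching sample row, or None."""
    for pattern, meaning in FIGURE_1_SAMPLE:
        if matches(src, pattern):
            return meaning
    return None
```

With these definitions, classify_src("9301 0002") returns "A disk unit failed", and a code that matches none of the sample rows returns None, which in practice means it is time to open the IBM guide.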
If an operating system or hardware failure occurs while the system is running normally, the SRC will be sent to the QSYSOPR message queue. However, an SRC condition typically first rears its ugly head during an IPL, when it is displayed on the control panel (Function/Data display) of the system unit. (Later in this article is a brief discussion of how to use remote control panel APIs to access and react to SRCs.) Regardless of how an SRC is communicated, it should be taken seriously. This could be the system's first cry for help! Ignoring the situation or failing to properly record the code and act on it will make recovery more difficult, especially if your IBM service representative needs to get involved.
What Has Changed?
Unless an SRC is related to a hardware failure, most SRC exceptions can be traced to one of the following changes or events:
Power outage or fluctuation, which could create a hardware failure
Hardware configuration change
System software change
New program installation or change (IBM program products), which can sometimes expose your system to new conditions; always apply any related PTFs with a new product
Recent PTF application
Note: To make sure licensed programs and products have been properly installed, use the Check Product Option (CHKPRDOPT) command after installing a new product. This will report differences in the structure of the installed product. For more information, type in this command on any command line and prompt with F4, then review the online help.
Always keep these possibilities in mind when diagnosing a problem. If you need help from IBM, tell the service representative what has changed on your system. As I have indicated, do not ignore an SRC. Record exactly what is displayed so the IBM service representative can better understand the problem and know how to fix it.
Understanding the IPL Cycle
Other than disk failures, most SRC conditions occur during the IPL cycle. The IPL does a complete test of the hardware and configuration of your system. Load source is loaded from disk into memory during this process. Hard drives are powered up and recognized, and processors are started. An IPL is also the first process performed after new hardware is installed and configured using the service tools (DST), after a new feature or card is added, or after an abnormal power-down has occurred because of a power surge or outage.
Perhaps you just upgraded your operating system or applied a PTF or cumulative tape (cumulative PTF package). All of these conditions have created a change to your system and can therefore expose you to at least a potential problem that could generate an SRC.
While this article has implied that all SRCs indicate problems, it is important to mention that many SRCs displayed on the panel of your system unit are simply informational. The next time you IPL your system, watch the control panel. Figure 2 lists the typical (normal) SRCs that will be displayed as your system cycles through an IPL. (Your codes may vary somewhat, depending on the type of system unit you have.) Use this chart to benchmark the normal cycle times of each process, and keep those figures tucked away where you can easily retrieve them. Remember that these cycle times can change based on a number of situations, including the time that has elapsed since your last IPL, the installation of new hardware, or an abnormal power-down. It is always a good idea, however, to record a benchmark of the normal IPL cycle time of your system units. That way, you have something to compare with during subsequent IPLs.
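Comparing a new IPL against your recorded benchmark can be as simple as the following Python sketch. Everything here is an assumption made for illustration: the function name slow_steps, the 1.5 tolerance factor, and the timings in the usage note are hypothetical, and you would substitute the figures you recorded for your own system unit.

```python
def slow_steps(baseline, observed, tolerance=1.5):
    """Return the IPL steps that ran longer than tolerance times their benchmark.

    baseline and observed map an SRC series from Figure 2 (for example
    "C6xx 4260") to the number of seconds that step was on the panel.
    The tolerance factor is an arbitrary illustrative choice.
    """
    flagged = []
    for step, normal_seconds in baseline.items():
        seconds = observed.get(step)
        if seconds is not None and seconds > normal_seconds * tolerance:
            flagged.append((step, normal_seconds, seconds))
    return flagged
```

For example, with a hypothetical benchmark of {"C1xx 202E": 60, "C6xx 4260": 120} and an observed IPL of {"C1xx 202E": 65, "C6xx 4260": 400}, only the C6xx 4260 step (system disk data recovery) is flagged. A longer step is not necessarily a failure, since an abnormal power-down legitimately lengthens recovery, but it tells you where to look first.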
If the IPL sign-on display (during an attended IPL) or normal sign-on screen (during an unattended IPL) does not appear after a reasonable amount of time, check the LCD panel of your system unit. You may see an (abnormal) SRC displayed, or even find the System Attention light lit on your system control panel. This is the time to follow the instructions outlined in AS/400 Basic System Operation, Administration, and Problem Handling V4R4.
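The decision described above (is the panel still making normal progress, or is it time to reach for the guide?) can be expressed as a small Python sketch. It rests on two facts from Figure 1: Cxxx xxxx codes change frequently during a normal IPL, and A900 2000 means the IPL completed normally. The function name panel_status and the 30-minute stall limit are hypothetical; use your own benchmark to decide what a "reasonable amount of time" is. The sketch covers only the IPL case, so the normal power-down codes are not handled.

```python
def panel_status(src, seconds_on_display, stall_limit=1800):
    """Classify what the control panel shows during an IPL.

    src is the code on the Function/Data display; seconds_on_display is
    how long that same code has been showing. stall_limit is an assumed
    threshold in seconds, not an IBM-documented value.
    """
    code = src.replace(" ", "").upper()
    if code == "A9002000":
        return "ipl-complete"
    if code.startswith("C"):
        # C codes are normal IPL progress, but they should keep changing.
        if seconds_on_display > stall_limit:
            return "stalled"
        return "in-progress"
    return "investigate"
```

A result of "stalled" or "investigate" is the cue to record the code and follow the problem-handling instructions in the IBM guide.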
Reporting a Problem
Follow the instructions outlined in section 2 of the AS/400 Basic System Operation guide. If you encounter a situation that is confusing or not outlined clearly, do not hesitate to call an IBM customer service representative. After all, that's why you pay those big bucks for your hardware service contract! You can reach IBM Service by dialing 1-800-IBM-SERV. Your IBM service representative may ask you to fill out a Problem Summary Form, which also can be found in section 2 of the guide. Record as much information as you can while it is available. You may also be instructed to obtain a main storage dump and run the authorized program analysis report (APAR) procedure.
All of these processes will collect valuable information that may be needed by your IBM service representative to diagnose and fix your system problems.
Using Remote Control Panel APIs to Improve SRC Alerts
You may want to consider harnessing the power of AS/400 APIs to alert you to SRC conditions and react to them. APIs allow your custom-written program to interface with the system control panel. The IBM AS/400 Information Center (http://publib.boulder.ibm.com/pubs/html/as400/infocenter.html) has a wealth of information on this topic. Search for the topic "remote control panel APIs." Basically, the remote control panel APIs are reached through a standard TCP/IP Sockets connection and can be used from any programming language that supports Sockets, such as Java, C++, Visual Basic, and even RPG IV. Look to APIs such as GetSPCNSRC or GetSRCs to retrieve these values.
Consider using APIs and Java to create a Java applet that will pop up an alert on the system operator's PC whenever an abnormal SRC condition is detected. You could even pass this information to a Palm Pilot or other wireless device by using Wireless Application Protocol (WAP). You might even want to take the information obtained from these APIs and display it in an Operations Navigator panel. (For more information on creating Operations Navigator plug-ins, see "How to Build OpsNav Plug-ins Using Java" in the July 2000 issue of MC.)
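The shape of such an alerting program can be sketched without committing to a language or to the details of the API. The Python sketch below is an assumption-laden outline, not a working remote control panel client: this article does not describe the API's wire format, so read_src is a hypothetical callable standing in for whatever code issues a request such as GetSRCs over the Sockets connection, alert stands in for the pop-up or wireless notification, and is_normal is your own rule for which codes are routine.

```python
import time


def watch_panel(read_src, alert, is_normal, polls, interval_seconds=0.0):
    """Poll the panel a fixed number of times and alert on abnormal SRCs.

    read_src()      -> returns the current SRC as a string (hypothetical
                       wrapper around the remote control panel API call)
    alert(src)      -> notifies the operator (pop-up, pager, and so on)
    is_normal(src)  -> True for codes that need no attention
    Returns the number of alerts raised. The same abnormal code is
    reported only once until the panel returns to a normal code.
    """
    last_alerted = None
    alerts = 0
    for _ in range(polls):
        src = read_src()
        if is_normal(src):
            last_alerted = None
        elif src != last_alerted:
            alert(src)
            last_alerted = src
            alerts += 1
        time.sleep(interval_seconds)
    return alerts
```

Passing the reader and the notifier in as parameters keeps the alerting logic testable with canned readings, and lets you swap the pop-up for a wireless message later without touching the polling loop.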
Prepare for a Disaster!
SRCs should be taken seriously. Get to know your individual system unit, IPL process, and system control panel, including the Function/Data display. Review, in detail, AS/400 Basic System Operation, Administration, and Problem Handling V4R4 before you need it. Understand what resources are available to you and get familiar with the recovery path you need to take in any situation. Educate yourself now before a disaster hits!
REFERENCES AND RELATED MATERIALS
AS/400 Basic System Operation, Administration, and Problem Handling V4R4 (SC41-5206-03, CD-ROM QB3AGO03)
IBM AS/400 Information Center: http://publib.boulder.ibm.com/pubs/html/as400/infocenter.html
SRC Series   General Description or Category
0000 AABB    Attempted a timed, remote, or automatic IPL with the system in secure or manual mode
0000 AACC    Attempted a timed, remote, or automatic IPL with the system in secure or manual mode
0000 AADD    You attempted a manual IPL with the system in secure or auto mode
1xxx D101    Battery power unit failed or battery power unit test failed
1xxx D102    Battery power unit failed or battery power unit test failed
63xx xxxx    Tape unit failed
93xx xxxx    A disk unit failed
A6xx 0277    A compression disk unit cannot complete an operation
A900 2000    The IPL completed normally
B0xx xxxx    Failure detected by communication Licensed Internal Code (LIC)
B2xx xxxx    IPL Load Failure
B6xx xxxx    Not enough auxiliary storage (disk storage)
B9xx xxxx    OS/400 IPL Failure
Cxxx xxxx    Occurring during an IPL. Will change frequently during a normal IPL
D1xx xxxx    Diagnostic Status. Normal while system main storage is being saved to disk
D2xx xxxx    Normal indication during powering-down of the system
D6xx xxxx    Normal indication while system is being powered-down
Note: x can be any number from 0 to 9 or any letter from A to F.
Figure 1: Use this table to help determine system SRC codes.
SRC Series   Activity
C1xx Bxxx    Input/output processor testing
C1xx 1xxx    Input/output processor loading
C3xx 3xxx    System processor testing
C1xx 2xxx    System processor loading
C1xx 202E    Testing system main storage
C1xx D009    System power hardware ready
C1xx 2034    IPL control passed to system processor
C6xx 4xxx    Testing system configuration
C6xx 4xxx    IPLing OS/400
C6xx 4260    System disk data recovery
C9xx 2xxx    Operating system starting
C900 29C0    Operating system (recovery)
Note: x can be any number from 0 to 9 or any letter from A to F.
Figure 2: Here are the typical (normal) SRC codes that will be displayed as your system cycles through an IPL.