TechTip: Monitoring Your System i5 with Nagios

Easily integrate host and service monitoring into your IBM i servers and networks with the open-source utility Nagios.

 

After reading "Cheap and Effective Network Monitoring" last year in the Linux Letter column, I immediately began building skills with the well-known open-source monitoring tool discussed in that article. Since Barry Kline's article goes into great detail on what Nagios actually is and what it can do for you, I won't elaborate beyond the fact that Nagios is a host-, service-, and network-monitoring utility maintained by Ethan Galstad.

A very powerful feature of Nagios is that the community can extend it, meaning plug-ins are easily developed. It already ships with plug-ins to check just about anything you can imagine, but if that's not enough, you can seek further options at NagiosExchange. This is where I found the Nagios AS400 Plugin.

What Does Nagios AS400 Require?

The plug-in has only three basic requirements—the first obviously being a working installation of the Nagios utility.

Second, you will need to create a user on your AS/400 system. Make this a generic user with very restrictive rights: the plug-in communicates with the AS/400 over Telnet, which sends credentials in clear text, so the user name and password could easily be sniffed on an unsecured LAN. The author's pre-installation notes suggest that the user have access to WRKSYSSTS, WRKOUTQ, WRKACTJOB, DSPJOB, DSPSBSD, and DSPMSG.

The final requirement is a working version of Java on your Nagios system, since the plug-in is written in Java. Java is installed on virtually every desktop these days, Linux and Windows alike, so this shouldn't be much of a task for you. If you don't have Java on your Nagios installation, you can easily download and install it from Sun. I currently run Java build 1.5.0_09 on my instance of Nagios.

What Can Nagios AS400 Do for You?

You can check just about anything you'd like to on the AS/400. I'm currently checking the host's status with a ping service, plus keeping tabs on an additional 27 services. These services range from CPU load, DB load, disk space, total and active jobs, to various subsystems and printer queues. I've blurred the names of hosts that include my network naming scheme to protect the identities of the innocent.

http://www.mcpressonline.com/articles/images/2002/nagios_as400V4--02160700.png

Figure 1: This is the Nagios service status display map.

As you can see, you're able to gather a lot of information about the various services running on your AS/400. This in turn means you no longer have to worry about whether something fails, because you'll be notified according to your Nagios settings if something goes offline.

Installing Nagios AS400

Before I continue, I'd like to briefly mention a few things about Nagios. This utility has a very steep learning curve, and it takes quite a bit of time to understand, configure, and become comfortable with. Fortunately, the Nagios Web site provides you with more than ample documentation, ranging from theory of operation to OS-specific instructions. The site also hosts user-contributed documentation covering many aspects of maintenance and installations. Furthermore, Nagios has built-in documentation immediately following installation in case you need help with even the simplest questions.

For those of you who already have a working Nagios process up and running (including Java), installing the Nagios AS400 plug-in is a snap. Download a copy of the plug-in from SourceForge and unpack the zipped file, in which you'll find quite a few informative files, all of which are clearly marked. They've even included example configuration files that Nagios will use later on.

I'll demonstrate commands from a RHEL-based distribution (CentOS). Also, paths within Nagios will be relative to the distribution OS type. Substitute the tar and other commands with your favorite method for navigating a Linux command line.

tar -xzvf as400NagiosPlugin-018.tgz
cd as400NagiosPlugin
mkdir /usr/lib/nagios/plugins/check_as400
cp *.class /usr/lib/nagios/plugins/check_as400/
cp check_as400 /usr/lib/nagios/plugins/check_as400/

 

Next, you'll need to create a hidden file that contains the user name and password of the AS/400 user. Substitute vim with your favorite text editor.

cd /usr/lib/nagios/plugins/check_as400/
vim .as400

 

Add the following lines into the file and save before continuing.

USER=as400_account
PASS=as400_password

 

It's very important to change the permissions on the directory and specifically on the hidden .as400 file. By doing so, you ensure that only the nagios user on the box will be able to access this file. Typically, you should have only two users on your nagios box: root and nagios.

chown -R nagios:nagios /usr/lib/nagios/plugins/check_as400/
chmod 700 /usr/lib/nagios/plugins/check_as400/.as400
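
To confirm that the lock-down took effect, you can inspect the permission bits after changing them. The following is only a sketch that runs against a throwaway directory (DEMO_DIR is a made-up name standing in for the plug-in directory). It also assumes that mode 600 is enough, since the credentials file is only read and never executed; that is my assumption, not something the plug-in's notes state.

```shell
#!/bin/sh
# Throwaway directory standing in for /usr/lib/nagios/plugins/check_as400/
DEMO_DIR=$(mktemp -d)
touch "$DEMO_DIR/.as400"

# The credentials file is only read, never executed,
# so read/write for the owner alone is sufficient
chmod 600 "$DEMO_DIR/.as400"

# The first ten characters of "ls -l" are the file type and permission bits
PERMS=$(ls -l "$DEMO_DIR/.as400" | cut -c 1-10)
echo "permissions: $PERMS"

# Clean up the demo directory
rm -rf "$DEMO_DIR"
```

On the real file, a result of -rw------- together with nagios as the owner means no other account on the box can read the password.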

 

Lastly, you'll need to edit the check_as400 executable file to reflect the paths on your system. Depending on where you installed Nagios, Java, and the plug-in, your paths may vary. This file is executed by the nagios user; it parses the .as400 file for the user name and password and then passes them to the Java class.

vim /usr/lib/nagios/plugins/check_as400/check_as400

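# Read the user name from the hidden credentials file
USER=`cat /usr/lib/nagios/plugins/check_as400/.as400 |
grep -e USER | cut -d = -f 2`

# Read the password from the same file
PASS=`cat /usr/lib/nagios/plugins/check_as400/.as400 |
grep -e PASS | cut -d = -f 2`

# Run the plug-in class, passing along the arguments supplied by Nagios
/usr/java/sdk/bin/java -cp /usr/lib/nagios/plugins/check_as400 \
check_as400 -u $USER -p $PASS $*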

Notice that those are backticks ( ` ) above, not quotes.
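
If you want to see what the parsing step produces before pointing it at the real credentials, here is a minimal sketch that runs the same grep and cut pipeline against a temporary file. The file contents and the AS400_USER and AS400_PASS variable names are made up for illustration. The sketch also differs from the shipped wrapper in two ways that are my own suggestions: it anchors the match on the key name, so a password that happens to contain the text USER is not picked up by mistake, and it keeps everything after the first equals sign, so a password may itself contain one.

```shell
#!/bin/sh
# Hypothetical throwaway credentials file; the real one is .as400
CRED_FILE=$(mktemp)
printf 'USER=as400_account\nPASS=p=ss_USER\n' > "$CRED_FILE"

# Anchor on the key name so the PASS line (which contains "USER") is skipped,
# and use "-f 2-" to keep everything after the first "="
AS400_USER=$(grep '^USER=' "$CRED_FILE" | cut -d = -f 2-)
AS400_PASS=$(grep '^PASS=' "$CRED_FILE" | cut -d = -f 2-)

echo "user: $AS400_USER"
echo "pass: $AS400_PASS"

# Remove the temporary file
rm -f "$CRED_FILE"
```

The $( ) form used here is equivalent to the backticks in the wrapper script and is simply easier to tell apart from quotes.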

That concludes installation of the plug-in itself. Next, you need to set up the check commands and services in Nagios.

Preparing Nagios Configurations

The last thing to do is specify the check command and service parameters within the Nagios configuration files. I'm not going to explain the setup for all 27 of my services, but the one you have to start with is the basic LOGIN check: if the plug-in can't log in, no other check will work. My Nagios configurations are located in /etc/nagios. Open checkcommands.cfg and add the following.

# 'as400_login' check LOGIN status
define command{
        command_name as400_login
        command_line $USER1$/check_as400/check_as400 -H $HOSTADDRESS$ -v LOGIN
}

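In that definition, $USER1$ is the standard Nagios macro for the plug-in directory (set in resource.cfg). Now, add the login service to the Nagios service file (services.cfg).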

# AS400 LOGIN
define service{
        use generic-service
        name as400-login
        service_description LOGIN
        notification_interval 0
        check_command as400_login
        host_name myhost
}

That's it, with the exception of verifying your Nagios configurations.
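
Nagios can check its own configuration with the -v option of the nagios binary. The sketch below assumes the binary lives at /usr/sbin/nagios and the main configuration file at /etc/nagios/nagios.cfg; both paths are assumptions about a typical packaged installation, so adjust them to match yours.

```shell
#!/bin/sh
# Assumed locations; override them to match your installation
NAGIOS_BIN=${NAGIOS_BIN:-/usr/sbin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/etc/nagios/nagios.cfg}

if [ -x "$NAGIOS_BIN" ]; then
    # -v runs the pre-flight check on the main config and every file it includes
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG"; then
        VERIFY_STATUS=ok
    else
        VERIFY_STATUS=failed
    fi
else
    # No nagios binary at the assumed path, so nothing was checked
    VERIFY_STATUS=skipped
fi

echo "verification: $VERIFY_STATUS"
```

If the check reports errors, fix them before restarting Nagios so that the new command and service definitions are picked up cleanly.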

The last important piece of information I'll mention is the use of service dependencies; you'll find examples in the directory where you unpacked the plug-in. A service dependency lets you make any or all of your other checks depend on another service. If you configure all of your service checks to notify you via email or pager and you haven't made them dependent on the login service, then a single failed login check will trigger an alert for every service on that system. This can turn out to be a huge nuisance.
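
As a rough illustration of the idea, a dependency of one check on the LOGIN service might look like the fragment below. This is a sketch, not one of the examples shipped with the plug-in: the host name myhost matches the service definition earlier, while the dependent service description "CPU Load" is a hypothetical name that you would replace with one of your own services.

```
# Suppress checks and notifications for "CPU Load" while LOGIN is failing
define servicedependency{
        host_name                       myhost
        service_description             LOGIN
        dependent_host_name             myhost
        dependent_service_description   CPU Load
        execution_failure_criteria      w,u,c
        notification_failure_criteria   w,u,c
}
```

With the failure criteria set to w,u,c, the dependent service is neither checked nor reported while LOGIN is in a warning, unknown, or critical state, so one failed login produces one alert instead of dozens.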

I'm a Nagios Addict!

Since discovering Nagios a while back, I've started to track and watch everything on my network to the extreme. You can officially call me a monitoring junkie; at least that's what the folks in my IT department call me. Nagios watches almost 250 services across about 100 hosts for my company, so why not include AS/400 services as well? It's always nice to be notified of a problem sooner rather than later.
