Linux – How to configure Nagios Failover with NSCA & Crontab

>This Nagios failover configuration is set up to monitor services every 5 seconds and to send out alerts per service (e.g. ssh, http).
NSCA Installation
Prerequisites
- Nagios should already be installed and configured.
- External commands should already be enabled and configured for Nagios.
- The master Nagios server and the slave Nagios server should be identical in Nagios terms. Whenever you change anything on the master Nagios server, sync the change to the slave Nagios server.
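The external-commands prerequisite can be checked before going further. The following is a minimal sketch, assuming Nagios lives under /usr/local/nagios (the prefix used later in this article); the function name is only illustrative. It looks for the check_external_commands=1 directive in nagios.cfg.

```shell
#!/bin/sh
# Sketch: succeed if nagios.cfg has external commands switched on.
# The default path is an assumption based on the /usr/local/nagios prefix.
external_commands_enabled() {
    cfg=${1:-/usr/local/nagios/etc/nagios.cfg}
    grep -Eq '^[[:space:]]*check_external_commands[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$cfg"
}

# Example use:
#   external_commands_enabled && echo "external commands are enabled"
```

Running the same check on both the master and the slave is a cheap way to confirm that the two configurations agree on this point.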

Getting the source

The first step is to get the source code for NSCA. You can get it from any SourceForge mirror. I list the mirror that I use, but you can also find it by searching the SourceForge website: http://prdownloads.sourceforge.net/nagios/nsca-version-number.tar.gz. You can download it with a web browser or, if you are using a machine without a GUI or a text-based browser, retrieve the file with the wget command:
[root@localhost] wget http://prdownloads.sourceforge.net/nagios/nsca-version-number.tar.gz

Unpacking

After you have downloaded the compressed source, you need to unpack it to make it usable. Move the file to a working directory, perhaps your home directory, and run:
[root@localhost] tar xzf nsca-version-number.tar.gz
The xzf options unpack a gzip-compressed tar archive: x stands for extract, z for gzip, and f specifies the filename.

Creating the binaries

Creating the binaries for the NSCA add-on is simple, assuming you already have a C compiler such as gcc installed. First run the configure script located in the base directory (for example, $DOWNLOADPATH$/nsca-2.6) with no arguments. After the script ends, assuming the computer met all requirements and it finished without error, you can compile with make.
[root@localhost] ./configure
[root@localhost] make all

This should proceed to create two binaries and some daemon scripts. The most important of these for us is the nsca binary.
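The download, unpack and build steps above can be strung together in one place. This is only a sketch: the function names are hypothetical, the version is passed in as an argument rather than assumed, and the working directory defaults to your home directory as suggested earlier.

```shell
#!/bin/sh
# Build the tarball URL for a given NSCA version (same mirror as above).
nsca_url() {
    printf 'http://prdownloads.sourceforge.net/nagios/nsca-%s.tar.gz\n' "$1"
}

# Download, unpack and compile NSCA. Runs in a subshell so the caller's
# working directory is left untouched; stops at the first failing step.
build_nsca() (
    version=$1
    workdir=${2:-$HOME}
    cd "$workdir" &&
    wget "$(nsca_url "$version")" &&
    tar xzf "nsca-${version}.tar.gz" &&
    cd "nsca-${version}" &&
    ./configure &&
    make all
)

# Example use (nothing is downloaded until you call it):
#   build_nsca 2.6
```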

Installing NSCA

After compiling, we will put the compiled binary and the sample nsca config file in their respective nagios locations. This is not necessary, as it can run from any location as long as the config file is correct, but for the sake of clarity we will put the nsca binary in nagios’ bin directory and the config file in nagios’ etc directory.
[root@localhost] cp $DOWNLOADPATH$/nsca-version-number/src/nsca $NAGIOSHOME$/bin
[root@localhost] cp $DOWNLOADPATH$/nsca-version-number/sample-config/nsca.cfg $NAGIOSHOME$/etc

Configuring NSCA

The next step is to configure the NSCA daemon. The config file is very self-explanatory, and if you have followed the documentation on the Nagios website to set up Nagios and external commands, the paths should all be right. Assuming you want all the default options, there is no configuration necessary here. The only options that I set are the server address and debug options.

The server address option lets you specify an IP address to bind to. This is used when there is more than one network interface card: it tells the daemon which interface to listen on. To set it, uncomment the following line and enter the IP address of the interface on the network segment you wish to monitor:
server_address=192.168.1.z # My local IP address
If you have only one network interface, this is not necessary.

Second, the debug option is useful for logs. The NSCA daemon writes its logs to the standard syslog facility, so you can usually find its messages in /var/log/messages. Enabling the "debug" option in the NSCA daemon config file causes more verbose information to be written to the logs. This is especially useful for seeing whether packets are being received at all, but I would enable it in any case simply to have a log of what actually comes through, even if it is not interpreted by Nagios. To enable debug, change the 0 to 1 in the following line:

debug=1
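If you prefer to script these two changes rather than edit nsca.cfg by hand, something along these lines could work. It is a sketch under a few assumptions: the helper name is made up, the option appears at most once as key=value (optionally commented out with #), and the value contains no | or & characters.

```shell
#!/bin/sh
# set_nsca_option FILE KEY VALUE
# Rewrites an existing (possibly commented-out) KEY=... line in place,
# or appends KEY=VALUE if the file has no such line.
set_nsca_option() {
    file=$1
    key=$2
    value=$3
    tmp="${file}.tmp.$$"
    if grep -q "^#*${key}=" "$file"; then
        # Write through cat so the file keeps its owner and permissions.
        sed "s|^#*${key}=.*|${key}=${value}|" "$file" > "$tmp" &&
            cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Example use:
#   set_nsca_option /usr/local/nagios/etc/nsca.cfg server_address 192.168.1.z
#   set_nsca_option /usr/local/nagios/etc/nsca.cfg debug 1
```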

Finally, we need to set the permissions for the cfg file. After saving and closing nsca.cfg, set the owner and group to the user and group that nsca runs as, then add read permission for the group:
[root@localhost] chown nagios:nagcmd /usr/local/nagios/etc/nsca.cfg
[root@localhost] chmod g+r /usr/local/nagios/etc/nsca.cfg
Once you are through the testing phase, it is highly recommended that you use a password with NSCA. Remember that you must enter the same password in BOTH nsca.cfg and your client's config (most likely send_nsca.cfg).
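Since a password mismatch between nsca.cfg and send_nsca.cfg is easy to miss, a quick comparison can help once you do set one. The sketch below assumes both files use an uncommented password= line; the function names are illustrative only. Note that two files with no password at all also compare as equal.

```shell
#!/bin/sh
# Print the value of the active (uncommented) password= line, if any.
nsca_password() {
    sed -n 's/^password=//p' "$1" | tail -n 1
}

# Succeed when both config files carry the same password value.
passwords_match() {
    [ "$(nsca_password "$1")" = "$(nsca_password "$2")" ]
}

# Example use:
#   passwords_match /usr/local/nagios/etc/nsca.cfg \
#                   /usr/local/nagios/etc/send_nsca.cfg || echo "password mismatch"
```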

Running NSCA

The next step would be to start up NSCA. Keep in mind that at this point, although it will be passing information to Nagios, Nagios doesn’t yet have any service to process this information, so it will simply throw out the data as irrelevant. This is where having debug information show up in our syslog is useful. So we will call the executable nsca and provide it with the location of its conf file.
[root@localhost] /usr/local/nagios/bin/nsca -c /usr/local/nagios/etc/nsca.cfg
NSCA should now be running. To test it, we have two options. First, I like to port scan, so I scan the machine to see if the specified port (5667 if left at the default) is open. This indicates that the program is in fact running and ready to receive information. If the port is not open, you may have some other issue, such as a firewall (iptables?) blocking it.

After determining that NSCA is running and accessible, we can try to send it some data. We can do this from the same machine, or from another host, using the send_nsca binary that was compiled at the same time as nsca. There are also plenty of third-party software titles that incorporate send_nsca. I'll show an example using send_nsca from the local machine. Assuming you haven't put a password on the nsca host yet, we don't need to configure anything for send_nsca.

send_nsca uses a tab-delimited format by default. Since we often cannot enter tabs at a GUI command prompt, the workaround is to create a file containing the packet and feed it to send_nsca. The format for a service check packet using NSCA is:
<host_name>[tab]<svc_description>[tab]<return_code>[tab]<plugin_output>[newline]
So create a plain text file named test with the following content:
localhost[tab]TestMessage[tab]0[tab]This is a test message.[return]
Save the file and then run send_nsca.
[root@localhost] $DOWNLOADPATH$/nsca-version-number/src/send_nsca -H localhost -c $DOWNLOADPATH$/nsca-version-number/sample-config/send_nsca.cfg < test
Host Name: 'localhost', Service Description: 'TestMessage', Return Code: '0', Output: 'This is a test message.'
May 23 15:46:49 localhost nsca[1731]: [ID 862360 daemon.error] End of connection…
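If typing literal tabs into the test file is awkward, printf can generate the packet instead. A minimal sketch, where nsca_packet is a hypothetical helper name and the four arguments are host name, service description, return code and plugin output:

```shell
#!/bin/sh
# Print one tab-delimited NSCA service check packet, ending in a newline.
nsca_packet() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4"
}

# Example use (paths as in the command above):
#   nsca_packet localhost TestMessage 0 'This is a test message.' |
#       $DOWNLOADPATH$/nsca-version-number/src/send_nsca -H localhost \
#       -c $DOWNLOADPATH$/nsca-version-number/sample-config/send_nsca.cfg
```

Piping the output straight into send_nsca avoids the intermediate file altogether, which is also what the monitoring script later in this article does.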

Implementation Part

Scenario.
We require two identically configured Nagios servers: one acts as the master Nagios server (192.168.1.x), and the second takes over if the master Nagios server has a problem.

Nagios servers.

Master Nagios Server – 192.168.1.x
Slave Nagios Server – 192.168.1.y

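We have scheduled a script called 'Master-nagios-monitoring.sh' with the following content:
#!/bin/bash
# Cron invocation: /root/scripts/Master-nagios-monitoring.sh > /dev/null
NAGIOS_SERVER_IP=192.168.1.x
MY_HOSTNAME=192.168.1.y
SEND_NSCA=/usr/local/nagios/bin/send_nsca
SEND_NSCA_CFG=/usr/local/nagios/etc/send_nsca.cfg
# Status code sent in the probe packet; $? is always 0 (OK) at this point
return_code=$?
# Send a tab-delimited packet to the master's NSCA daemon as a liveness probe.
# $2 (service description) and $4 (plugin output) are empty when cron runs
# the script without arguments.
/usr/bin/printf "%s\t%s\t%s\t%s\n" "$MY_HOSTNAME" "$2" "$return_code" "$4" | "$SEND_NSCA" -H "$NAGIOS_SERVER_IP" -c "$SEND_NSCA_CFG"
RETVAL=$?
if [ "$RETVAL" -ne 0 ]
then
    # Master NSCA unreachable: (re)start Nagios on this slave
    /etc/init.d/nagios stop
    /etc/init.d/nagios start
else
    # Master NSCA reachable: make sure Nagios is stopped on this slave
    /etc/init.d/nagios stop
fi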

This script runs on 192.168.1.y every minute and uses an NSCA query to 192.168.1.x to check whether the master Nagios server is running properly.

If Nagios on 192.168.1.x is not running, the script starts the Nagios service on 192.168.1.y. As soon as Nagios resumes service on 192.168.1.x, the next cron run shuts down Nagios on 192.168.1.y, which then goes back to monitoring the Nagios on 192.168.1.x.
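Note that the script as written stops and starts Nagios on the slave on every cron run for as long as the master is down. A possible refinement, not part of the original setup, is to act only when the state actually has to change. The decision table can be sketched as a small function; the name and the yes/no arguments are purely illustrative.

```shell
#!/bin/sh
# decide_action MASTER_UP SLAVE_RUNNING   (each argument is "yes" or "no")
# Prints the action the slave should take: start, stop or none.
decide_action() {
    master_up=$1
    slave_running=$2
    if [ "$master_up" = yes ]; then
        # Master is healthy: the slave's Nagios must not be running.
        if [ "$slave_running" = yes ]; then echo stop; else echo none; fi
    else
        # Master is down: the slave's Nagios must be running.
        if [ "$slave_running" = yes ]; then echo none; else echo start; fi
    fi
}

# Example use inside the monitoring script (status comes from the init script):
#   if /etc/init.d/nagios status > /dev/null 2>&1; then running=yes; else running=no; fi
#   case $(decide_action "$master" "$running") in
#       start) /etc/init.d/nagios start ;;
#       stop)  /etc/init.d/nagios stop ;;
#   esac
```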

Crontab entry.

*/1 * * * * /root/scripts/Master-nagios-monitoring.sh > /dev/null 2>&1
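To add this entry from a script without duplicating it on repeated runs, one option is to filter the existing crontab through a small helper. This is a sketch; merge_cron_entry is a hypothetical name, and it only touches the crontab when you run the pipeline shown in the usage comment.

```shell
#!/bin/sh
ENTRY='*/1 * * * * /root/scripts/Master-nagios-monitoring.sh > /dev/null 2>&1'

# Read a crontab on stdin and print it with ENTRY appended,
# unless an identical line is already present.
merge_cron_entry() {
    entry=$1
    current=$(cat)
    if printf '%s\n' "$current" | grep -Fxq -- "$entry"; then
        printf '%s\n' "$current"
    elif [ -n "$current" ]; then
        printf '%s\n%s\n' "$current" "$entry"
    else
        printf '%s\n' "$entry"
    fi
}

# Example use (as root on 192.168.1.y):
#   crontab -l 2>/dev/null | merge_cron_entry "$ENTRY" | crontab -
```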

192.168.1.x Changes

Note:
1) Add the following entry to the start section of /etc/init.d/nagios on 192.168.1.x (see the script file below):
/usr/local/nagios/bin/nsca -c /usr/local/nagios/etc/nsca.cfg
2) Add the following entry to the stop section on 192.168.1.x:
/bin/ps -aef | grep -i nagios | grep -v grep | awk '{print $2}' | xargs kill -9

/etc/init.d/nagios Entry

#!/bin/sh
#
# chkconfig: 345 99 01
# description: Nagios network monitor
#
# File : nagios
#
# Author : Jorge Sanchez Aymar (jsanchez@lanchile.cl)
#
# Changelog :
#
# 1999-07-09 Karl DeBisschop
# - setup for autoconf
# - add reload function
# 1999-08-06 Ethan Galstad
# - Added configuration info for use with RedHat's chkconfig tool
# per Fran Boon's suggestion
# 1999-08-13 Jim Popovitch
# - added variable for nagios/var directory
# - cd into nagios/var directory before creating tmp files on startup
# 1999-08-16 Ethan Galstad
# - Added test for rc.d directory as suggested by Karl DeBisschop
# 2000-07-23 Karl DeBisschop
# - Clean out redhat macros and other dependencies
#
# Description: Starts and stops the Nagios monitor
# used to provide network services status.
#

status_nagios ()
{

    if test ! -f $NagiosRun; then
        echo "No lock file found in $NagiosRun"
        return 1
    fi

    NagiosPID=`head -n 1 $NagiosRun`
    if test -x $NagiosCGI/daemonchk.cgi; then
        if $NagiosCGI/daemonchk.cgi -l $NagiosRun; then
            return 0
        else
            return 1
        fi
    else
        if ps -p $NagiosPID; then
            return 0
        else
            return 1
        fi
    fi

    return 1
}

killproc_nagios ()
{

    if test ! -f $NagiosRun; then
        echo "No lock file found in $NagiosRun"
        return 1
    fi

    NagiosPID=`head -n 1 $NagiosRun`
    kill $2 $NagiosPID
}

# Source function library
# Solaris doesn't have an rc.d directory, so do a test first
if [ -f /etc/rc.d/init.d/functions ]; then
. /etc/rc.d/init.d/functions
elif [ -f /etc/init.d/functions ]; then
. /etc/init.d/functions
fi

prefix=/usr/local/nagios
exec_prefix=${prefix}
NagiosBin=${exec_prefix}/bin/nagios
NagiosCfg=${prefix}/etc/nagios.cfg
NagiosLog=${prefix}/var/status.log
NagiosTmp=${prefix}/var/nagios.tmp
NagiosSav=${prefix}/var/status.sav
NagiosCmd=${prefix}/var/rw/nagios.cmd
NagiosVar=${prefix}/var
NagiosRun=${prefix}/var/nagios.lock
NagiosLckDir=/var/lock/subsys
NagiosLckFile=nagios
NagiosCGI=${exec_prefix}/sbin
Nagios=nagios

# Check that nagios exists.
test -f $NagiosBin || exit 0

# Check that nagios.cfg exists.
test -f $NagiosCfg || exit 0

# See how we were called.
case "$1" in

    start)
        echo "Starting network monitor: nagios"
        su -l $Nagios -c "touch $NagiosVar/nagios.log $NagiosSav"
        rm -f $NagiosCmd
        $NagiosBin -d $NagiosCfg

        /usr/local/nagios/bin/nsca -c /usr/local/nagios/etc/nsca.cfg

        if [ -d $NagiosLckDir ]; then touch $NagiosLckDir/$NagiosLckFile; fi
        sleep 1
        status_nagios nagios
        ;;

    stop)
        echo "Stopping network monitor: nagios"
        killproc_nagios nagios -9
        /bin/ps -aef | grep -i nagios | grep -v grep | awk '{print $2}' | xargs kill -9
        rm -f $NagiosLog $NagiosTmp $NagiosRun $NagiosLckDir/$NagiosLckFile $NagiosCmd
        ;;

    status)
        status_nagios nagios
        ;;

    restart)
        printf "Running configuration check..."
        $NagiosBin -v $NagiosCfg > /dev/null 2>&1;
        if [ $? -eq 0 ]; then
            echo "done"
            $0 stop
            $0 start
        else
            $NagiosBin -v $NagiosCfg
            echo "failed - aborting restart."
            exit 1
        fi
        ;;

    reload|force-reload)
        printf "Running configuration check..."
        $NagiosBin -v $NagiosCfg > /dev/null 2>&1;
        if [ $? -eq 0 ]; then
            echo "done"
            if test ! -f $NagiosRun; then
                $0 start
            else
                NagiosPID=`head -n 1 $NagiosRun`
                if status_nagios > /dev/null; then
                    printf "Reloading nagios configuration..."
                    killproc_nagios nagios -HUP
                    echo "done"
                else
                    $0 stop
                    $0 start
                fi
            fi
        else
            $NagiosBin -v $NagiosCfg
            echo "failed - aborting reload."
            exit 1
        fi
        ;;

    *)
        echo "Usage: nagios {start|stop|restart|reload|force-reload|status}"
        exit 1
        ;;

esac

# End of this script

Known Issue.
Because this NSCA-based failover relies on a crontab entry that runs once per minute, there can be a short overlap: if Nagios on 192.168.1.x resumes service while Nagios on 192.168.1.y is running, the Nagios processes on 192.168.1.y are only killed when crontab reaches its next one-minute cycle.
