Installing Nagios on Ubuntu Server 11.10 then Monitoring Windows and Exchange Servers–Part 4–Custom Exchange Monitoring Scripts…

Ok, I have to admit that this series has been a lot of fun (well, for me anyway), and this part has been the most fun of all, because it meant developing some sample PowerShell scripts for monitoring Exchange 2010 with Nagios!

This is one of the real boons of Nagios: by developing your own PowerShell monitoring scripts that return values to the Nagios Core interface, you are in essence using Nagios as a monitoring “wrapper”, which gives you the flexibility to monitor pretty much any facet of your Exchange systems.

Just as a quick overview of the series for those of you who are just joining us:

  • In Part 1
    I covered how you can install Nagios Core 3.3.1 onto an existing Ubuntu 11.10 server within your environment
  • In Part 2
    I covered how you can set up basic monitoring on your first Windows Server using NSClient++, and how you could set up a basic “check_nt” monitoring service
  • In Part 3
    I covered how you can install the NRPE daemon onto your Nagios Server, and then use NSClient++ to execute a basic PowerShell script and report the output back into the Nagios interface

Custom PowerShell for Exchange 2010 for use with Nagios

There are a couple of key things to remember when developing custom PowerShell scripts for use with Nagios:

  • You need to structure your scripts so that they always return an exit code, preceded by a Write-Host that adds a description of the result (more on this later)
  • If you operate an environment that has a mix of DAG and non-DAG servers, or servers that hold multiple roles (MBX, CAS, HT), code your scripts so that they can handle the absence of specific features. For example, if you develop a script that is designed to run on DAG-based servers, you should code in logic that can handle DAG features not being available. As a sample, see the “GetDAGnfo.ps1” script, which is available for download below
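
The two points above can be sketched roughly as follows. This is only an illustration and not the contents of the downloadable scripts: the function name `Get-DagCheckResult` and its messages are hypothetical, and the `$null` passed in stands in for whatever an Exchange cmdlet would return on a server that is not in a DAG.

```powershell
# Hypothetical helper: build the description and exit code from whatever
# the DAG lookup returned. $DagInfo is expected to be $null on a non-DAG server.
function Get-DagCheckResult($DagInfo) {
    if ($null -eq $DagInfo) {
        # Feature not present: report that cleanly instead of erroring out
        return @{ Message = "OK: This server is not a member of a DAG"; Code = 0 }
    }
    return @{ Message = "OK: Member of DAG " + $DagInfo.Name; Code = 0 }
}

$result = Get-DagCheckResult $null     # $null stands in for "no DAG found"
Write-Host $result.Message             # description first
# exit $result.Code                    # then the exit code (uncomment inside a real script)
```

Keeping the decision in a small function like this makes it possible to try the logic on any machine before the Exchange cmdlets are involved.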

PowerShell exit codes for Nagios

As mentioned above, it is important that any PowerShell you write for use with Nagios returns an exit code.
PowerShell has the ability to terminate execution with an exit code (this is demonstrated here); within your own scripts you can return any numeric code that you want by using the “exit” statement.

That being said, Nagios supports four exit status codes, which your scripts should conform to. These are as follows:

[Image: PowershellExitCodesForNagios]
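
For reference, the four status codes in the standard Nagios plugin convention are 0 (OK), 1 (WARNING), 2 (CRITICAL) and 3 (UNKNOWN). A minimal sketch of how a script might name them follows; `$NagiosStates` and `Get-NagiosStateName` are hypothetical names chosen for this example and are not part of Nagios or NSClient++.

```powershell
# Nagios plugin return codes, keyed by state name
$NagiosStates = @{
    OK       = 0
    WARNING  = 1
    CRITICAL = 2
    UNKNOWN  = 3
}

# Hypothetical helper: translate a numeric exit code back into its state name
function Get-NagiosStateName([int]$Code) {
    foreach ($name in $NagiosStates.Keys) {
        if ($NagiosStates[$name] -eq $Code) { return $name }
    }
    return "UNKNOWN"   # anything outside 0-3 is treated as UNKNOWN in this sketch
}

# Typical use at the end of a check script:
# Write-Host "CRITICAL: something needs attention"
# exit $NagiosStates.CRITICAL
```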

Exit codes 0 and 2 are reflected within the Nagios core interface like so:

Exit code 0:

[Image: NagiosCore024]

Exit code 2:

[Image: NagiosCore025]

Ok, so what is the PowerShell script behind the backup monitoring?

The script itself is pretty straightforward, and currently only checks the LastFullBackup value of the databases that are resident on the monitored Exchange server.
Within the script there is a variable called $ThreshHold, which sets the number of days that can elapse without a full backup before the script returns an error to the Nagios interface.

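# Load the Exchange 2010 cmdlets into this session
Add-PSSnapin Microsoft.Exchange.Management.PowerShell.E2010
$localServerName = Get-WmiObject -Class Win32_ComputerSystem | Select Name
# Number of days that may pass without a full backup before CRITICAL is reported
$ThreshHold = 2
# Start from a known state so the final checks never read unset variables
$Output = ""
$statFlag = 0
# Databases (with status details) that live on this server
$Results = Get-MailboxDatabase -Server $localServerName.Name -Status | Select Identity,Server,LastFullBackup | where {$_.Server -eq $localServerName.Name}

foreach($itm in $Results){
    if($itm -eq $null){
        # No databases were returned for this host
        $Output = "OK: No Databases are active on this host"
        $NagiosResult = 0
    }else{
        if($itm.LastFullBackup -eq $null){
            # Never backed up: 9999 is used as a sentinel value
            $lastBackupSeed = 9999
        }else{
            # Time elapsed since the last full backup
            $lastBackupSeed = New-TimeSpan $($itm.LastFullBackup) $(Get-Date)
        }
        if($lastBackupSeed.days -gt $ThreshHold -or $lastBackupSeed -eq 9999){
            $Res = "CRITICAL: Database Backup out of Schedule: " + $itm.Identity
            $Output += $Res + " "
            $statFlag = 1
        }else{
            $Output += "OK: Database: " + $itm.identity + " has a recent backup" + " "
        }
    }
}
# If the loop never ran, still give Nagios a description
if($Output -eq ""){
    $Output = "OK: No Databases are active on this host"
}
# Description first, then the exit code that Nagios acts on
Write-Host $Output
if($statFlag -eq 1){
    exit 2
}else{
    exit 0
}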
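
To see how the threshold behaves without touching Exchange, the date comparison can be pulled out into a small function. This is a sketch only: `Test-BackupAge` is a hypothetical name, and the dates in the usage lines are made up for illustration.

```powershell
# Hypothetical helper: return a Nagios exit code for one database,
# based on when it was last fully backed up.
function Test-BackupAge($LastFullBackup, [int]$ThresholdDays, [datetime]$Now) {
    if ($null -eq $LastFullBackup) {
        return 2    # never backed up: CRITICAL
    }
    $age = New-TimeSpan -Start $LastFullBackup -End $Now
    if ($age.Days -gt $ThresholdDays) {
        return 2    # older than the threshold: CRITICAL
    }
    return 0        # recent enough: OK
}

# Example calls with a threshold of 2 days
$now = [datetime]"2012-01-05T12:00:00"
Test-BackupAge ([datetime]"2012-01-01T12:00:00") 2 $now   # 4 days old
Test-BackupAge ([datetime]"2012-01-04T12:00:00") 2 $now   # 1 day old
Test-BackupAge $null 2 $now                               # no backup recorded
```

Note that the Days property of a TimeSpan counts whole days only, so a backup that is 2 days and 23 hours old still passes a threshold of 2.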

Setting your scripts up

As detailed in part 3, there are 5 steps that you need to follow in order to get your Exchange monitoring scripts reporting into the Nagios interface. These are as follows:

  1. On your Exchange Server where you have installed NSClient++, copy (or save) the scripts into the NSClient++ “Scripts” folder (this should be “C:\Program Files\NSClient++\Scripts”).
  2. Edit the NSC.ini file (this should be located in “C:\Program Files\NSClient++\”).
    For this example we will be using the “Exchange2010BackupMonitoring.ps1” script.
    Within the NSC.ini find the section entitled [NRPE Handlers], then add the following command:

check_exBackup=cmd /c echo scripts\Exchange2010BackupMonitoring.ps1; exit($lastexitcode) | powershell.exe -command -

    Save the file, and then restart the “NSClient++” service from within the Windows Services Manager.
  3. From your Nagios Server using FileZilla, download a copy of commands.cfg (located in /usr/local/nagios/etc/objects).
    Open it within Notepad++ and add in the following command definition:

define command{
    command_name    check_exbackup
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -t 120 -c check_exBackup
    }

    Save the file, and then upload it back to your Nagios Server.
  4. From your Nagios Server using FileZilla, download a copy of windows.cfg (located in /usr/local/nagios/etc/objects).
    Open it within Notepad++ and add in the following service definition for your Exchange host. The check_command value must match the command_name that you defined in commands.cfg:

define service{
    use                    generic-service
    host_name              prod-ex2010-01.prepad.local
    service_description    Exchange Database Backups
    check_command          check_exbackup
    }

    Save the file, and then upload it back to your Nagios Server.
  5. Log on to your Nagios Server using PuTTY and enter the following command at the command line:

sudo /etc/init.d/nagios restart
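
Before relying on the Nagios side, it can help to confirm on the Exchange server itself that the script hands an exit code back to its caller. The lines below are a rough sketch of one way to do that from a PowerShell prompt; the path assumes the default NSClient++ install location used in step 1, and this is not how NSClient++ itself invokes the script.

```powershell
# Assumption: default NSClient++ install path from step 1
$script = "C:\Program Files\NSClient++\Scripts\Exchange2010BackupMonitoring.ps1"

# Run the script in a child PowerShell process so that its "exit" statement
# ends that process rather than the current session
powershell.exe -NoProfile -File $script

# $LASTEXITCODE holds the exit code of the last native program that ran
Write-Host "Exit code seen by the caller: $LASTEXITCODE"
```

If the description is printed and the exit code is 0 or 2 as expected, any remaining problems are more likely to lie in the NSC.ini handler or the Nagios definitions.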

Sample Monitoring Script Downloads

Below I have provided 5 sample PowerShell scripts which you are free to use / modify as you see fit. These should be downloaded to your Exchange Server into the NSClient++ Scripts directory (“C:\Program Files\NSClient++\Scripts”).
Before you use them, you will need to ensure that you have created the relevant service and command definitions (described above).

[ Exchange2010BackupMonitoring.ps1 – 1.3 KB ]
[ Exchange2010ContentIndexMonitor.ps1 – 1 KB ]
[ GetDAGnfo.ps1 – 1 KB ]
[ GetDAGReplicationStatus.ps1 – 1 KB ]
[ GetExchangeLocalDBs.ps1 – 1 KB ]

Sample Output within Nagios

The following is sample output from the above monitoring scripts within my environment. Naturally the ideal situation is to have all your statuses reporting “OK” :) As you can see, I have a problem with backups!

[Image: NagiosCore026]

In part 5

In part 5 I will be covering the following:

  • Grouping Servers and Services into Categories
  • Troubleshooting Tips
  • Sample Exchange Service Files
  • Reporting

3 thoughts to “Installing Nagios on Ubuntu Server 11.10 then Monitoring Windows and Exchange Servers–Part 4–Custom Exchange Monitoring Scripts…”

  1. Hi Andy, thanks for these awesome how-to’s. I have been doing them like this and they work perfectly. I just have one question and I hope I am not being too forward. Will you be doing a how-to to show graphs, maybe like MRTG?

    Kind Regards
    Brad

    1. Hiya Brad. Glad that you like the series. Yes I do intend to do a part on extended monitoring – probably in part 6. I have part 5 to do next plus a bonus spin off two parter.
      Cheers

  2. Hey Andy!

    Thank you so much for all your work in this. I was trying to set up Nagios on Ubuntu 11.10 and found your blog, and it’s been VERY helpful. I’m using Nagios to monitor our Exchange environment too, so your posts have been amazing.

    I do have one question, as I’m newer to scripting and can’t figure it out. Your Exchange Database Repl script returns several “OK”s for the same servers, and I see in your screen shot it does the same in your environment. Why is that?

    Thanks again for all your hard work! I’ve been following along step by step, and you’ve been such a big help.
