check_netappfiler - Nagios plugin

for Network Appliance (NetApp) Filer/FAS Systems

OBSOLETE!

You're looking at the homepage of the old "check_netappfiler" plugin.

If you want to start monitoring your NetApp FAS with Nagios/Icinga/Shinken, please go to http://oss.teamix.org/projects/monitoringplugins/wiki/Check_naf

No new features will be added here, only fixes!

OUTDATED! Latest version: 0.1.x (2011-01-11)

Features & PerfData (with PNP)

At the moment the plugin can monitor the following "subsystems". Subsystems without performance data are marked "No PerfData". Some PNP templates are included (PNP/templates/).

- global: Global system status. No PerfData.
- cpu: CPU usage.
- environment: Monitors fans, power supplies and temperature. No PerfData.
- nvram: NVRAM battery state. No PerfData.
- sparedisk and/or faileddisk (disk)
- cluster: Cluster state (thanks to Rico Glöckner for fixing this ;-). No PerfData.
- snapmirror: State of SnapMirrors, needs (more) testing! No PerfData.
- cacheage: Age of cache in minutes.
- vol: Usage of volumes/aggregates, including snapshot (reserve).
- fs (obsolete): No example! Please use the "vol" subsystem instead!
- cifs-stats: CIFS stats (patch from and thanks to Jochen Bartl).
- cifs-users: Connected CIFS users (patch from and thanks to Jochen Bartl).

PNP templates are in "PNP/templates"

Usage on command line

nagios:~% ./check_netappfiler_netsnmp.py -H mytoaster.office.lan -s global
NETAPP(global) OK - FAS3140: The system's global status is normal. 

nagios:~% ./check_netappfiler_netsnmp.py -H mytoaster.office.lan -s cpu
NETAPP(cpu) OK - CPU Busy: 0%, Context Switches: 22440091, CPU Architecture: amd64_FIXME|nacpu=0%;80;90;0;100 nacs=22440091c

nagios:~% ./check_netappfiler_netsnmp.py -H mytoaster.office.lan -s environment
NETAPP(environment) OK - Filer is happy with his environment ;-)

nagios:~% ./check_netappfiler_netsnmp.py -H mytoaster.office.lan -s nvram
NETAPP(nvram) OK - NVRAM battery status is "ok"

nagios:~% ./check_netappfiler_netsnmp.py -H mytoaster.office.lan -s sparedisk
NETAPP(sparedisk) OK - Disk stats: 14 total, 3 active, 11 spare, 0 failed|nadisk_total=14;;;0; nadisk_active=3;;;0;14 nadisk_spare=11;0;0;0;14 nadisk_failed=0;;;0;14

nagios:~% ./check_netappfiler_netsnmp.py -H mytoaster.office.lan -s faileddisk
NETAPP(faileddisk) OK - Disk stats: 14 total, 3 active, 11 spare, 0 failed|nadisk_total=14;;;0; nadisk_active=3;;;0;14 nadisk_spare=11;;;0;14 nadisk_failed=0;0;0;0;14

nagios:~% ./check_netappfiler_netsnmp.py -H mytoaster.office.lan -s cifs-users
NETAPP(cifs-users) OK - 2 connected users|cifs_users=2;;;0;

nagios:~% ./check_netappfiler_netsnmp.py -H mytoaster.office.lan -s cifs-stats
NETAPP(cifs-stats) OK - OK|total_ops=2490c;;;0; total_calls=2583c;;;0; bad_calls=0c;;;0; get_attrs=899c;;;0; reads=118c;;;0; writes=54c;;;0; locks=8c;;;0; opens=580c;;;0; dirops=602c;;;0; others=229c;;;0;

nagios:~% ./check_netappfiler_netsnmp.py -H mytoaster.office.lan -s cluster
NETAPP(cluster) OK - Cluster settings: enabled, state: canTakeover, interconnect state: up

nagios:~% ./check_netappfiler_netsnmp.py -H mytoaster.office.lan -s snapmirror
NETAPP(snapmirror) OK - SnapMirror is on

nagios:~% ./check_netappfiler_netsnmp.py -H mytoaster.office.lan -s snapmirror -f 1
NETAPP(snapmirror) OK - Snapmiror state is 'snapmirrored'. Source: 'otherfas:/vol/foo/-', Destination: 'mytoaster:/vol/bar/noqtreedata', Status: 'idle'

nagios:~% ./check_netappfiler_netsnmp.py -H mytoaster.office.lan -s cacheage
NETAPP(cacheage) OK - Cache Age 13 minutes|nacacheage=13;;;0;

nagios:~% ./check_netappfiler_netsnmp.py -H mytoaster.office.lan -s fs -f aggr0 -w 50 -c 75
NETAPP(fs) CRITICAL - aggregate "aggr0": 95% used (112993676kB out of 119052776kB), INodes: 0% used, status: mounted|nafs_aggr0=115705524224B;60955021312;91432531968;0;121910042624

nagios:~% ./check_netappfiler_netsnmp.py -H mytoaster.office.lan -s fs -f /vol/vol0 -w 50 -c 75
NETAPP(fs) OK - flexibleVolume "/vol/vol0/": 0% used (300108kB out of 90390400kB), INodes: 0% used, status: mounted|nafs_/vol/vol0/=307310592B;46279884800;69419827200;0;92559769600

nagios:~% ./check_netappfiler_netsnmp.py -H mytoaster.office.lan -s fs -f 1 -w 50 -c 75
NETAPP(fs) CRITICAL - aggregate "aggr0": 95% used (112993676kB out of 119052776kB), INodes: 0% used, status: mounted|nafs_aggr0=115705524224B;60955021312;91432531968;0;121910042624

nagios:~% ./check_netappfiler_netsnmp.py -H mytoaster.office.lan -s fs -f 2 -w 50 -c 75
NETAPP(fs) OK - aggregate "aggr0/.snapshot": 7% used (436892kB out of 6265932kB), INodes: 0% used, status: mounted|nafs_aggr0/.snapshot=447377408B;3208157184;4812235776;0;6416314368

nagios:~% ./check_netappfiler_netsnmp.py -H mytoaster.office.lan -s fs -f 3 -w 50 -c 75
NETAPP(fs) OK - flexibleVolume "/vol/vol0/": 0% used (300108kB out of 90390400kB), INodes: 0% used, status: mounted|nafs_/vol/vol0/=307310592B;46279884800;69419827200;0;92559769600

nagios:~% ./check_netappfiler_netsnmp.py -H mytoaster.office.lan -s fs -f 4 -w 50 -c 75
NETAPP(fs) OK - flexibleVolume "/vol/vol0/.snapshot": 0% used (30864kB out of 22597600kB), INodes: 0% used, status: mounted|nafs_/vol/vol0/.snapshot=31604736B;11569971200;17354956800;0;23139942400

nagios:~% ./check_netappfiler_netsnmp.py -H mytoaster.office.lan -s vol -f aggr0 -w 50% -c 75%
NETAPP(vol) CRITICAL - aggregate "aggr0": 94.9% used (112993676kB out of 119052776kB), INodes: 0% used, status: mounted|navoldata_aggr=115705524224B;60955021312;91432531968;0;121910042624 navolsnap_aggr=447377408B;;;0;6416314368 nadatasize_aggr=121910042624B nasnapsize_aggr=6416314368B

nagios:~% ./check_netappfiler_netsnmp.py -H mytoaster.office.lan -s vol -f /vol/vol0 -w 50% -c 75%
NETAPP(vol) OK - flexibleVolume "/vol/vol0/": 0.3% used (300108kB out of 90390400kB), INodes: 0% used, status: mounted|navoldata_vol0=307310592B;46279884800;69419827200;0;92559769600 navolsnap_vol0=31604736B;;;0;23139942400 nadatasize_vol0=92559769600B nasnapsize_vol0=23139942400B

nagios:~% ./check_netappfiler_netsnmp.py -H mytoaster.office.lan -s vol -f 3 -w 50% -c 75%
NETAPP(vol) OK - flexibleVolume "/vol/vol0/": 0.3% used (300108kB out of 90390400kB), INodes: 0% used, status: mounted|navoldata_vol0=307310592B;46279884800;69419827200;0;92559769600 navolsnap_vol0=31604736B;;;0;23139942400 nadatasize_vol0=92559769600B nasnapsize_vol0=23139942400B
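
The part after the "|" in the output lines above is Nagios performance data of the form label=value[unit];warn;crit;min;max. The following Python sketch shows one way such a line could be split up. It is not part of the plugin: the names parse_perfdata and _num are made up for illustration, and it assumes labels without spaces and plain numeric thresholds, as in the examples above.

```python
import re

# value followed by an optional unit such as "%", "B" or "c"
VALUE_RE = re.compile(r"^(-?\d+(?:\.\d+)?)([^\d;]*)$")


def _num(field):
    """Convert a warn/crit/min/max field to float; empty fields become None."""
    return float(field) if field else None


def parse_perfdata(line):
    """Hypothetical helper: return (status text, dict of PerfData entries)."""
    text, _, perf = line.partition("|")
    metrics = {}
    for item in perf.split():
        label, _, rest = item.partition("=")
        fields = rest.split(";")
        match = VALUE_RE.match(fields[0])
        if match is None:
            continue  # skip anything that does not look like value[unit]
        # Pad to value;warn;crit;min;max so that missing fields become "".
        fields += [""] * (5 - len(fields))
        metrics[label] = {
            "value": float(match.group(1)),
            "unit": match.group(2),
            "warn": _num(fields[1]),
            "crit": _num(fields[2]),
            "min": _num(fields[3]),
            "max": _num(fields[4]),
        }
    return text.strip(), metrics


# PerfData taken from the "vol -f aggr0 -w 50% -c 75%" example above.
text, metrics = parse_perfdata(
    'NETAPP(vol) CRITICAL - aggregate "aggr0"|'
    "navoldata_aggr=115705524224B;60955021312;91432531968;0;121910042624"
)
data = metrics["navoldata_aggr"]
# The percentage thresholds show up in PerfData as bytes relative to "max".
print(data["warn"] / data["max"], data["crit"] / data["max"])
```

In that example the warning and critical values in the PerfData are the "-w 50%" and "-c 75%" thresholds converted to bytes of the total size.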

Complete ChangeLog:

2011-01-11:
===========
- Move old scripts "check_netappfiler(|_netsnmp).py" to old/
- Rename "check_naf.py" to "check_netappfiler.py"
- Adjust test scripts "testall*.sh"

2009-03-19:
===========
- Re-Added "version" subsystem due to user request
- Added check if SnapMirror is licensed
- Check if vol/fs is found or not

2009-03-17:
===========
- Fixes to "snapmirror" subsystem - did it ever work?!?

2008-12-23:
===========
- check_netappfiler_netsnmp.py
  + Experimental snmpwalk implementation
    * Delete lines 237&238 of Debian's libsnmp-python in
      /usr/share/pycentral/libsnmp-python/site-packages/netsnmp/client.py
      (or in your own "client.py"); they print silly debugging information
  + New subsystem "cp" (consistency points) for stats&graphing
- PNP: template added for "cp"

2008-12-22:
===========
- New subsystem "cacheage", added to testall*
- PNP: Added and fixed some templates

2008-12-15:
===========
- New PNP templates for "cpu", "sparedisk" and (obsolete) "fs"
- PNP: check_command definition file

2008-12-11:
===========
- Small fix in check_netappfiler_netsnmp.py
- Small fixes in testall*.sh

2008-12-02:
===========
- Added some caching (with "shelve") for FSIDs
  + "--cache FILENAME" option added
  + Reduce SNMPGETs on target, (much) faster now!!1!
  + verifies cached fsid
  + Use one cache file per host, e.g.:
    ... --cache /var/tmp/nafscache_$HOSTADDRESS$ ...
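
The caching idea in this entry could look roughly like the sketch below. It is not the plugin's actual code: the function cached_fsid and the callbacks resolve and verify are hypothetical stand-ins for the expensive FSID lookup and for the cheap check of a cached FSID.

```python
import shelve


def cached_fsid(cache_file, volume, resolve, verify):
    """Return the FSID for `volume`, using a shelve file as a cache.

    resolve(volume) -> fsid      expensive lookup (hypothetical callback)
    verify(volume, fsid) -> bool cheap check that the cached fsid still
                                 belongs to this volume (hypothetical callback)
    """
    db = shelve.open(cache_file)  # created on first use
    try:
        fsid = db.get(volume)
        if fsid is None or not verify(volume, fsid):
            # Cache miss or stale entry: do the expensive lookup and store it.
            fsid = resolve(volume)
            db[volume] = fsid
        return fsid
    finally:
        db.close()
```

With one cache file per host, as recommended above, the volume name or path alone can serve as the key.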

2008-11-21:
===========
- Some code cleanup
- New subsystem "vol"
  + More accurate check of volume incl. snapshot
  + Same parameters as "fs", see testall*.s
  + You *must* provide "-w" and "-c"!
  + PNP-Template (colors WILL change in the future)
- New "plugin": check_netappfiler_netsnmp.py
  You need the Python bindings of net-snmp!
  Debian: Use Lenny or ...
          deb http://people.teamix.net/~svelt/debian/etch/net-snmp/ ./
          in /etc/apt/sources.list for Etch

2008-10-15:
===========
- Fix for cluster check (Rico Glöckner, thanks!)

2008-09-17:
===========
- Heavy code cleaning
- Long options are no longer available!
- Initial support for ONTAP 7.3:
  + SNMP v2c and v3! w00t! Counter64!
  + New file: README.v3
- Last fix for >2TB volumes only works for volumes <4TB
  + fix for the fix, 2 additional SNMPGETs, more calculating :-/ or
  + just use SNMP v2c or v3 - if you have ONTAP 7.3 :-)
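
The 2TB and 4TB limits in this entry and in the 2008-07-29 entry below can be illustrated with a little arithmetic. The sketch assumes (this is an assumption, not taken from the plugin source) that sizes arrive in kB as 32-bit signed integers: 2^31 kB is 2TB, so larger sizes wrap to negative numbers, and reading the value as unsigned only helps below 2^32 kB, i.e. 4TB. Combining a separate high and low 32-bit word is one possible reading of the "2 additional SNMPGETs"; all function names are hypothetical.

```python
def to_signed32(kb):
    """Simulate what a 32-bit signed field would report for a kB count."""
    kb &= 0xFFFFFFFF
    return kb - 2**32 if kb >= 2**31 else kb


def unwrap_kb(raw):
    """Reinterpret a signed 32-bit kB value as unsigned (correct below 4TB)."""
    return raw + 2**32 if raw < 0 else raw


def combine_kb(high, low):
    """Build a kB count from two 32-bit words (hypothetical high/low values)."""
    return (high << 32) | (low & 0xFFFFFFFF)


three_tb = 3 * 2**30  # 3TB expressed in kB
five_tb = 5 * 2**30   # 5TB expressed in kB

print(to_signed32(three_tb))             # negative: wrapped past 2TB
print(unwrap_kb(to_signed32(three_tb)))  # back to the real 3TB value
print(unwrap_kb(to_signed32(five_tb)))   # still wrong: wrapped past 4TB
print(combine_kb(five_tb >> 32, five_tb & 0xFFFFFFFF))  # correct again
```

With 64-bit counters (Counter64, SNMP v2c or v3) none of this arithmetic is needed.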

2008-07-29:
===========
- Fix for volumes > 2TB. Thank god for 32 bit signed integers :-/

2008-04-24:
===========
- more work on "snapmirror", testing needed!

2008-04-23:
===========
- New EXPERIMENTAL subsystem "cluster", please test and send feedback
- Start of new subsystem "snapmirror", more work to do!

2008-04-22:
===========
- Subsystem "fs" now accepts path names to "-f"
- New subsystem "faileddisk"
- Subsystem "sparedisk" alias to (old) "disk"
- New Subsystems "cifs-users" and "cifs-stats"
  Thanks to Jochen Bartl!