Date: Wed, 8 Dec 2010 09:07:03 GMT
From: Marian Jamrich <jamrich.majo@gmail.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: ports/152916: [new port] net-mgmt/nagios-check_hdd_health , is a Nagios plug-in written in shell to check your HDD health using SmartMonTools
Message-ID: <201012080907.oB8973tD062888@red.freebsd.org>
Resent-Message-ID: <201012080910.oB89ABNO000565@freefall.freebsd.org>
>Number:         152916
>Category:       ports
>Synopsis:       [new port] net-mgmt/nagios-check_hdd_health , is a Nagios plug-in written in shell to check your HDD health using SmartMonTools
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-ports-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          change-request
>Submitter-Id:   current-users
>Arrival-Date:   Wed Dec 08 09:10:11 UTC 2010
>Closed-Date:
>Last-Modified:
>Originator:     Marian Jamrich
>Release:        8.2 prerelease
>Organization:
>Environment:
>Description:
check_hdd_health is a Nagios plug-in written in shell to check your HDD health
using SmartMonTools.
This script checks the following S.M.A.R.T. values of the HDD:
- Spin Retry Count
- Reallocated Sector Ct
- Reallocated Event Count
- Current Pending Sector
- Offline Uncorrectable
- Total health test
>How-To-Repeat:
>Fix:

Patch attached with submission follows:

# This is a shell archive.  Save it in a file, remove anything before
# this line, and then unpack it by entering "sh file".  Note, it may
# create directories; files and directories will be owned by you and
# have default permissions.
#
# This archive contains:
#
#	nagios-check_hdd_health
#	nagios-check_hdd_health/src
#	nagios-check_hdd_health/src/check_hdd_health
#	nagios-check_hdd_health/Makefile
#	nagios-check_hdd_health/pkg-plist
#	nagios-check_hdd_health/distinfo
#	nagios-check_hdd_health/pkg-descr
#
echo c - nagios-check_hdd_health
mkdir -p nagios-check_hdd_health > /dev/null 2>&1
echo c - nagios-check_hdd_health/src
mkdir -p nagios-check_hdd_health/src > /dev/null 2>&1
echo x - nagios-check_hdd_health/src/check_hdd_health
sed 's/^X//' >nagios-check_hdd_health/src/check_hdd_health << '3e0fb3b894d73c34d2fa24438a0e6a90'
X#!/bin/sh
X#
XPATH=/sbin:/usr/sbin:/bin:/usr/bin:/usr/local/sbin:/usr/local/bin
X
XST_OK=0
XST_WR=1
XST_CR=2
XST_UN=3
X
Xsmartctl=$(which smartctl)
X
X## Smartmontools
XSMT=Smartmontools
X
X# Plugin name
XPROGNAME=`basename $0`
X
X# Version
XVERSION="Version 1.0"
X
X# Author
XAUTHOR="Marian Jamrich"
X
XTMPFILE=/tmp/smart.nagios.$$
X
X# Clean up when done or when aborting
Xtrap "rm -f ${TMPFILE}" 0 1 2 3 15
X
X# Print the plugin name and version (called by the -V option)
Xprint_version() {
X    echo "$PROGNAME $VERSION $1"
X}
X
Xmini_help() {
X    echo "Usage $0 --device $device --without [src rsc rec cps ou]"
X}
X
Xprint_help() {
X    clear;
X    echo "*********************************************************************************"
X    echo "* $PROGNAME $VERSION $1""($AUTHOR) <jamrich.majo@gmail.com> (2010) *"
X    echo "*********************************************************************************"
X    echo "This is a Nagios plugin to check HDD health from S.M.A.R.T. by Smartmontools."
X    echo '
XThe S.M.A.R.T. attributes are specific properties (parameters) of various parts of a disk.
XS.M.A.R.T. uses attributes to monitor the disk condition and to analyze its reliability.
X
XThe script checks the following S.M.A.R.T. properties of the HDD (if your HDD supports them):
X
X** Spin Retry Count (src) **
XCount of retry of spin start attempts. This attribute stores a total count of the spin start attempts to reach the fully operational speed (under the
Xcondition that the first attempt was unsuccessful). A decrease of this attribute value is a sign of problems in the hard disk mechanical subsystem.
X
X** Reallocated Sector Count (rsc) **
XCount of reallocated sectors. When the hard drive finds a read/write/verification error, it marks this sector as "reallocated" and transfers data to a
Xspecial reserved area (spare area). This process is also known as remapping and "reallocated" sectors are called remaps. This is why, on modern hard
Xdisks, you cannot see "bad blocks" while testing the surface - all bad blocks are hidden in reallocated sectors.
X
X** Reallocated Event Count (rec) **
XCount of remap operations (transferring data from a bad sector to a special reserved disk area - spare area). The raw value of this attribute shows the
Xtotal number of attempts to transfer data from reallocated sectors to a spare area. Unsuccessful attempts are counted as well as successful.
X
X** Current Pending Sector (cps) **
XCurrent count of unstable sectors (waiting for remapping). The raw value of this attribute indicates the total number of sectors waiting for remapping.
XLater, when some of these sectors are read successfully, the value is decreased. If errors still occur when reading some sector, the hard drive will try
Xto restore the data, transfer it to the reserved disk area (spare area) and mark this sector as remapped. If this attribute value stays above zero, it
Xindicates that the quality of the corresponding surface area is low.
X
X** Offline Uncorrectable (ou) **
XQuantity of uncorrectable errors. The raw value of this attribute indicates the total number of uncorrectable errors when reading/writing a sector.
XA rise in the value of this attribute indicates that there are evident defects of the disk surface and/or there are problems in the hard disk drive
Xmechanical subsystem.
X
X** Total health test (pass) **
XThis test is provided by Smartmontools. If the overall disk state is healthy, Smartmontools reports "PASSED".
X'
X    echo "Nagios states:"
X    echo
X    echo "OK - if all values are \"0\"."
X    echo "Warning - if one or both of the values \"Spin Retry Count\" and \"Reallocated Event Count\" are between 1 and 9."
X    echo "Critical - if some value is greater than \"0\" except \"Spin Retry Count (>=10)\" and \"Reallocated Event Count (>=10)\"."
X    echo -e "\n---------------------------------------------------------------------"
X    echo "Usage:"
X    echo "$0 --device /dev/ad0 [ --without [src rsc rec cps ou]]"
X    echo "---------------------------------------------------------------------"
X    exit $ST_UN
X}
X
Xcase "$1" in
X    --help|-h|--usage|-u)
X        print_help
X        exit $ST_UN
X        ;;
X    -d | --device)
X        device=$2
X        ;;
X    -V)
X        print_version
X        exit
X        ;;
X    *)
X        echo "Unknown argument: $1"
X        echo "For more information please try -h or --help!"
X        exit $ST_UN
X        ;;
Xesac
Xshift
X
Xtest -z "$device" && echo -e "\nYou forgot to define a device! Please try \"-h\" or \"--help\" for help." && exit $ST_UN
Xtest `uname` != "FreeBSD" && echo "This plugin is only for FreeBSD." && exit $ST_UN
X
Xif [ ! -e "$device" ]; then
X    echo
X    echo "Unknown device \"$device\"!"
X    exit $ST_UN
Xfi
X
Xif [ -z "$smartctl" ]; then
X    echo -e "\n$SMT is not installed. Please install it from http://smartmontools.sourceforge.net or with pkg_add -r \"smartmontools\"..."
X    exit $ST_UN
Xfi
X
X$smartctl -a $device > ${TMPFILE}
XSMART_SUPPORT=`awk '/SMART support is/ {print $4}' ${TMPFILE} | tail -n 1`
X
Xif [ "${SMART_SUPPORT}" = "Unavailable" ]; then
X    echo -e "\nS.M.A.R.T support is Unavailable for $device !!! You should enable it with \"smartctl -s on $device\"."
X    exit $ST_UN
Xelif [ "${SMART_SUPPORT}" != "Enabled" ]; then
X    echo -e "\nS.M.A.R.T support may not be enabled in $SMT! Please run \"smartctl -s on $device\" to turn it on. Alternatively, the device may not support S.M.A.R.T."
X    exit $ST_UN
Xfi
X
X## start S.M.A.R.T test and set variables
Xsrc=`awk '/Spin_Retry_Count/ {print $10}' ${TMPFILE} `
Xrsc=`awk '/Reallocated_Sector_Ct/ {print $10}' ${TMPFILE} `
Xrec=`awk '/Reallocated_Event_Count/ {print $10}' ${TMPFILE} `
Xcps=`awk '/Current_Pending_Sector/ {print $10}' ${TMPFILE} `
Xou=`awk '/Offline_Uncorrectable/ {print $10}' ${TMPFILE} `
Xpass=`awk -F\: '/test result/ { if ( $2 == " PASSED") print "PASSED"; else print "FAILED" }' ${TMPFILE} `
X
X## If your HDD does not support one or more S.M.A.R.T attributes, list them after --without; their values are then set to "0"
Xargs=`getopt w:without: $*`
Xfor arg; do
X    case "$arg" in
X        src) src=0;;
X        rsc) rsc=0;;
X        rec) rec=0;;
X        cps) cps=0;;
X        ou) ou=0;;
X    esac
Xdone
X
X# Test whether your HDD supports all parameters:
X[ -z "$src" ] && echo -e "***********\n** ERROR **\n***********\n${device} does not support Spin_Retry_Count. Please try \"--without src\"." && mini_help && exit $ST_UN
X[ -z "$rsc" ] && echo -e "***********\n** ERROR **\n***********\n${device} does not support Reallocated_Sector_Ct. Please try \"--without rsc\"." && mini_help && exit $ST_UN
X[ -z "$rec" ] && echo -e "***********\n** ERROR **\n***********\n${device} does not support Reallocated_Event_Count. Please try \"--without rec\"." && mini_help && exit $ST_UN
X[ -z "$cps" ] && echo -e "***********\n** ERROR **\n***********\n${device} does not support Current_Pending_Sector. Please try \"--without cps\"." && mini_help && exit $ST_UN
X[ -z "$ou" ] && echo -e "***********\n** ERROR **\n***********\n${device} does not support Offline_Uncorrectable. Please try \"--without ou\"." && mini_help && exit $ST_UN
X
Xperfdata="smart=src=$src; rsc=$rsc; rec=$rec; cps=$cps; ou=$ou; pass=$pass"
X
X##### finally run test, print result and set exit code #####
Xif [ $src -eq 0 ] && [ $rsc -eq 0 ] && [ $rec -eq 0 ] && [ $cps -eq 0 ] && [ $ou -eq 0 ] && [ "$pass" = "PASSED" ]; then
X    echo "OK - HDD S.M.A.R.T health: src=$src, rsc=$rsc, rec=$rec, cps=$cps, ou=$ou, HEALTH_STATUS=$pass for $device. |${perfdata}"
X    exit $ST_OK
Xelif [ $src -gt 1 -a $src -lt 10 ] && [ $rsc -gt 0 ] && [ $rec -gt 1 -a $rec -lt 10 ] && [ $cps -eq 0 ] && [ $ou -eq 0 ] && [ "$pass" = "PASSED" ]; then
X    echo "WARNING - HDD S.M.A.R.T health: src=$src, rsc=$rsc, rec=$rec, cps=$cps, ou=$ou, HEALTH_STATUS=$pass for $device. |${perfdata}"
X    exit $ST_WR
Xelse
X    echo "CRITICAL - HDD S.M.A.R.T health: src=$src, rsc=$rsc, rec=$rec, cps=$cps, ou=$ou, HEALTH_STATUS=$pass for $device. |${perfdata}"
X    exit $ST_CR
Xfi
3e0fb3b894d73c34d2fa24438a0e6a90
echo x - nagios-check_hdd_health/Makefile
sed 's/^X//' >nagios-check_hdd_health/Makefile << '7ed075629eb33d9aa8f807e38775d59b'
X# New ports collection makefile for: nagios-check_hdd_health
X# Date created: 2010-12-02
X# Whom: jamrich.majo@gmail.com
X#
X# $FreeBSD$
X#
X
XPORTNAME= nagios-check_hdd_health
XPORTVERSION= 1.0
XCATEGORIES= net-mgmt
XMASTER_SITES= http://www.bwelectronics.sk/jamrich/ports/
X
XMAINTAINER= jamrich.majo@gmail.com
XCOMMENT= Nagios plug-in to check HDD health from S.M.A.R.T
X
XRUN_DEPENDS= smartmontools:${PORTSDIR}/sysutils/smartmontools
X
XNO_BUILD= yes
X
Xdo-install:
X	@${MKDIR} ${PREFIX}/libexec/nagios
X	@${INSTALL_SCRIPT} ${.CURDIR}/src/check_hdd_health ${PREFIX}/libexec/nagios
X
X.include <bsd.port.mk>
7ed075629eb33d9aa8f807e38775d59b
echo x - nagios-check_hdd_health/pkg-plist
sed 's/^X//' >nagios-check_hdd_health/pkg-plist << '3545e447b1d4329374e590dca22588fe'
Xlibexec/nagios/check_hdd_health
X@dirrmtry libexec/nagios
3545e447b1d4329374e590dca22588fe
echo x - nagios-check_hdd_health/distinfo
sed 's/^X//' >nagios-check_hdd_health/distinfo << '7a436dce8ae00348f45ae9940176f76b'
XSHA256 (nagios-check_hdd_health-1.0.tar.gz) = e3dcad96d451bbc978d165682bfb9f1669fedf197fc96af971fe7d026fe47d1c
XSIZE (nagios-check_hdd_health-1.0.tar.gz) = 3445
7a436dce8ae00348f45ae9940176f76b
echo x - nagios-check_hdd_health/pkg-descr
sed 's/^X//' >nagios-check_hdd_health/pkg-descr << 'ff21b419f60ef9ce0797711e7325fc01'
Xcheck_hdd_health is a Nagios plug-in written in shell to check HDD health.
XThis script checks the following S.M.A.R.T. values of the HDD:
X- Spin Retry Count
X- Reallocated Sector Ct
X- Reallocated Event Count
X- Current Pending Sector
X- Offline Uncorrectable
X- Total health test
ff21b419f60ef9ce0797711e7325fc01
exit

>Release-Note:
>Audit-Trail:
>Unformatted:
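
The Nagios states listed in the plug-in's help text (OK, Warning, Critical) can be sketched as a small POSIX shell function. This is only one possible reading of that help text, not the logic of the submitted script: the function name `classify_smart` is hypothetical, and the sketch assumes that Warning applies when Spin Retry Count or Reallocated Event Count lies between 1 and 9 while every other counter is zero and the health test passed.

```sh
#!/bin/sh
# Hypothetical helper (not part of the submitted port):
#   classify_smart SRC RSC REC CPS OU PASS
# Prints OK, WARNING or CRITICAL and returns the matching Nagios exit code.
# Note: plain POSIX sh has no local variables, so these names are global.
classify_smart() {
    c_src=$1 c_rsc=$2 c_rec=$3 c_cps=$4 c_ou=$5 c_pass=$6

    # Assumption: a failed health test or any non-zero value of
    # rsc, cps or ou is always critical.
    if [ "$c_pass" != "PASSED" ] || [ "$c_rsc" -gt 0 ] ||
       [ "$c_cps" -gt 0 ] || [ "$c_ou" -gt 0 ]; then
        echo CRITICAL
        return 2
    fi

    # src or rec at 10 or above is critical as well.
    if [ "$c_src" -ge 10 ] || [ "$c_rec" -ge 10 ]; then
        echo CRITICAL
        return 2
    fi

    # One or both of src and rec between 1 and 9 is a warning.
    if [ "$c_src" -gt 0 ] || [ "$c_rec" -gt 0 ]; then
        echo WARNING
        return 1
    fi

    echo OK
    return 0
}
```

Calling `classify_smart "$src" "$rsc" "$rec" "$cps" "$ou" "$pass"` after the values have been read would give a state word for the status line and an exit code matching `ST_OK`, `ST_WR` and `ST_CR` as defined in the script.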
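
The `--without` handling described in the script's comments, where attributes the HDD does not support are forced to "0", can also be sketched without `getopt`. The loop below is a minimal illustration, assuming the attribute names follow `--without` as separate words; the function name `apply_without` is hypothetical and not part of the submitted port.

```sh
#!/bin/sh
# Hypothetical helper (not part of the submitted port):
# scan the arguments and set to 0 every attribute variable
# (src rsc rec cps ou) that is named after --without.
apply_without() {
    seen_without=no
    for arg in "$@"; do
        case "$arg" in
            --without)
                seen_without=yes
                ;;
            -*)
                # Any other option ends the --without list.
                seen_without=no
                ;;
            src|rsc|rec|cps|ou)
                if [ "$seen_without" = yes ]; then
                    # eval is limited to the five fixed names matched above.
                    eval "$arg=0"
                fi
                ;;
        esac
    done
}
```

With this sketch, `apply_without --device /dev/ad0 --without src rec` would set `src` and `rec` to 0 and leave `rsc`, `cps` and `ou` untouched, while an attribute name that appears before `--without` is ignored.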