Date: Mon, 30 Oct 2006 14:03:19 -0600
From: Mike Holloway <mikhollo@cisco.com>
To: freebsd-proliant@freebsd.org
Subject: Re: RAID monitoring tools
Message-ID: <A0F424C9-7644-4CA7-9A56-043A9A5BA891@cisco.com>
In-Reply-To: <AC543693FB6FA8C8E650ED94@ganymede.hub.org>
References: <20061029043926.GI90772@k7.mavetju> <AC543693FB6FA8C8E650ED94@ganymede.hub.org>
Whoops, meant to copy the list...

Appreciate the pointer to camcontrol. Previously I had just been using
swatch to watch syslog and send messages to nagios via nsca. My problem
was that I never knew the initial state of the disks until an event
showed up in syslog.

For reference, here's what I saw from camcontrol during my tests
(FreeBSD 6.0 rel):

During normal operation of the raid:

    # camcontrol inquiry da0 -D
    pass0: <COMPAQ RAID 1 VOLUME OK> Fixed Direct Access SCSI-0 device

After removing one of the raid member disks:

    # camcontrol inquiry da0 -D
    pass0: <COMPAQ RAID 1 VOLUME inte> Fixed Direct Access SCSI-0 device

After re-inserting the raid member disk:

    # camcontrol inquiry da0 -D
    pass0: <COMPAQ RAID 1 VOLUME reco> Fixed Direct Access SCSI-0 device

And about 45 minutes later:

    # camcontrol inquiry da0 -D
    pass0: <COMPAQ RAID 1 VOLUME OK> Fixed Direct Access SCSI-0 device

And here's the configuration I use for swatch to feed nsca in realtime:

    watchfor /ciss0.*removed/
        exec "/usr/local/bin/nsca_report 2 \"Disk Array\" Hot-plug drive removed"
    watchfor /ciss0.*failure/
        exec "/usr/local/bin/nsca_report 2 \"Disk Array\" Physical drive failure"
    watchfor /ciss0.*inserted/
        exec "/usr/local/bin/nsca_report 1 \"Disk Array\" Hot-plug drive inserted"
    watchfor /ciss0.*recovery->recovering/
        exec "/usr/local/bin/nsca_report 1 \"Disk Array\" Drive is rebuilding..."
    watchfor /ciss0.*recovering->OK/
        exec "/usr/local/bin/nsca_report 0 \"Disk Array\" Drive has successfully rebuilt."

For completeness, here's the nsca_report script that I use to send the
alarms to nagios. Substitute your own thishost and -H:

    #!/bin/bash
    # Usage: nsca_report <return code> <service name> <message...>
    outcode=$1
    thisservice=$2
    # Short hostname: everything before the first dot in $HOSTNAME
    thishost=`echo $HOSTNAME | sed -e "s/\./ /g" | cut -f 1 -d ' '`
    shift
    shift
    # The remaining arguments form the message text.
    # Redirect stdout first, then stderr, so that both are discarded.
    echo -e "${thishost}\t${thisservice}\t${outcode}\t$*\n" | /usr/local/bin/send_nsca -H www -c /usr/local/etc/send_nsca.cfg > /dev/null 2>&1

-mike

On Oct 29, 2006, at 12:51 AM, Marc G. Fournier wrote:

> camcontrol devlist:
>
> <COMPAQ RAID 1 VOLUME OK> at scbus0 target 0 lun 0 (pass0,da0)
>
> I don't have *regular* monitoring on it, mind you, just do it
> periodically, by hand ...
>
> --On Sunday, October 29, 2006 15:39:26 +1100 Edwin Groothuis
> <edwin@mavetju.org> wrote:
>
>> Greetings,
>>
>> Last week we had two failing disks, and if it wasn't for a walk
>> through the datacenter (which is off-site, and ten dollars away)
>> we wouldn't have noticed it. I've read the thread about hpacucli,
>> and have had my failed attempts to get it up and running under the
>> linuxolator.
>>
>> So the question is: how do *you* monitor the status of your disks
>> and RAID arrays? Any suggestions will be appreciated.
>>
>> Edwin
>>
>> --
>> Edwin Groothuis      |  Personal website: http://www.mavetju.org
>> edwin@mavetju.org    |  Weblog: http://weblog.barnet.com.au/edwin/
>> _______________________________________________
>> freebsd-proliant@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-proliant
>> To unsubscribe, send any mail to "freebsd-proliant-unsubscribe@freebsd.org"
>
> ----
> Marc G. Fournier    Hub.Org Networking Services (http://www.hub.org)
> Email . scrappy@hub.org    MSN . scrappy@hub.org
> Yahoo . yscrappy    Skype: hub.org    ICQ . 7615664
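
The message above points out that syslog watching alone never reveals the initial state of the array, while `camcontrol inquiry` reports it on demand. One way to close that gap would be to poll camcontrol periodically. The following is only a rough sketch under stated assumptions: the function names are hypothetical, the status tokens (`OK`, `inte`, `reco`) are just the three seen in the tests above, and mapping any other token to Nagios code 3 (unknown) is an assumption rather than documented ciss behaviour.

```python
import re
import subprocess

# Status tokens observed in the inquiry string during the tests above,
# mapped to the same return codes the swatch rules use
# (0 = OK, 1 = warning, 2 = critical).
STATUS_CODES = {"OK": 0, "reco": 1, "inte": 2}
UNKNOWN = 3  # assumption: any unrecognised token is reported as "unknown"

# Matches e.g. "<COMPAQ RAID 1 VOLUME OK>" and captures the trailing token.
INQUIRY_RE = re.compile(r"<COMPAQ RAID .*? VOLUME\s+(\S+?)\s*>")


def parse_volume_status(inquiry_output):
    """Return the status token from `camcontrol inquiry <dev> -D` output, or None."""
    match = INQUIRY_RE.search(inquiry_output)
    return match.group(1) if match else None


def nagios_code(status):
    """Translate a status token into a Nagios return code."""
    return STATUS_CODES.get(status, UNKNOWN)


def check_device(device="da0"):
    """Run camcontrol for one device and return (nagios_code, status_token)."""
    try:
        result = subprocess.run(
            ["camcontrol", "inquiry", device, "-D"],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:
        # camcontrol is not available on this host
        return UNKNOWN, None
    status = parse_volume_status(result.stdout)
    return nagios_code(status), status
```

Called from cron, the code and token returned by `check_device("da0")` could be handed to the nsca_report script in place of the swatch-triggered calls, so that the state is known even before the first syslog event.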