Date: Sun, 5 Oct 2014 18:50:57 +0400 (MSK)
From: Dmitry Morozovsky <marck@rinet.ru>
To: Mikolaj Golub <to.my.trociny@gmail.com>
Cc: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>, Matt Churchyard <matt.churchyard@userve.net>
Subject: Re: HAST with broken HDD
Message-ID: <alpine.BSF.2.00.1410051846480.72273@woozle.rinet.ru>
In-Reply-To: <20141003175439.GA7664@gmail.com>
References: <542BC135.1070906@Skynet.be> <542BDDB3.8080805@internetx.com> <CA%2BdUSypO8xTR3sh_KSL9c9FLxbGH%2BbTR9-gPdcCVd%2Bt0UgUF-g@mail.gmail.com> <542BF853.3040604@internetx.com> <CA%2BdUSyp4vMB_qUeqHgXNz2FiQbWzh8MjOEFYw%2BURcN4gUq69nw@mail.gmail.com> <542C019E.2080702@internetx.com> <CA%2BdUSyoEcPdJ1hdR3k1vNROFG7p1kN0HB5S2a_0gYhiV75OLAw@mail.gmail.com> <542C0710.3020402@internetx.com> <CA%2BdUSyr9OK9SvN3wX-O4DeriLBP-EEuAA8TTSYwdGfcR1asdtQ@mail.gmail.com> <97aab72e19d640ebb65c754c858043cc@SERVER.ad.usd-group.com> <20141003175439.GA7664@gmail.com>

On Fri, 3 Oct 2014, Mikolaj Golub wrote:

> Disk errors are recorded to syslog. Also error counters are displayed
> in `hastctl list' output. There is snmp_hast(3) in base -- a module
> for bsnmp to retrieve these statistics via snmp protocol (traps are not
> supported though).
>
> For notifications, the hastd can be configured to execute an arbitrary
> command on various HAST events (see description for `exec' in
> hast.conf(5)). Unfortunately, it does not have hooks for I/O error
> events currently. It might be worth adding though. The problem with
> this is that it may generate too many events, so some throttling is
> needed.

And, I think it should be noted, some kind of error coalescing or similar
is needed before going from the "warning" state (there are some read
errors, but otherwise the disk is usable, and it would be too much hassle
to switch to the remote component completely) to the "error" state (the
component is unusable and needs to be replaced ASAP; drop it from the HAST
pair, and switch over if needed).

An error such as "device lost" is, of course, fatal from the very
beginning; but -- how should we interpret, well, sporadic controller resets
with the disk coming back and resuming synchronization?

--
Sincerely,
D.Marck [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer: marck@FreeBSD.org ]
------------------------------------------------------------------------
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru ***
------------------------------------------------------------------------
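
To make the "warning" versus "error" distinction above concrete, here is a minimal sketch of one possible coalescing scheme: count soft I/O errors in a sliding time window and escalate only when the count crosses a threshold, while treating a "device lost" style failure as fatal immediately. This is not how hastd behaves today; the class name, the window length, and the threshold are all hypothetical values chosen for illustration.

```python
from collections import deque
from enum import Enum


class State(Enum):
    OK = "ok"
    WARNING = "warning"
    ERROR = "error"


class ErrorCoalescer:
    """Hypothetical helper: classify a component from recent I/O errors."""

    def __init__(self, window=60.0, error_threshold=10):
        self.window = window                    # seconds of history to keep (assumed value)
        self.error_threshold = error_threshold  # soft errors in window that mean ERROR (assumed)
        self.events = deque()                   # timestamps of recent soft errors
        self.fatal = False                      # latched by a fatal error such as "device lost"

    def record(self, now, fatal=False):
        """Record one I/O error seen at time `now` and return the resulting state."""
        if fatal:
            self.fatal = True
        else:
            self.events.append(now)
        return self.state(now)

    def state(self, now):
        """Return the state at time `now`, dropping errors older than the window."""
        if self.fatal:
            return State.ERROR
        while self.events and now - self.events[0] > self.window:
            self.events.popleft()
        if len(self.events) >= self.error_threshold:
            return State.ERROR
        if self.events:
            return State.WARNING
        return State.OK
```

In this sketch a burst of soft errors decays back to "ok" once the window has passed, which is one possible answer to the controller-reset question: a reset that produces only a few errors stays in "warning", and only a sustained burst escalates. Whether a threshold-triggered "error" should instead latch until an operator clears it is a policy choice left open here.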