Date:      Sun, 05 Oct 2014 17:58:58 +0200
From:      InterNetX - Juergen Gotteswinter <juergen.gotteswinter@internetx.com>
To:        Dmitry Morozovsky <marck@rinet.ru>,  Mikolaj Golub <to.my.trociny@gmail.com>
Cc:        "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>, Matt Churchyard <matt.churchyard@userve.net>
Subject:   Re: HAST with broken HDD
Message-ID:  <54316AC2.5040909@internetx.com>
In-Reply-To: <543168D0.2000705@internetx.com>
References:  <542BC135.1070906@Skynet.be> <542BDDB3.8080805@internetx.com> <CA%2BdUSypO8xTR3sh_KSL9c9FLxbGH%2BbTR9-gPdcCVd%2Bt0UgUF-g@mail.gmail.com> <542BF853.3040604@internetx.com> <CA%2BdUSyp4vMB_qUeqHgXNz2FiQbWzh8MjOEFYw%2BURcN4gUq69nw@mail.gmail.com> <542C019E.2080702@internetx.com> <CA%2BdUSyoEcPdJ1hdR3k1vNROFG7p1kN0HB5S2a_0gYhiV75OLAw@mail.gmail.com> <542C0710.3020402@internetx.com> <CA%2BdUSyr9OK9SvN3wX-O4DeriLBP-EEuAA8TTSYwdGfcR1asdtQ@mail.gmail.com> <97aab72e19d640ebb65c754c858043cc@SERVER.ad.usd-group.com> <20141003175439.GA7664@gmail.com> <alpine.BSF.2.00.1410051846480.72273@woozle.rinet.ru> <543168D0.2000705@internetx.com>


On 05.10.2014 at 17:50, InterNetX - Juergen Gotteswinter wrote:
> 
> 
> On 05.10.2014 at 16:50, Dmitry Morozovsky wrote:
>> On Fri, 3 Oct 2014, Mikolaj Golub wrote:
>>
>>> Disk errors are recorded to syslog. Also error counters are displayed
>>> in `hastctl list' output. There is snmp_hast(3) in base -- a module
>>> for bsnmp to retrieve these statistics via the SNMP protocol (traps are
>>> not supported though).
>>>
>>> For notifications, the hastd can be configured to execute an arbitrary
>>> command on various HAST events (see description for `exec' in
>>> hast.conf(5)). Unfortunately, it does not have hooks for I/O error
>>> events currently. It might be worth adding though. The problem with
>>> this is that it may generate too many events, so some throttling is
>>> needed.
>>
>> And, I think, it should be noted that some kind of error coalescing or
>> similar is needed before going from the "warning" state (there are some read
>> errors, but otherwise the disk is usable, and it would be too much hassle to
>> switch to the remote component completely) to the "error" state (the component
>> is unusable and needs to be replaced ASAP; drop it from the HAST pair, and
>> switch over if needed).
>>
>> An error such as "device lost" is, of course, fatal from the very beginning;
>> but how should we interpret, say, sporadic controller resets with the disk
>> coming back and catching up on syncing again?
>>
>>
> 
> Hi Dmitry,
> 
> since HAST is not so different from DRBD, why not take their way of
> error handling as a template? DRBD has worked well and been rock solid
> for years; it is a well-established solution. HAST has the potential
> to become one as well, with some improvements.
> 
> Just my 2 Cents :)

forgot this one...

http://www.drbd.org/users-guide/s-handling-disk-errors.html
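
To make the "warning" versus "error" distinction from the quoted text a bit
more concrete, here is a rough Python sketch of error coalescing over a
sliding time window. It is only an illustration under assumptions of mine: it
is not HAST or DRBD code, and the class name, the 60-second window and the
threshold of 5 errors are all hypothetical.

```python
from collections import deque

# Hypothetical component states, named after the discussion above.
OK, WARNING, ERROR = "ok", "warning", "error"


class ErrorCoalescer:
    """Coalesce I/O errors into ok / warning / error (illustrative only)."""

    def __init__(self, window=60.0, threshold=5):
        self.window = window        # seconds of history to keep (assumed value)
        self.threshold = threshold  # errors within the window that mean "error"
        self.events = deque()       # timestamps of recent non-fatal errors
        self.state = OK

    def record(self, now, fatal=False):
        """Record one I/O error at time `now` (seconds).

        Returns the new state if it changed, otherwise None, so that a
        caller can notify once per transition rather than once per error.
        """
        previous = self.state
        if fatal:
            # E.g. "device lost": fatal from the very beginning.
            self.state = ERROR
        elif self.state != ERROR:
            self.events.append(now)
            # Forget errors that have fallen out of the sliding window.
            while self.events and now - self.events[0] > self.window:
                self.events.popleft()
            if len(self.events) >= self.threshold:
                self.state = ERROR
            else:
                self.state = WARNING
        return self.state if self.state != previous else None
```

Because record() returns a value only on a state change, a notification hook
driven by it would fire once per transition, which also addresses the
throttling concern. The "error" state is sticky in this sketch; how to recover
from it (for example after a disk replacement) is deliberately left open.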


