From: InterNetX - Juergen Gotteswinter
Reply-To: jg@internetx.com
Date: Sun, 05 Oct 2014 17:50:40 +0200
To: Dmitry Morozovsky, Mikolaj Golub
Cc: "freebsd-fs@freebsd.org", Matt Churchyard
Subject: Re: HAST with broken HDD
Message-ID: <543168D0.2000705@internetx.com>

On 05.10.2014 at 16:50, Dmitry Morozovsky wrote:
> On Fri, 3 Oct 2014, Mikolaj Golub wrote:
>
>> Disk errors are recorded to syslog. Also error counters are displayed
>> in `hastctl list' output. There is snmp_hast(3) in base -- a module
>> for bsnmp to retrieve these statistics via snmp protocol (traps are not
>> supported though).
>>
>> For notifications, the hastd can be configured to execute an arbitrary
>> command on various HAST events (see description for `exec' in
>> hast.conf(5)). Unfortunately, it does not have hooks for I/O error
>> events currently. It might be worth adding though. The problem with
>> this is that it may generate too many events, so some throttling is
>> needed.
>
> And, I it, this should be noted, some kind of error-coalescing or similar
> before going from "warning" state (there are some read errors, but otherwise the
> disk is usable, and it would be overly hassle to switch to remote component
> completely) to "error" state (component is unusable and needs to be replaced
> ASAP; drop it from HAST pair, and switchover if needed).
>
> Errors such as "device lost" are, of course, fatal from the very beginning; but
> -- how should we interpret, well, sporadic controller resets with the disk
> coming back and catching syncing again?
>

Hi Dmitry,

since HAST is not so different from DRBD, why not take their way of
error handling as a template? DRBD has worked well and been rock solid
for years; it is a well-established solution. HAST has the potential to
become that too, with some improvements.

Just my 2 cents :)
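
To make the quoted idea concrete, here is a rough sketch of the "warning" to "error" escalation with error coalescing, so that only state changes produce a notification. It is an illustration only and not how hastd or DRBD work: the class name, the sliding window, and the threshold values are all assumptions. It also leaves open the question of how sporadic controller resets should be counted.

```python
from collections import deque
from enum import Enum


class State(Enum):
    OK = "ok"
    WARNING = "warning"   # some I/O errors, disk still usable
    ERROR = "error"       # component should be dropped and replaced


class ErrorCoalescer:
    """Hypothetical per-component error tracker; not part of hastd."""

    def __init__(self, window=60.0, error_threshold=10):
        self.window = window                    # seconds of history kept (assumed value)
        self.error_threshold = error_threshold  # errors within window that mean ERROR (assumed)
        self.events = deque()                   # timestamps of recent non-fatal errors
        self.state = State.OK

    def record(self, now, fatal=False):
        """Record one I/O error at time `now` (seconds).

        Returns the new State if the state changed, else None, so a
        caller only sends a notification on a transition.
        """
        if self.state is State.ERROR:
            return None  # ERROR is sticky until an operator intervenes

        if fatal:
            # e.g. "device lost": fatal from the very beginning
            new_state = State.ERROR
        else:
            self.events.append(now)
            # Forget errors that have fallen out of the sliding window.
            while self.events and now - self.events[0] > self.window:
                self.events.popleft()
            if len(self.events) >= self.error_threshold:
                new_state = State.ERROR
            else:
                new_state = State.WARNING

        if new_state is not self.state:
            self.state = new_state
            return new_state
        return None
```

A caller would feed each I/O error into `record()` and run its notification command only when the return value is not None. That gives the throttling for free: a burst of read errors yields at most one "warning" and one "error" event.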