Date: Fri, 3 Oct 2014 20:54:40 +0300
From: Mikolaj Golub
To: Matt Churchyard
Cc: "freebsd-fs@freebsd.org"
Subject: Re: HAST with broken HDD

On Wed, Oct 01, 2014 at 03:51:43PM +0000, Matt Churchyard wrote:

> HAST is basically "RAID1-over-network", so if a disk fails, it
> should just handle read/writes using the other disk, and the
> filesystem on top, be it UFS/ZFS/whatever, should just carry on as
> normal (which is what has been observed). Of course, HAST (or the
> OS) should notify you of the disk error though (probably through
> devd) so you can do something about it. Maybe it already exists, but
> HAST should be able to provide overall status information and raise
> events just like ZFS or any RAID subsystem would. You also of course
> shouldn't get scrub errors and corruption like that seen in the
> original post either just because one half of the HAST mirror has
> gone.

Disk errors are recorded to syslog. Error counters are also displayed
in `hastctl list' output. There is snmp_hast(3) in base -- a module
for bsnmp to retrieve these statistics via SNMP (traps are not
supported, though).

For notifications, hastd can be configured to execute an arbitrary
command on various HAST events (see the description of `exec' in
hast.conf(5)). Unfortunately, it does not currently have hooks for
I/O error events. It might be worth adding, though. The problem is
that this may generate too many events, so some throttling is needed.

-- 
Mikolaj Golub
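
To illustrate the kind of throttling mentioned above, here is a rough sketch in Python. It is not hastd code and does not reflect any existing HAST interface: the class name `EventThrottle`, keying by resource name, and the 60-second window are all assumptions made for the example. It lets one notification per key through per interval and counts the ones it drops, so that the next notification can report how many errors were suppressed.

```python
import time


class EventThrottle:
    """Allow at most one notification per key within `interval` seconds.

    Hypothetical helper: hastd has no I/O error hook, so nothing here
    is part of HAST itself.
    """

    def __init__(self, interval=60.0, clock=time.monotonic):
        self.interval = interval
        self.clock = clock          # injectable so the logic can be tested
        self.last_sent = {}         # key -> time of the last notification
        self.suppressed = {}        # key -> events dropped since then

    def allow(self, key):
        """Return (True, dropped) if a notification should go out now,
        where `dropped` is the number of events suppressed since the
        previous one; otherwise count the event and return (False, 0)."""
        now = self.clock()
        last = self.last_sent.get(key)
        if last is not None and now - last < self.interval:
            self.suppressed[key] = self.suppressed.get(key, 0) + 1
            return False, 0
        dropped = self.suppressed.pop(key, 0)
        self.last_sent[key] = now
        return True, dropped
```

A caller would run the notification command only when `allow()` returns True, and could include the dropped count in the message so that a burst of errors on one resource shows up as a single report rather than a flood.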