From owner-freebsd-fs@FreeBSD.ORG  Wed Oct  1 14:00:47 2014
Date: Wed, 1 Oct 2014 17:00:44 +0300
From: George Kontostanos
To: jg@internetx.com
Cc: freebsd-fs@freebsd.org
Subject: Re: HAST with broken HDD

On Wed, Oct 1, 2014 at 4:52 PM, InterNetX - Juergen Gotteswinter
<jg@internetx.com> wrote:

> On 01.10.2014 at 15:49, George Kontostanos wrote:
> > On Wed, Oct 1, 2014 at 4:29 PM, InterNetX - Juergen Gotteswinter
> > <jg@internetx.com> wrote:
> > > On 01.10.2014 at 15:06, George Kontostanos wrote:
> > > > On Wed, Oct 1, 2014 at 3:49 PM, InterNetX - Juergen Gotteswinter
> > > > <jg@internetx.com> wrote:
> > > > > On 01.10.2014 at 14:28, George Kontostanos wrote:
> > > > > > On Wed, Oct 1, 2014 at 1:55 PM, InterNetX - Juergen Gotteswinter
> > > > > > <jg@internetx.com> wrote:
> > > > > > > On 01.10.2014 at 10:54, JF-Bogaerts wrote:
> > > > > > > > Hello,
> > > > > > > > I'm preparing a HA NAS solution using HAST.
> > > > > > > > I'm wondering what will happen if one of the disks of the
> > > > > > > > primary node fails or becomes erratic.
> > > > > > > >
> > > > > > > > Thx,
> > > > > > > > Jean-François Bogaerts
> > > > > > >
> > > > > > > Nothing. If you are using ZFS on top of HAST, ZFS won't even
> > > > > > > take notice of the disk failure.
> > > > > > >
> > > > > > > As long as the write operation was successful on one of the
> > > > > > > two nodes, HAST doesn't notify the layers on top about I/O
> > > > > > > errors.
> > > > > > >
> > > > > > > Interesting concept; it took me some time to deal with this.
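(For illustration: in this kind of setup the pool sits on the HAST
provider, not on the raw disk, so ZFS never sees the underlying hardware.
The resource, host, device, and address names below are only an example.)

# /etc/hast.conf (abridged; names are made up)
resource disk1 {
        on nodeA {
                local /dev/ada1
                remote 10.0.0.2
        }
        on nodeB {
                local /dev/ada1
                remote 10.0.0.1
        }
}

# On the primary node the pool is created on the HAST provider, so a
# failure of ada1 is absorbed by hastd (the write still succeeds on the
# peer) and never reaches ZFS:
zpool create tank /dev/hast/disk1
zpool status tank     # can still report ONLINE with a dead local disk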
> > > > > > Are you saying that the pool will appear to be optimal even
> > > > > > with a bad drive?
> > > > >
> > > > > https://forums.freebsd.org/viewtopic.php?&t=24786
> > > >
> > > > It appears that this is actually the case. And it is very
> > > > disturbing, meaning that a drive failure goes unnoticed. In my case
> > > > I completely removed the second disk on the primary node and a
> > > > zpool status showed absolutely no problem. Scrubbing the pool began
> > > > resilvering, which indicates that there is actually something wrong!
> > >
> > > Right. Let's go further and think about how ZFS works regarding
> > > direct hardware / disk access. There's a layer in between which
> > > always says "hey, everything is fine". No more need for pool
> > > scrubbing, since hastd won't tell you if anything is wrong :D
> >
> > Correct, ZFS needs direct access, and any layer in between might end
> > up being a disaster!!!
> >
> > Which means that practically HAST should only be used in UFS
> > environments backed by a hardware controller. In that case HAST will
> > again not notice anything (unless you lose the controller), but at
> > least you will know that you need to replace a disk by monitoring the
> > controller status.
>
> IMHO this should be included at least as a notice/warning in the hastd
> manpage; AFAIK there's no real warning about such problems with the
> hastd/zfs combo. But lots of howtos are out there describing exactly
> such setups.

Yes, it should. I actually wrote a guide like that when HAST was at its
early stages, but I had never tested it for flaws. This thread started
ringing some bells!

> Sad, since the comparable piece on Linux - DRBD - handles I/O errors
> fine. The upper layers get notified, as they should be, imho.

My next lab environment will be to try a similar DRBD setup, although some
tests we performed last year with ZFS on Linux were not that promising.

-- 
George Kontostanos
---
http://www.aisecure.net
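Below is a rough sketch of the kind of out-of-band monitoring this thread
calls for, meant to run periodically (e.g. from cron) on the HAST primary.
Because hastd hides I/O errors from ZFS, it checks the raw component disks
with smartctl and mails the hastctl output when something looks wrong. The
disk names, resource name, and recipient are placeholders, it assumes
sysutils/smartmontools is installed, and it is only an illustration rather
than a tested tool.

#!/bin/sh
# Out-of-band check for a HAST primary node: hastd does not propagate I/O
# errors up to ZFS, so the raw component disks are checked directly and
# the hastctl view is attached for the operator to read.
#
# DISKS, RESOURCES and ALERT are placeholders; adjust them to the setup.

ALERT="root"                        # where to send the warning mail
DISKS="/dev/ada1 /dev/ada2"         # raw disks backing the HAST resources
RESOURCES="disk1"                   # resource names from /etc/hast.conf

report=$(mktemp -t hastcheck) || exit 1
failed=0

for d in $DISKS; do
    # smartctl exits non-zero when the device cannot be opened or reports
    # a failing SMART health status (from sysutils/smartmontools).
    if ! smartctl -H "$d" >> "$report" 2>&1; then
        echo "WARNING: SMART health check failed on $d" >> "$report"
        failed=1
    fi
done

if [ "$failed" -ne 0 ]; then
    for r in $RESOURCES; do
        # Include the hastctl view of each resource; its exact output
        # format differs between releases, so it is shown, not parsed.
        echo "--- hastctl status $r ---" >> "$report"
        hastctl status "$r" >> "$report" 2>&1
    done
    mail -s "HAST primary: possible disk failure" "$ALERT" < "$report"
fi

rm -f "$report"

Hooked into cron every few minutes, a check along these lines catches a
dead or dying component disk that zpool status on top of HAST would keep
reporting as ONLINE.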