From: Per von Zweigbergk
To: Pawel Jakub Dawidek
Cc: freebsd-fs@freebsd.org
Date: Fri, 20 May 2011 01:31:46 +0200
Subject: Re: HAST + ZFS self healing? Hot spares?

On 20 May 2011, at 01:09, Pawel Jakub Dawidek wrote:

> On Fri, May 20, 2011 at 01:03:43AM +0200, Per von Zweigbergk wrote:
>> Very well, that is how failures are handled. But how do we *recover*
>> from a disk failure? Without taking the entire server down that is.
>
> HAST opens local disk only when changing role to primary or changing
> role to secondary and accepting connection from primary.
> If your disk fails, switch to init for that HAST device, replace you
> disk, call 'hastctl create ' and switch back to primary or
> secondary.

If I were to do 'hastctl role init foo' to switch from primary->init,
/dev/hast/foo would go away, and this would degrade whatever file system
or volume manager you're running on top of HAST. (I just tried this in
my HAST lab environment.)

The scenario I was describing was a primary disk failure: I want to keep
being able to access /dev/hast/foo while I replace the primary disk.

I still don't see how it's possible to hot-replace a failed drive in the
server that's primary at the time. There just doesn't seem to be any way
of bringing in a new disk on the primary side without bringing down the
HAST resource.
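
For reference, the procedure suggested in the quoted reply seems to come down to three hastctl calls around the physical disk swap. The sketch below only builds that command list and runs nothing. The function name disk_replacement_plan is made up for illustration, the resource name 'foo' is the one from my lab, and I'm assuming the elided argument to 'hastctl create' is the resource name.

```python
def disk_replacement_plan(resource, role="primary"):
    """Return the hastctl commands for the suggested procedure, in order.

    Nothing is executed here; this is only a dry-run plan.
    """
    if role not in ("primary", "secondary"):
        raise ValueError("role must be 'primary' or 'secondary'")
    return [
        # Step 1: drop the resource to init. On the primary this makes
        # /dev/hast/<resource> go away, which is the problem described above.
        ["hastctl", "role", "init", resource],
        # (The failed disk is physically replaced between steps 1 and 2.)
        # Step 2: initialise HAST metadata on the new disk.
        # Assumption: the argument is the resource name.
        ["hastctl", "create", resource],
        # Step 3: switch back to the role the node had before.
        ["hastctl", "role", role, resource],
    ]


if __name__ == "__main__":
    for argv in disk_replacement_plan("foo"):
        print(" ".join(argv))
```

Written out like this, the gap is visible: between step 1 and step 3 there is no /dev/hast/foo on this node, so whatever sits on top of it sees the device disappear.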