From owner-freebsd-questions@FreeBSD.ORG Thu Feb 21 22:00:49 2013
Date: Fri, 22 Feb 2013 00:00:43 +0200
From: Mikolaj Golub
To: Chad M Stewart
Cc: Pawel Jakub Dawidek, freebsd-questions@freebsd.org
Subject: Re: HAST - detect failure and restore avoiding an outage?
Message-ID: <20130221220042.GA2900@gmail.com>

On Wed, Feb 20, 2013 at 02:54:54PM -0600, Chad M Stewart wrote:

> I built a 2-node cluster for testing HAST. Each node is an older HP
> server with 6 SCSI disks. Each disk is configured as RAID 0 in the RAID
> controller, since I wanted a JBOD presented to FreeBSD 9.1 x86. I
> allocated a single disk for the OS and the other 5 disks for HAST.
>
> node2# zpool status
>   pool: scsi-san
>  state: ONLINE
>   scan: scrub repaired 0 in 0h27m with 0 errors on Tue Feb 19 17:38:55 2013
> config:
>
>         NAME            STATE     READ WRITE CKSUM
>         scsi-san        ONLINE       0     0     0
>           raidz1-0      ONLINE       0     0     0
>             hast/disk1  ONLINE       0     0     0
>             hast/disk2  ONLINE       0     0     0
>             hast/disk3  ONLINE       0     0     0
>             hast/disk4  ONLINE       0     0     0
>             hast/disk5  ONLINE       0     0     0
>
>   pool: zroot
>  state: ONLINE
>   scan: none requested
> config:
>
>         NAME         STATE     READ WRITE CKSUM
>         zroot        ONLINE       0     0     0
>           gpt/disk0  ONLINE       0     0     0
>
> Yesterday I physically pulled disk2 (from node1) to simulate a failure.
> ZFS didn't see anything wrong, as expected. hastd did see the problem,
> also as expected. 'hastctl status' didn't show me anything unusual or
> indicate any problem that I could see on either node. I saw hastd
> reporting problems in the logs, but otherwise everything looked fine.
> Is there a way to detect a failed disk from hastd besides the log?
> camcontrol showed the disk had failed, and obviously I'll be monitoring
> using it as well.

It looks like the logs are currently the only way to detect errors from
the hastd side. Here is a patch that adds local i/o error statistics,
accessible via hastctl:

http://people.freebsd.org/~trociny/hast.stat_error.1.patch

hastctl output:

  role: secondary
  provname: test
  localpath: /dev/md102
  extentsize: 2097152 (2.0MB)
  keepdirty: 0
  remoteaddr: kopusha:7771
  replication: memsync
  status: complete
  dirty: 0 (0B)
  statistics:
    reads: 0
    writes: 366
    deletes: 0
    flushes: 0
    activemap updates: 0
    local i/o errors: 269

Pawel, what do you think about this patch?
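A rough sketch of how that counter could be polled from a monitoring
job, assuming the patch is applied, the resources are named
disk1..disk5 as in the pool above, and the patched 'hastctl status
<name>' prints the "local i/o errors: N" line shown in the output:

#!/bin/sh
# Sketch: report HAST resources that have accumulated local i/o errors.
# Assumes the patched hastctl prints a "local i/o errors: N" line per
# resource, as in the output above; resource names come from this thread.
for R in disk1 disk2 disk3 disk4 disk5; do
        errors=$(hastctl status "$R" | awk -F': *' '/local i\/o errors/ { print $2 }')
        if [ -n "$errors" ] && [ "$errors" -gt 0 ]; then
                logger -p daemon.err "HAST resource $R: $errors local i/o errors"
        fi
done

Something like this could be run from cron or periodic(8) on both nodes.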
> For recovery I installed a new disk in the same slot. To protect data
> reliability, the safest way I can think of to recover is to do the
> following:
>
> 1 - node1 - stop the apps
> 2 - node1 - export pool
> 3 - node1 - hastctl create disk2
> 4 - node1 - for D in 1 2 3 4 5; do hastctl role secondary disk$D; done
> 5 - node2 - for D in 1 2 3 4 5; do hastctl role primary disk$D; done
> 6 - node2 - import pool
> 7 - node2 - start the apps
>
> At step 5 hastd will start to resynchronize node2:disk2 ->
> node1:disk2. I've been trying to think of a way to re-establish the
> mirror without having to restart/move the pool _and_ without posing
> additional risk of data loss.
>
> To avoid an application outage I suppose the following would work:
>
> 1 - insert new disk in node1
> 2 - hastctl role init disk2
> 3 - hastctl create disk2
> 4 - hastctl role primary disk2
>
> At that point ZFS would have seen a disk failure and then started
> resilvering the pool. No application outage, but now only 4 disks
> contain the data (assuming changing bits on the pool, not static
> content). With the previous steps there is an application outage, but
> a healthy pool is maintained at all times.
>
> Is there another scenario I'm not thinking of where both data health
> and no application outage could be achieved?

--
Mikolaj Golub
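For reference, the outage-based recovery above (steps 1-7) could be
scripted roughly as follows. This is only a sketch: the resource names
disk1..disk5 and the pool name scsi-san are taken from the messages
above, the ssh and service(8) invocations are placeholders for whatever
orchestration and applications are actually in use, and a
'hastctl role init disk2' step is included before 'hastctl create disk2'
(as in the second procedure above) on the assumption that the resource
should be inactive while its metadata is recreated.

#!/bin/sh -e
# Sketch of the outage-based failover in steps 1-7, run on node1.
# Resource names (disk1..disk5) and the pool name (scsi-san) come from
# this thread; application and ssh details are placeholders.

service myapp stop                  # 1: stop the apps (placeholder name)
zpool export scsi-san               # 2: export the pool

hastctl role init disk2             # put the replaced resource in init
hastctl create disk2                # 3: recreate metadata on the new disk

for D in 1 2 3 4 5; do              # 4: demote everything on node1
        hastctl role secondary "disk$D"
done

# 5-7: promote node2, import the pool and start the apps there
ssh node2 'for D in 1 2 3 4 5; do hastctl role primary "disk$D"; done'
ssh node2 'zpool import scsi-san'
ssh node2 'service myapp start'     # placeholder name

Once node2 is primary, hastd resynchronizes node2:disk2 to node1:disk2
in the background, as noted in step 5 above.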