From: Tom Curry <thomasrcurry@gmail.com>
To: Willem Jan Withagen
Cc: freebsd-fs@freebsd.org
Date: Sun, 21 Jun 2015 15:50:17 -0400
Subject: Re: This diskfailure should not panic a system, but just disconnect disk from ZFS
List-Id: Filesystems <freebsd-fs@freebsd.org>

Was there by chance a lot of disk activity going on when this occurred?

On Sun, Jun 21, 2015 at 10:00 AM, Willem Jan Withagen wrote:
> On 20/06/2015 18:11, Daryl Richards wrote:
> > Check the failmode setting on your pool. From man zpool:
> >
> >   failmode=wait | continue | panic
> >
> >     Controls the system behavior in the event of catastrophic pool
> >     failure. This condition is typically a result of a loss of
> >     connectivity to the underlying storage device(s) or a failure of
> >     all devices within the pool. The behavior of such an event is
> >     determined as follows:
> >
> >     wait      Blocks all I/O access until the device connectivity is
> >               recovered and the errors are cleared. This is the
> >               default behavior.
> >
> >     continue  Returns EIO to any new write I/O requests but allows
> >               reads to any of the remaining healthy devices. Any
> >               write requests that have yet to be committed to disk
> >               would be blocked.
> >
> >     panic     Prints out a message to the console and generates a
> >               system crash dump.
>
> 'mmm
> I did not know about this setting. Nice one, but alas my current
> setting is:
>
>   zfsboot  failmode  wait  default
>   zfsraid  failmode  wait  default
>
> So either the setting is not working, or something else is up?
> Is waiting only meant to wait for a limited time, and then panic anyway?
>
> But then I still wonder why, even in the 'continue' case, the ZFS
> system ends up in a state where the filesystem is not able to continue
> its standard functioning (read and write) and disconnects the disk.
>
> All failmode settings result in a seriously handicapped system...
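[For reference, the failmode property quoted above can be inspected and
changed at runtime. A minimal sketch, assuming a pool named 'zfsraid' as
in this thread; these commands need root and a live ZFS system:]

```shell
# Show the current failmode for every imported pool
# ('wait' with SOURCE 'default' matches the output quoted above)
zpool get failmode

# Show it for a single pool
zpool get failmode zfsraid

# Switch the pool to 'continue' so reads to healthy vdevs keep working
# while new writes get EIO instead of blocking
zpool set failmode=continue zfsraid
```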
> On a raidz2 system I would perhaps have expected this to occur when
> the second disk goes into thin space.
>
> The other question is: the man page talks about 'Controls the system
> behavior in the event of catastrophic pool failure'. Is a hung disk a
> 'catastrophic pool failure'?
>
> Still very puzzled?
>
> --WjW
>
> > On 2015-06-20 10:19 AM, Willem Jan Withagen wrote:
> >> Hi,
> >>
> >> Found my system rebooted this morning:
> >>
> >> Jun 20 05:28:33 zfs kernel: sonewconn: pcb 0xfffff8011b6da498: Listen
> >> queue overflow: 8 already in queue awaiting acceptance (48 occurrences)
> >> Jun 20 05:28:33 zfs kernel: panic: I/O to pool 'zfsraid' appears to be
> >> hung on vdev guid 18180224580327100979 at '/dev/da0'.
> >> Jun 20 05:28:33 zfs kernel: cpuid = 0
> >> Jun 20 05:28:33 zfs kernel: Uptime: 8d9h7m9s
> >> Jun 20 05:28:33 zfs kernel: Dumping 6445 out of 8174
> >> MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
> >>
> >> Which leads me to believe that /dev/da0 went out on vacation, leaving
> >> ZFS in trouble... But the array is:
> >> ----
> >> NAME                SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH
> >> zfsraid            32.5T  13.3T  19.2T         -     7%    41%  1.00x  ONLINE
> >>   raidz2           16.2T  6.67T  9.58T         -     8%    41%
> >>     da0                -      -      -         -      -      -
> >>     da1                -      -      -         -      -      -
> >>     da2                -      -      -         -      -      -
> >>     da3                -      -      -         -      -      -
> >>     da4                -      -      -         -      -      -
> >>     da5                -      -      -         -      -      -
> >>   raidz2           16.2T  6.67T  9.58T         -     7%    41%
> >>     da6                -      -      -         -      -      -
> >>     da7                -      -      -         -      -      -
> >>     ada4               -      -      -         -      -      -
> >>     ada5               -      -      -         -      -      -
> >>     ada6               -      -      -         -      -      -
> >>     ada7               -      -      -         -      -      -
> >>   mirror            504M  1.73M   502M         -    39%     0%
> >>     gpt/log0           -      -      -         -      -      -
> >>     gpt/log1           -      -      -         -      -      -
> >> cache                  -      -      -         -      -      -
> >>   gpt/raidcache0    109G  1.34G   107G         -     0%     1%
> >>   gpt/raidcache1    109G   787M   108G         -     0%     0%
> >> ----
> >>
> >> And thus I would have expected that ZFS would disconnect /dev/da0,
> >> switch to DEGRADED state, and continue, letting the operator fix the
> >> broken disk.
> >> Instead it chooses to panic, which is not a nice thing to do.
> >> :)
> >>
> >> Or do I have too high hopes of ZFS?
> >>
> >> Next question to answer is why this WD RED on:
> >>
> >> arcmsr0@pci0:7:14:0: class=0x010400 card=0x112017d3 chip=0x112017d3
> >>     rev=0x00 hdr=0x00
> >>     vendor   = 'Areca Technology Corp.'
> >>     device   = 'ARC-1120 8-Port PCI-X to SATA RAID Controller'
> >>     class    = mass storage
> >>     subclass = RAID
> >>
> >> got hung, while nothing for this shows up in SMART...
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
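[Worth noting: the quoted panic text, "I/O to pool ... appears to be
hung on vdev guid ...", is emitted by ZFS's "deadman" watchdog, which
fires when an outstanding I/O exceeds a timeout. It is separate from
the failmode property, which would explain why failmode=wait still
panicked. A sketch of the relevant FreeBSD tunables; verify the exact
sysctl names on your release before relying on them:]

```shell
# Inspect the deadman watchdog: whether it is armed, and its I/O
# timeout in milliseconds (the hung-I/O threshold)
sysctl vfs.zfs.deadman_enabled
sysctl vfs.zfs.deadman_synctime_ms

# Disable the hung-I/O panic on the running system (trades a panic for
# a potentially indefinite hang on a dead disk)
sysctl vfs.zfs.deadman_enabled=0
```

To make the change persistent, the same knob can be set at boot, e.g.
in /etc/sysctl.conf: vfs.zfs.deadman_enabled=0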