Date: Sun, 21 Jun 2015 20:26:19 -0400
Subject: Re: This diskfailure should not panic a system, but just disconnect disk from ZFS
From: Tom Curry
To: Willem Jan Withagen
Cc: freebsd-fs@freebsd.org

Yes, currently I am not using the patch from that PR. But I have lowered
the ARC max size, and I am confident that if I had left it at the default
I would have panics again.
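In case it is useful, this is roughly how I capped it. Treat it as a
sketch rather than gospel; the value is simply 24 GiB in bytes, which is
what fits my 32G box, so scale it to your RAM:

----
# /boot/loader.conf -- limit the ZFS ARC; value is in bytes (24 GiB here)
vfs.zfs.arc_max="25769803776"
----

After a reboot you can confirm the cap and keep an eye on the ARC while
the rsync jobs run with something like:

----
sysctl vfs.zfs.arc_max
sysctl kstat.zfs.misc.arcstats.size
----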
On Sun, Jun 21, 2015 at 7:45 PM, Willem Jan Withagen wrote:
> On 22/06/2015 01:34, Tom Curry wrote:
> > I asked because recently I had similar trouble. Lots of kernel panics;
> > sometimes they were just like yours, sometimes they were general
> > protection faults. But they would always occur when my nightly backups
> > took place, where VMs on iSCSI zvol luns were read and then written
> > over SMB to another pool on the same machine over 10GbE.
> >
> > I nearly went out of my mind trying to figure out what was going on.
> > I'll spare you the gory details, but I stumbled across this PR
> > https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=187594 and as I read
>
> So this is "the Karl Denninger ZFS patch"....
> I tried to follow the discussion at the time, keeping it in the back
> of my head.....
> I concluded that the ideas were sort of accepted, but a different
> solution was implemented?
>
> > through it little light bulbs started coming on. Luckily it was easy
> > for me to reproduce the problem, so I kicked off the backups and
> > watched the system memory. Wired would grow, ARC would shrink, and
> > then the system would start swapping. If I stopped the IO right then
> > it would recover after a while. But if I let it go it would always
> > panic, and half the time it would be the same message as yours. So I
> > applied the patch from that PR, rebooted, and kicked off the backup.
> > No more panic. Recently I rebuilt a vanilla kernel from stable/10 but
> > explicitly set vfs.zfs.arc_max to 24G (I have 32G), ran my torture
> > tests, and it is stable.
>
> So you've (almost) answered my question, but English is not my native
> language, hence my question for certainty: you did not add the patch
> to your recently built stable/10 kernel...
>
> > So I don't want to send you on a wild goose chase, but it's entirely
> > possible this problem you are having is not hardware related at all,
> > but is a memory starvation issue related to the ARC under periods of
> > heavy activity.
>
> Well, rsync will do that for you... And since a few months I've also
> loaded some iSCSI zvols as remote disks for some Windows stations.
>
> Your suggestions are highly appreciated. Especially since I do not have
> spare PCI-X parts... (If the current hardware blows up, I'm getting
> modern new stuff.) So other than checking some cabling and the like,
> there is very little I could swap.
>
> Thanx,
> --WjW
>
> > On Sun, Jun 21, 2015 at 6:43 PM, Willem Jan Withagen wrote:
> >
> > > On 21/06/2015 21:50, Tom Curry wrote:
> > > > Was there by chance a lot of disk activity going on when this
> > > > occurred?
> > >
> > > Define 'a lot'??
> > > But very likely, since the system is also a backup location for
> > > several external services which back up through rsync. And they can
> > > generate quite some traffic. Next to the fact that it also serves an
> > > NVR with a ZVOL through iSCSI...
> > >
> > > --WjW
> > >
> > > > On Sun, Jun 21, 2015 at 10:00 AM, Willem Jan Withagen
> > > > <wjw@digiware.nl> wrote:
> > > >
> > > > > On 20/06/2015 18:11, Daryl Richards wrote:
> > > > > > Check the failmode setting on your pool. From man zpool:
> > > > > >
> > > > > >     failmode=wait | continue | panic
> > > > > >
> > > > > >         Controls the system behavior in the event of
> > > > > >         catastrophic pool failure. This condition is typically
> > > > > >         a result of a loss of connectivity to the underlying
> > > > > >         storage device(s) or a failure of all devices within
> > > > > >         the pool. The behavior of such an event is determined
> > > > > >         as follows:
> > > > > >
> > > > > >         wait      Blocks all I/O access until the device
> > > > > >                   connectivity is recovered and the errors are
> > > > > >                   cleared. This is the default behavior.
> > > > > >
> > > > > >         continue  Returns EIO to any new write I/O requests but
> > > > > >                   allows reads to any of the remaining healthy
> > > > > >                   devices. Any write requests that have yet to
> > > > > >                   be committed to disk would be blocked.
> > > > > >
> > > > > >         panic     Prints out a message to the console and
> > > > > >                   generates a system crash dump.
> > > > >
> > > > > 'mmm
> > > > >
> > > > > Did not know about this setting. Nice one, but alas my current
> > > > > settings are:
> > > > >
> > > > >     zfsboot  failmode  wait  default
> > > > >     zfsraid  failmode  wait  default
> > > > >
> > > > > So either the setting is not working, or something else is up?
> > > > > Is waiting only meant to wait a limited time, and then panic
> > > > > anyways?
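For what it's worth, checking and changing this is quick; a small sketch,
using your pool names from the listing above, with failmode=continue only
as an example of the syntax, not a recommendation:

----
# show the current failmode of both pools
zpool get failmode zfsboot zfsraid
# example only: have the raid pool return EIO instead of blocking forever
zpool set failmode=continue zfsraid
----

Whether a single hung vdev even counts as the "catastrophic pool failure"
the man page describes is a separate question, as you ask further down.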
> > > > > But then still I wonder why, even in the 'continue' case, the
> > > > > ZFS system ends up in a state where the filesystem is not able
> > > > > to continue its standard functioning (read and write) and
> > > > > disconnects the disk???
> > > > >
> > > > > All failmode settings result in a seriously handicapped system...
> > > > > On a raidz2 system I would perhaps have expected this to occur
> > > > > when the second disk goes into thin space??
> > > > >
> > > > > The other question is: the man page talks about 'Controls the
> > > > > system behavior in the event of catastrophic pool failure'.
> > > > > And is a hung disk a 'catastrophic pool failure'?
> > > > >
> > > > > Still very puzzled?
> > > > >
> > > > > --WjW
> > > > >
> > > > > > On 2015-06-20 10:19 AM, Willem Jan Withagen wrote:
> > > > > > > Hi,
> > > > > > >
> > > > > > > Found my system rebooted this morning:
> > > > > > >
> > > > > > > Jun 20 05:28:33 zfs kernel: sonewconn: pcb 0xfffff8011b6da498: Listen
> > > > > > > queue overflow: 8 already in queue awaiting acceptance (48 occurrences)
> > > > > > > Jun 20 05:28:33 zfs kernel: panic: I/O to pool 'zfsraid' appears to be
> > > > > > > hung on vdev guid 18180224580327100979 at '/dev/da0'.
> > > > > > > Jun 20 05:28:33 zfs kernel: cpuid = 0
> > > > > > > Jun 20 05:28:33 zfs kernel: Uptime: 8d9h7m9s
> > > > > > > Jun 20 05:28:33 zfs kernel: Dumping 6445 out of 8174
> > > > > > > MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
> > > > > > >
> > > > > > > Which leads me to believe that /dev/da0 went out on vacation,
> > > > > > > leaving ZFS in trouble.... But the array is:
> > > > > > > ----
> > > > > > > NAME               SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP   HEALTH  ALTROOT
> > > > > > > zfsraid           32.5T  13.3T  19.2T         -     7%    41%  1.00x   ONLINE  -
> > > > > > >   raidz2          16.2T  6.67T  9.58T         -     8%    41%
> > > > > > >     da0               -      -      -         -      -      -
> > > > > > >     da1               -      -      -         -      -      -
> > > > > > >     da2               -      -      -         -      -      -
> > > > > > >     da3               -      -      -         -      -      -
> > > > > > >     da4               -      -      -         -      -      -
> > > > > > >     da5               -      -      -         -      -      -
> > > > > > >   raidz2          16.2T  6.67T  9.58T         -     7%    41%
> > > > > > >     da6               -      -      -         -      -      -
> > > > > > >     da7               -      -      -         -      -      -
> > > > > > >     ada4              -      -      -         -      -      -
> > > > > > >     ada5              -      -      -         -      -      -
> > > > > > >     ada6              -      -      -         -      -      -
> > > > > > >     ada7              -      -      -         -      -      -
> > > > > > >   mirror           504M  1.73M   502M         -    39%     0%
> > > > > > >     gpt/log0          -      -      -         -      -      -
> > > > > > >     gpt/log1          -      -      -         -      -      -
> > > > > > > cache                 -      -      -         -      -      -
> > > > > > >   gpt/raidcache0   109G  1.34G   107G         -     0%     1%
> > > > > > >   gpt/raidcache1   109G   787M   108G         -     0%     0%
> > > > > > > ----
> > > > > > >
> > > > > > > And thus I would have expected that ZFS would disconnect /dev/da0
> > > > > > > and then switch to the DEGRADED state and continue, letting the
> > > > > > > operator fix the broken disk.
> > > > > > > Instead it chooses to panic, which is not a nice thing to do. :)
> > > > > > >
> > > > > > > Or do I have too high hopes of ZFS?
> > > > > > >
> > > > > > > Next question to answer is why this WD RED on:
> > > > > > >
> > > > > > > arcmsr0@pci0:7:14:0: class=0x010400 card=0x112017d3 chip=0x112017d3 rev=0x00 hdr=0x00
> > > > > > >     vendor   = 'Areca Technology Corp.'
> > > > > > >     device   = 'ARC-1120 8-Port PCI-X to SATA RAID Controller'
> > > > > > >     class    = mass storage
> > > > > > >     subclass = RAID
> > > > > > >
> > > > > > > got hung, and nothing for this shows in SMART....
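One last thought on the SMART angle: since the disks sit behind the Areca
controller, smartctl has to be pointed at the controller device and told
which port to query. A rough sketch, assuming smartmontools is installed
and guessing that da0 is on port 1 (adjust the port number to your wiring):

----
# query SMART data for the disk on Areca port 1
smartctl -a -d areca,1 /dev/arcmsr0
----

A drive can stall on the bus without logging anything SMART considers an
error, so a clean SMART report does not necessarily clear the disk.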