From: Daryl Richards <daryl@isletech.net>
Date: Sat, 20 Jun 2015 12:11:41 -0400
To: freebsd-fs@freebsd.org
Subject: Re: This diskfailure should not panic a system, but just disconnect disk from ZFS
Message-ID: <558590BD.40603@isletech.net>
In-Reply-To: <5585767B.4000206@digiware.nl>

Check the failmode setting on your pool. From man zpool:

     failmode=wait | continue | panic

         Controls the system behavior in the event of catastrophic pool
         failure. This condition is typically a result of a loss of
         connectivity to the underlying storage device(s) or a failure of
         all devices within the pool. The behavior of such an event is
         determined as follows:

         wait        Blocks all I/O access until the device connectivity
                     is recovered and the errors are cleared. This is the
                     default behavior.

         continue    Returns EIO to any new write I/O requests but allows
                     reads to any of the remaining healthy devices. Any
                     write requests that have yet to be committed to disk
                     would be blocked.

         panic       Prints out a message to the console and generates a
                     system crash dump.
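For reference, a quick sketch of how you could inspect and change the
property with zpool get/set. I'm using the pool name 'zfsraid' from your
panic message; the output below is illustrative, not taken from your
system:

    # zpool get failmode zfsraid
    NAME     PROPERTY  VALUE  SOURCE
    zfsraid  failmode  wait   default

    # zpool set failmode=continue zfsraid

The second command switches the pool from the default 'wait' to
'continue'. The property takes effect immediately and is stored in the
pool, so it survives reboots and export/import.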
On 2015-06-20 10:19 AM, Willem Jan Withagen wrote:
> Hi,
>
> Found my system rebooted this morning:
>
> Jun 20 05:28:33 zfs kernel: sonewconn: pcb 0xfffff8011b6da498: Listen
> queue overflow: 8 already in queue awaiting acceptance (48 occurrences)
> Jun 20 05:28:33 zfs kernel: panic: I/O to pool 'zfsraid' appears to be
> hung on vdev guid 18180224580327100979 at '/dev/da0'.
> Jun 20 05:28:33 zfs kernel: cpuid = 0
> Jun 20 05:28:33 zfs kernel: Uptime: 8d9h7m9s
> Jun 20 05:28:33 zfs kernel: Dumping 6445 out of 8174
> MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
>
> Which leads me to believe that /dev/da0 went out on vacation, leaving
> ZFS in trouble....
>
> But the array is:
> ----
> NAME               SIZE  ALLOC   FREE  EXPANDSZ  FRAG  CAP  DEDUP  HEALTH  ALTROOT
> zfsraid           32.5T  13.3T  19.2T         -    7%  41%  1.00x  ONLINE  -
>   raidz2          16.2T  6.67T  9.58T         -    8%  41%
>     da0               -      -      -         -     -    -
>     da1               -      -      -         -     -    -
>     da2               -      -      -         -     -    -
>     da3               -      -      -         -     -    -
>     da4               -      -      -         -     -    -
>     da5               -      -      -         -     -    -
>   raidz2          16.2T  6.67T  9.58T         -    7%  41%
>     da6               -      -      -         -     -    -
>     da7               -      -      -         -     -    -
>     ada4              -      -      -         -     -    -
>     ada5              -      -      -         -     -    -
>     ada6              -      -      -         -     -    -
>     ada7              -      -      -         -     -    -
>   mirror           504M  1.73M   502M         -   39%   0%
>     gpt/log0          -      -      -         -     -    -
>     gpt/log1          -      -      -         -     -    -
> cache                 -      -      -         -     -    -
>   gpt/raidcache0   109G  1.34G   107G         -    0%   1%
>   gpt/raidcache1   109G   787M   108G         -    0%   0%
> ----
>
> And thus I would have expected that ZFS would disconnect /dev/da0 and
> then switch to DEGRADED state and continue, letting the operator fix
> the broken disk.
> Instead it chooses to panic, which is not a nice thing to do. :)
>
> Or do I have too high hopes of ZFS?
>
> Next question to answer is why this WD RED on:
>
> arcmsr0@pci0:7:14:0: class=0x010400 card=0x112017d3 chip=0x112017d3
> rev=0x00 hdr=0x00
>     vendor   = 'Areca Technology Corp.'
>     device   = 'ARC-1120 8-Port PCI-X to SATA RAID Controller'
>     class    = mass storage
>     subclass = RAID
>
> got hung, and nothing for this shows in SMART....
>
> Thanx,
> --WjW
>
> (If needed vmcore available)

-- 
Daryl Richards
Isle Technical Services Inc.