Date: Tue, 14 Jul 2009 23:39:47 +0100 (BST)
From: Chris Hedley
To: freebsd-current@freebsd.org
Subject: ZFS pool corrupted on upgrade of -current (probably sata renaming)

[A short summary in advance
of my rambling: it seems that my ZFS pool got upset when the SATA drive IDs changed, and it nearly broke. I /assume/ this hasn't been discussed; I did look, but please accept my apologies if it's already a known issue.]

I sent a rather panicked message about this yesterday; fortunately I sent it to the wrong address, so here is a slightly more sober version of the same. :)

I experienced a rather worrying problem when updating from my c. Feb 2009 version of -current to a recent build: my ZFS pool was quite badly affected. Fortunately it hasn't /actually/ lost any data (yet), but I think I've been lucky in that regard, and I do feel like the Sword of Damocles is hanging over me until I've moved it somewhere safe(r).

In more detail, I had a raidz2 pool spread across eight of my 10 SATA discs, using the same "h" partition of the BSD table on each disc, which I'd installed in "dangerously dedicated" mode. This had been working fine since the outset, and also survived the ZFS update around the beginning of the year with no problems. This time, however, things got extremely hairy: two of the component discs disappeared altogether, ad12 and ad22 in the new parlance, which would appear to be ad4 and ad6 in the old.
This is perhaps significant, as the two discs using the names ad4 and ad6 in the new nomenclature (formerly ad0 and ad1 respectively) were also reporting IO errors. I thought I'd had it, as there's no way a raidz2 can survive four disc failures. But ad4 and ad6 are the two drive names shared between the old and the new numbering schemes and, as mentioned, the "missing" discs, ad12 and ad22, are the "old" ad4 and ad6. I'm probably explaining this badly, so here's a table of the old and new names:

disc     old   new
----     ---   ---
disc 1:  ad0   ad4   - IO errors on "new" ad4
disc 2:  ad1   ad6   - IO errors on "new" ad6
disc 3:  ad2   ad8
disc 4:  ad3   ad10
disc 5:  ad4   ad12  - "old" ad4 (now ad12) removed from pool
disc 6:  ad5   ad20
disc 7:  ad6   ad22  - "old" ad6 (now ad22) removed from pool
disc 8:  ad7   ad24

In writing this down I think I can see clearly what the problem was, though I've been unable to find any mention of how to get ZFS to adapt to the drive names changing (maybe it's more obvious to ZFS veterans, but I'm not one of them!)

At present I'm moving my data off the ZFS array before it totally confuses itself and eats my stuff, and enjoying the feeling of being rather cold and clammy while my data's on non-redundant drives for the first time in years. I'll probably use a couple of big and simple gmirror arrays in the short term, but I'd like to rebuild my ZFS pool without worrying about the same thing happening again. Could anyone offer suggestions, or perhaps make ZFS a bit less dependent on FreeBSD's idea of what a disc is called, or at least point me at something I should've read that might have avoided all this happening in the first place?

Thanks,

Chris.
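
The name overlap described in the table above can be checked mechanically. What follows is only a minimal sketch and not part of the original report: the mapping is copied from the table, the function name `reused_names` is hypothetical, and it makes no claim about how ZFS itself records or looks up device paths. It simply lists the names that exist in both numbering schemes but now refer to a different physical disc.

```python
# Old -> new device names, copied from the table in this message.
OLD_TO_NEW = {
    "ad0": "ad4",
    "ad1": "ad6",
    "ad2": "ad8",
    "ad3": "ad10",
    "ad4": "ad12",
    "ad5": "ad20",
    "ad6": "ad22",
    "ad7": "ad24",
}


def reused_names(old_to_new):
    """Return names present in both schemes that now point at a different disc.

    Hypothetical helper for illustration only.
    """
    # Invert the mapping so a new name can be traced back to its old name.
    new_to_old = {new: old for old, new in old_to_new.items()}
    reused = {}
    for name in sorted(set(old_to_new) & set(new_to_old)):
        reused[name] = {
            # Old name of the disc that answers to this name after the upgrade.
            "disc_now_here": new_to_old[name],
            # New name of the disc that answered to this name before the upgrade.
            "disc_moved_to": old_to_new[name],
        }
    return reused


if __name__ == "__main__":
    for name, info in reused_names(OLD_TO_NEW).items():
        print(
            f"{name}: now the disc formerly called {info['disc_now_here']}; "
            f"the disc that used to be {name} is now {info['disc_moved_to']}"
        )
```

Only ad4 and ad6 come out as reused, and the discs they used to name are now ad12 and ad22, which matches the four discs that showed IO errors or dropped out of the pool.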