From: Freddie Cash <fjwcash@gmail.com>
To: freebsd-stable <freebsd-stable@freebsd.org>
Date: Mon, 19 Jul 2010 09:15:21 -0700
Subject: Re: Problems replacing failing drive in ZFS pool

On Mon, Jul 19, 2010 at 8:56 AM, Garrett Moore wrote:
> So you think it's because when I switch from the old disk to the new
> disk, ZFS doesn't realize the disk has changed, and thinks the data is
> just corrupt now? Even if that happens, shouldn't the pool still be
> available, since it's RAIDZ1 and only one disk has gone away?

I think it's because you pull the old drive, boot with the new drive, the
controller renumbers all the devices (i.e. da3 is now da2, da2 is now da1,
da1 is now da0, da0 is now da6, etc.), and ZFS thinks that all the drives
have changed, which leaves the pool looking corrupt.

I've had this happen on our storage servers a couple of times before I
started using glabel(8) on all our drives (dead drive on the RAID
controller, remove the drive, reboot for whatever reason, all the device
nodes get renumbered, and everything goes kablooey).

Doing the export and import will force ZFS to re-read the metadata on the
drives (ZFS does its own "labelling" to record which drives belong to
which vdevs) and to pick things up correctly using the new device nodes.
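For reference, roughly what that looks like at the command line (the pool
name "tank" and the daX/label names below are just placeholders; adjust
for your own setup):

    # zpool export tank
    # zpool import tank

And when a new drive first goes into the pool, giving it a glabel(8) name
so the daX renumbering stops mattering, something like:

    # glabel label disk03 /dev/da3
    # zpool replace tank da2 label/disk03

(glabel stores its metadata in the last sector of the device, so label the
new drive before handing it to ZFS rather than relabelling disks that are
already in the pool.)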
> I don't have / on ZFS; I'm only using it as a 'data' partition, so I
> should be able to try your suggestion. My only concern: is there any
> risk of trashing my pool if I try your instructions? Everything I've
> done so far, even when told "insufficient replicas / corrupt data", has
> not cost me any data as long as I switch back to the original (dying)
> drive. If I mix in export/import statements which might 'touch' the
> pool, is there a chance it will choke and trash my data?

Well, there's always a chance things explode. :)  But an export/import is
safe so long as all the drives are connected at the time. I've recovered
"corrupted" pools by doing the above. (I've now switched to labelling all
my drives to prevent this from happening.)

Of course, always have good backups. ;)

-- 
Freddie Cash
fjwcash@gmail.com