From: Michael Shuey
To: dg17@penx.com
Cc: freebsd-fs@freebsd.org
Date: Sun, 19 Feb 2012 08:28:41 -0500
Subject: Re: ZFS size reduced, 100% full, on fbsd9 upgrade
In-Reply-To: <1329595563.42839.28.camel@btw.pki2.com>

Okay,
today's lesson: When you replace a disk with a bigger drive, and it
increases your raidz2 pool's capacity, ALWAYS run a zpool scrub before
doing anything else.

I rebooted back to 8.2p6, ran a (somewhat longer than normal) scrub,
then rebooted into 9.0. Seems fine now, and it is finishing its
freebsd-update. Weird... but at least it works.

On Sat, Feb 18, 2012 at 3:06 PM, Dennis Glatting wrote:
> I'm not a ZFS wiz but...
>
>
> On Sat, 2012-02-18 at 10:25 -0500, Michael Shuey wrote:
>> I'm upgrading a server from 8.2p6 to 9.0-RELEASE, and I've tried both
>> make in the source tree and freebsd-update and I get the same strange
>> result. As soon as I boot to the fbsd9 kernel, even booting into
>> single-user mode, the pool's size is greatly reduced. All filesystems
>> show 100% full (0 bytes free space), nothing can be written to the
>> pool (probably a side-effect of being 100% full), and dmesg shows
>> several "Solaris: WARNING: metaslab_free_dva(): bad DVA
>> 0:5978620460544" warnings (with different numbers). Switching kernels
>> back to the 8.2p6 kernel restores things to normal, but I'd really
>> like to finish my fbsd9 upgrade.
>>
>> The system is a 64-bit Intel box with 4 GB of memory, and 8 disks in a
>> raidz2 pool called "pool". It's booted to the 8.2p6 kernel now, and
>> scrubbing the pool, but last time I did this (roughly a week ago) it
>> was fine. / is a gmirror, but /usr, /tmp, and /var all come from the
>> pool. Normally, the pool has 1.2 TB of free space, and is version 15
>> (zfs version 4). Some disks are WD drives, with 4k native sectors,
>> but some time ago I rebuilt the pool to use a native 4k sector size
>> (ashift=12).
>>
>
> I believe 4GB of memory is the minimum. More is better. When you use the
> minimum of anything, expect dodginess.
>
> You should upgrade your pool -- bug fixes and all that.
>
> Are all the disks 4k sectors? I found that a mix of 512 and 4k works but
> performance is best when they are all the same. I have also found 512
> emulation isn't a believable choice when looking at performance (i.e.,
> set for 4k).
>
> Different people have different opinions but I personally do not use ZFS
> for the OS, rather I RAID1 the OS. The question you have to ask is
> if /usr goes kablewie whether you have the skills to put it back
> together. I do not, so "simple" (i.e., hardware RAID1) for the OS is
> good for me -- it isn't the OS that's being worked in my setups, rather
> the data areas.
>
>
>> Over time, I've been slowly replacing disks (1 at a time) to increase
>> the free space in the pool. Also, the system experienced severe
>> failure recently; the power supply blew, and took out the memory (and
>> presumably motherboard). I replaced these last week with known-good
>> board/memory/processor/PS, and it's been running fine since.
>>
>
> Expect mixed results with mixed disks, at least from my experience,
> particularly when it comes to performance.
>
> Is the MB the same? I have had mixed results. I find the Gigabyte boards
> work well but ASUS dodgy when it comes to high interrupt handling.
> Server boards with ECC memory are the most reliable.
>
>
>> Any suggestions? Is it possible I've got some nasty pool corruption
>> going on - and if so, how do I go about fixing it? Any advice would
>> be appreciated. This is a backup server, so I could rebuild its
>> contents from the primary, but I'd rather fix it if possible (since I
>> want to do a fbsd9 upgrade on the primary next).
>
> I screw around with my setups. What I found is rebuilding the pool
> (when I screw it up) is the least troublesome approach.
>
> Recently I found a bad tray on one of my servers. Drove me nuts for two
> weeks. It could be a loose cable, or bad cable, or crimped cable, but I
> am not yet in the position to open the case. Most of my ZFS weirdnesses
> have been hardware related.
>
> It could be your blowout impacted your disks or wiring. Do you SMART? I
> found, generally, SMART is goodness but I presently have a question mark
> when it comes to the Hitachi 4TB disks (I misbehaved on that system so
> the issue could be my own; however on another system there weren't any
> errors).
>
> I have found, when I have multiple, identical controllers, that the same
> firmware across the controllers is a good approach, otherwise weirdness
> ensues, and different MBs manifest this problem in different ways. Also,
> make sure your MB's BIOS is recent.
>
> YMMV
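
For anyone trying to make sense of the "bad DVA" warnings quoted above,
here is a minimal Python sketch that pulls the two numbers out of a
dmesg line. It assumes the pair is a top-level vdev index followed by a
byte offset into that vdev, which is not stated anywhere in this thread.
The names parse_bad_dva and offset_tib are made up for illustration.

```python
import re

# Assumption: the warning has the form "bad DVA <vdev>:<offset>", where
# <offset> is a byte offset within the top-level vdev.
DVA_RE = re.compile(r"bad DVA (\d+):(\d+)")


def parse_bad_dva(line):
    """Return (vdev, offset_bytes) from a dmesg line, or None if absent."""
    match = DVA_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def offset_tib(offset_bytes):
    """Convert a byte offset to TiB (2**40 bytes)."""
    return offset_bytes / 2**40


if __name__ == "__main__":
    # The warning text reported earlier in this thread.
    line = "Solaris: WARNING: metaslab_free_dva(): bad DVA 0:5978620460544"
    vdev, offset = parse_bad_dva(line)
    print(f"vdev {vdev}, offset {offset_tib(offset):.2f} TiB")
```

If the assumption holds, comparing the reported offsets against the size
each kernel reports for the vdev may show whether the 9.0 kernel was
seeing a smaller vdev than the one the data had been written to.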