Date: Sun, 19 Feb 2012 08:28:41 -0500
From: Michael Shuey <shuey@fmepnet.org>
To: dg17@penx.com
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS size reduced, 100% full, on fbsd9 upgrade
Message-ID: <CAELRr5k%2BvuN8G2BRigFT4%2BpmLergbcn_ybOV%2BSQj7KGDE-FEOw@mail.gmail.com>
In-Reply-To: <1329595563.42839.28.camel@btw.pki2.com>
References: <CAELRr5kPXjqTooLbjPC1oPB3e2TfRC=eE%2Bzvsu-tW54Pz42xFg@mail.gmail.com> <1329595563.42839.28.camel@btw.pki2.com>
Okay, today's lesson: when you replace a disk with a bigger drive and it increases your raidz2 pool's capacity, ALWAYS run a "zpool scrub <pool>" before doing anything else. I rebooted back to 8.2p6, ran a (somewhat longer than normal) scrub, rebooted, then booted back to 9.0. Seems fine now, and it is finishing its freebsd-update. Weird... but at least it works.

On Sat, Feb 18, 2012 at 3:06 PM, Dennis Glatting <dg17@penx.com> wrote:
> I'm not a ZFS wiz, but...
>
> On Sat, 2012-02-18 at 10:25 -0500, Michael Shuey wrote:
>> I'm upgrading a server from 8.2p6 to 9.0-RELEASE, and I've tried both
>> make in the source tree and freebsd-update, and I get the same strange
>> result. As soon as I boot the fbsd9 kernel, even booting into
>> single-user mode, the pool's size is greatly reduced. All filesystems
>> show 100% full (0 bytes free space), nothing can be written to the
>> pool (probably a side effect of being 100% full), and dmesg shows
>> several "Solaris: WARNING: metaslab_free_dva(): bad DVA
>> 0:5978620460544" warnings (with different numbers). Switching back to
>> the 8.2p6 kernel restores things to normal, but I'd really like to
>> finish my fbsd9 upgrade.
>>
>> The system is a 64-bit Intel box with 4 GB of memory and 8 disks in a
>> raidz2 pool called "pool". It's booted to the 8.2p6 kernel now and is
>> scrubbing the pool, but last time I did this (roughly a week ago) it
>> was fine. / is a gmirror, but /usr, /tmp, and /var all come from the
>> pool. Normally the pool has 1.2 TB of free space, and it is version 15
>> (zfs version 4). Some disks are WD drives with 4k native sectors, and
>> some time ago I rebuilt the pool to use a native 4k sector size
>> (ashift=12).
>
> I believe 4GB of memory is the minimum. More is better. When you use
> the minimum of anything, expect dodginess.
>
> You should upgrade your pool -- bug fixes and all that.
>
> Are all the disks 4k sectors?
> I found that a mix of 512 and 4k works, but performance is best when
> they are all the same. I have also found that 512 emulation isn't a
> believable choice when looking at performance (i.e., set for 4k).
>
> Different people have different opinions, but I personally do not use
> ZFS for the OS; rather, I RAID1 the OS. The question you have to ask
> is whether, if /usr goes kablooie, you have the skills to put it back
> together. I do not, so "simple" (i.e., hardware RAID1) for the OS is
> good for me -- it isn't the OS that's being worked in my setups,
> rather the data areas.
>
>> Over time, I've been slowly replacing disks (1 at a time) to increase
>> the free space in the pool. Also, the system experienced a severe
>> failure recently; the power supply blew, and took out the memory (and
>> presumably the motherboard). I replaced these last week with
>> known-good board/memory/processor/PS, and it's been running fine
>> since.
>
> Expect mixed results with mixed disks, at least in my experience,
> particularly when it comes to performance.
>
> Is the MB the same? I have had mixed results. I find the Gigabyte
> boards work well, but ASUS is dodgy when it comes to high interrupt
> handling. Server boards with ECC memory are the most reliable.
>
>> Any suggestions? Is it possible I've got some nasty pool corruption
>> going on -- and if so, how do I go about fixing it? Any advice would
>> be appreciated. This is a backup server, so I could rebuild its
>> contents from the primary, but I'd rather fix it if possible (since I
>> want to do a fbsd9 upgrade on the primary next).
>
> I screw around with my setups. What I found is that rebuilding the
> pool (when I screw it up) is the least troublesome approach.
>
> Recently I found a bad tray on one of my servers. Drove me nuts for
> two weeks. It could be a loose cable, a bad cable, or a crimped cable,
> but I am not yet in a position to open the case. Most of my ZFS
> weirdnesses have been hardware related.
> It could be that your blowout impacted your disks or wiring. Do you
> run SMART? I have found that, generally, SMART is goodness, but I
> presently have a question mark when it comes to the Hitachi 4TB disks
> (I misbehaved on that system, so the issue could be my own; however,
> on another system there weren't any errors).
>
> I have found, when I have multiple identical controllers, that the
> same firmware across the controllers is a good approach; otherwise,
> weirdness -- and different MBs manifest this problem in different
> ways. Also, make sure your MB's BIOS is recent.
>
> YMMV
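The scrub-after-replace lesson at the top of this message, plus Dennis's SMART suggestion, can be sketched as a short post-replacement checklist. This is a minimal sketch, not a definitive procedure: the pool name "pool" comes from the thread, but the device name /dev/ada0 is a hypothetical placeholder for the replaced drive, and the commands are guarded so the script is a harmless no-op on a machine without zpool or smartctl.

```shell
#!/bin/sh
# Post-disk-replacement checklist for a raidz2 pool (sketch).
# "pool" is the pool name from this thread; /dev/ada0 is a
# hypothetical device name -- substitute your replaced drive.
POOL=${POOL:-pool}
DISK=${DISK:-/dev/ada0}

if command -v zpool >/dev/null 2>&1; then
    # Verify every block's checksum before trusting the grown pool.
    zpool scrub "$POOL"

    # Re-check until the scrub line reads "... with 0 errors".
    zpool status -v "$POOL"
fi

if command -v smartctl >/dev/null 2>&1; then
    # smartctl comes from sysutils/smartmontools; look for reallocated
    # or pending sectors on drives that survived the PSU failure.
    smartctl -A "$DISK"
fi
```

One caution on Dennis's "upgrade your pool" advice: zpool upgrade is one-way, and a pool moved past version 15 can no longer be imported by the 8.2p6 kernel -- which would have removed the fallback path used above to recover.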