In-Reply-To: <1329595563.42839.28.camel@btw.pki2.com>
References: <CAELRr5kPXjqTooLbjPC1oPB3e2TfRC=eE+zvsu-tW54Pz42xFg@mail.gmail.com>
	<1329595563.42839.28.camel@btw.pki2.com>
Date: Sun, 19 Feb 2012 08:28:41 -0500
Message-ID: <CAELRr5k+vuN8G2BRigFT4+pmLergbcn_ybOV+SQj7KGDE-FEOw@mail.gmail.com>
From: Michael Shuey <shuey@fmepnet.org>
To: dg17@penx.com
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS size reduced, 100% full, on fbsd9 upgrade

Okay, today's lesson: When you replace a disk with a bigger drive, and
it increases your raidz2's pool capacity, ALWAYS run a zpool scrub
<pool> before doing anything else.
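
For anyone scripting this, here is a minimal sketch of that step in
Python. The helper names (scrub_commands, run_scrub) are made up for
illustration; the only real commands involved are "zpool scrub" and
"zpool status". It only prints the commands if zpool isn't installed.

```python
import shutil
import subprocess

def scrub_commands(pool):
    """Return the argv lists for starting a scrub and then checking on it."""
    return [
        ["zpool", "scrub", pool],   # start the scrub
        ["zpool", "status", pool],  # report scrub progress and any errors
    ]

def run_scrub(pool):
    """Run the commands if zpool is installed; otherwise just print them."""
    have_zpool = shutil.which("zpool") is not None
    for argv in scrub_commands(pool):
        if have_zpool:
            subprocess.run(argv, check=True)
        else:
            print("would run:", " ".join(argv))
```

Calling run_scrub("pool") would cover my pool; the scrub itself runs in
the background, so "zpool status" has to be re-run to see when it is
done.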

I rebooted back to 8.2p6, ran a (somewhat longer than normal) scrub,
rebooted, then booted back to 9.0.  Seems fine now, and is finishing
its freebsd-update.  Weird... but at least it works.
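
In case it helps someone searching the archives, a small Python sketch
for pulling the numbers out of the "bad DVA" warnings quoted below.
It assumes the two fields are a vdev id and a byte offset within that
vdev, which is my reading of the message and not something confirmed in
this thread. The function names are hypothetical. The last helper just
shows the ashift arithmetic (ashift=12 means 2^12 = 4096-byte sectors).

```python
import re

# Assumed message layout, copied from the quoted dmesg warning:
#   metaslab_free_dva(): bad DVA <vdev>:<offset>
DVA_RE = re.compile(r"bad DVA (\d+):(\d+)")

def parse_bad_dva(line):
    """Return (vdev, offset) as ints, or None if the line has no bad-DVA text."""
    m = DVA_RE.search(line)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))

def offset_in_tib(offset):
    """Convert a byte offset to TiB for easier comparison with disk sizes."""
    return offset / 2**40

def sector_size(ashift):
    """ashift is the base-2 log of the sector size the pool was built with."""
    return 1 << ashift
```

Comparing offset_in_tib() of each warning against the size the pool
reports under each kernel is one way to see whether the offsets fall
past what the kernel thinks the vdev holds.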


On Sat, Feb 18, 2012 at 3:06 PM, Dennis Glatting <dg17@penx.com> wrote:
> I'm not a ZFS wiz but...
>
>
> On Sat, 2012-02-18 at 10:25 -0500, Michael Shuey wrote:
>> I'm upgrading a server from 8.2p6 to 9.0-RELEASE, and I've tried both
>> make in the source tree and freebsd-update and I get the same strange
>> result.  As soon as I boot to the fbsd9 kernel, even booting into
>> single-user mode, the pool's size is greatly reduced.  All filesystems
>> show 100% full (0 bytes free space), nothing can be written to the
>> pool (probably a side-effect of being 100% full), and dmesg shows
>> several "Solaris: WARNING: metaslab_free_dva(): bad DVA
>> 0:5978620460544" warnings (with different numbers).  Switching kernels
>> back to the 8.2p6 kernel restores things to normal, but I'd really
>> like to finish my fbsd9 upgrade.
>>
>> The system is a 64-bit Intel box with 4 GB of memory, and 8 disks in a
>> raidz2 pool called "pool".  It's booted to the 8.2p6 kernel now, and
>> scrubbing the pool, but last time I did this (roughly a week ago) it
>> was fine.  / is a gmirror, but /usr, /tmp, and /var all come from the
>> pool.  Normally, the pool has 1.2 TB of free space, and is version 15
>> (zfs version 4).  Some disks are WD drives, with 4k native sectors,
>> but some time ago I rebuilt the pool to use a native 4k sector size
>> (ashift=12).
>>
>
> I believe 4GB of memory is the minimum. More is better. When you use the
> minimum of anything, expect dodginess.
>
> You should upgrade your pool -- bug fixes and all that.
>
> Are all the disks 4k sectors? I found that a mix of 512 and 4k works, but
> performance is best when they are all the same. I have also found 512
> emulation isn't a believable choice when looking at performance (i.e.,
> set for 4k).
>
> Different people have different opinions but I personally do not use ZFS
> for the OS, rather I RAID1 the OS. The question you have to ask is
> if /usr goes kablewie whether you have the skills to put it back
> together. I do not, so "simple" (i.e., hardware RAID1) for the OS is
> good for me -- it isn't the OS that's being worked in my setups, rather
> the data areas.
>
>
>> Over time, I've been slowly replacing disks (1 at a time) to increase
>> the free space in the pool.  Also, the system experienced severe
>> failure recently; the power supply blew, and took out the memory (and
>> presumably motherboard).  I replaced these last week with known-good
>> board/memory/processor/PS, and it's been running fine since.
>>
>
> Expect mixed results with mixed disks, at least from my experience,
> particularly when it comes to performance.
>
> Is the MB the same? I have had mixed results. I find the Gigabyte boards
> work well but ASUS dodgy when it comes to high interrupt handling.
> Server boards with ECC memory are the most reliable.
>
>
>> Any suggestions?  Is it possible I've got some nasty pool corruption
>> going on - and if so, how do I go about fixing it?  Any advice would
>> be appreciated.  This is a backup server, so I could rebuild its
>> contents from the primary, but I'd rather fix it if possible (since I
>> want to do a fbsd9 upgrade on the primary next).
>
> I screw around with my set ups. What I found is rebuilding the pool
> (when I screw it up) is the least troublesome approach.
>
> Recently I found a bad tray on one of my servers. Drove me nuts for two
> weeks. It could be a loose cable, or bad cable, or crimped cable, but I
> am not yet in the position to open the case. Most of my ZFS weirdnesses
> have been hardware related.
>
> It could be your blowout impacted your disks or wiring. Do you SMART? I
> found, generally, SMART is goodness but I presently have a question mark
> when it comes to the Hitachi 4TB disks (I misbehaved on that system so
> the issue could be my own; however on another system there weren't any
> errors).
>
> I have found, when I have multiple, identical controllers, that the same
> firmware across the controllers is a good approach; otherwise you get
> weirdness, and different MBs manifest this problem in different ways.
> Also, make sure your MB's BIOS is recent.
>
> YMMV