From owner-freebsd-fs@FreeBSD.ORG  Sun May  6 02:13:21 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 58B73106566B
	for <freebsd-fs@freebsd.org>; Sun,  6 May 2012 02:13:21 +0000 (UTC)
	(envelope-from hackish@gmail.com)
Received: from mail-bk0-f54.google.com (mail-bk0-f54.google.com
	[209.85.214.54])
	by mx1.freebsd.org (Postfix) with ESMTP id D6CFD8FC16
	for <freebsd-fs@freebsd.org>; Sun,  6 May 2012 02:13:20 +0000 (UTC)
Received: by bkvi17 with SMTP id i17so4221505bkv.13
	for <freebsd-fs@freebsd.org>; Sat, 05 May 2012 19:13:19 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
	h=mime-version:date:message-id:subject:from:to:content-type;
	bh=lW4bEV4p2/xLUVAiTlx/gbw6bu3FtLoYqan4YYBG8nk=;
	b=wf8MfXwXQjUWNmPooveiEokbQz1wJ3Kp80r37UDSDCVZIpCXvhiuDw3KV7hNqN+sCZ
	nf43tT1E8oHmAFhELgSS4mjQ29AcjYrnoDQCuygaewogornbmp69q/qdyoqIPKjrYe2s
	U2F41l0WhxYsMf6BXGB93ejv0pLHhJ7TEudw50mL8uFddWhPFqRUM9KvdSP/fVmcaL2H
	CZcS78Q2sBSOXNYwRw86fk7Ozt/5ZPlRM7thgkr+0+41ZGkdSWmMuDjf1BOz87IvCkVa
	m/rY45CuQzZVy/FdozcuIwYvnVbilYPnDJh9krcY5ku9uJEuURpRe5FHBmAbIz3ZaNi2
	vpoQ==
MIME-Version: 1.0
Received: by 10.205.131.7 with SMTP id ho7mr2369820bkc.62.1336270399710; Sat,
	05 May 2012 19:13:19 -0700 (PDT)
Received: by 10.205.82.205 with HTTP; Sat, 5 May 2012 19:13:19 -0700 (PDT)
Date: Sat, 5 May 2012 22:13:19 -0400
Message-ID: <CAPUouH3zgnGdzbe=0x4M32_1D-9J-E=_y-BP1zhyu-axBxsjwA@mail.gmail.com>
From: Michael Richards <hackish@gmail.com>
To: freebsd-fs@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
Subject: ZFS Kernel Panics with 32 and 64 bit versions of 8.3 and 9.0
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 06 May 2012 02:13:21 -0000

Originally I had an 8.1 server setup on a 32bit kernel. The OS is on a
UFS filesystem and (it's a mail server) the business part of the
operation is on ZFS.

One day it crashed with an odd kernel panic. I assumed it was a memory
issue so I had more RAM installed. I tried to get a PAE kernel working
to use this extra ram but it was crashing every few hours.

Suspecting a hardware issue all the hardware was replaced.

I had some difficulty trying to figure out how to mount my old ZFS
partition but eventually did so.

zpool import shows this:
pool: email
    id: 10433152746165646153
 state: ONLINE
status: The pool was last accessed by another system.
action: The pool can be imported using its name or numeric identifier and
        the '-f' flag.
   see: http://www.sun.com/msg/ZFS-8000-EY
config:

        email       ONLINE
          ada1s1g   ONLINE

zpool import -f -R /altroot 10433152746165646153 olddata
panics the kernel. Similar panic as seen in all the other kernel versions.

http://forums.freebsd.org/attachment.php?attachmentid=1545&stc=1&d=1336261809

Shows a typical kernel panic.

http://forums.freebsd.org/showthread.php?t=31820

Gives a bit more info about things I've tried. Whatever it is seems to
affect a wide variety of kernels.

Any ideas?

From owner-freebsd-fs@FreeBSD.ORG  Sun May  6 06:11:03 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 39171106564A
	for <freebsd-fs@freebsd.org>; Sun,  6 May 2012 06:11:03 +0000 (UTC)
	(envelope-from artemb@gmail.com)
Received: from mail-lpp01m010-f54.google.com (mail-lpp01m010-f54.google.com
	[209.85.215.54])
	by mx1.freebsd.org (Postfix) with ESMTP id A281D8FC0C
	for <freebsd-fs@freebsd.org>; Sun,  6 May 2012 06:11:02 +0000 (UTC)
Received: by lagv3 with SMTP id v3so3868435lag.13
	for <freebsd-fs@freebsd.org>; Sat, 05 May 2012 23:11:01 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
	h=mime-version:sender:in-reply-to:references:date
	:x-google-sender-auth:message-id:subject:from:to:cc:content-type;
	bh=E0VW8wKTRqXrRME1J4cytVfm//mysDvG+hl2n99rewc=;
	b=A5pKWOathTHKqv/vgQnb9evXrOBgkc0B0qjhtBv0CAekyTg11unxOqesF4qy4kLh/f
	j7O0YvHeh0tboB62biEvwkvXIz8n0y0eLnlZ28Lee7sq9/lOYSVc6AINoLj0rCdh6vAo
	hlalPbCA4q0coNH45Cpoqxa0W+oTOXq1mKzGLKhxKZpE1nEqAlk1DeIxbsQNcZOlJkDg
	/ef4kbh7DHygiu1ywX+9J6Gz+V80p68fqMlIKrndYfY0EJuKjGH1agErnLub7tzOE+b+
	9jvYmyMxVGnQC0CmtHHnBI5jDklgtBAz0E+FDYvIgsnlUh8ehZ+veeX95V0wwcx3BYKp
	05dw==
MIME-Version: 1.0
Received: by 10.152.129.74 with SMTP id nu10mr10414024lab.50.1336284661240;
	Sat, 05 May 2012 23:11:01 -0700 (PDT)
Sender: artemb@gmail.com
Received: by 10.112.2.5 with HTTP; Sat, 5 May 2012 23:11:01 -0700 (PDT)
In-Reply-To: <CAPUouH3zgnGdzbe=0x4M32_1D-9J-E=_y-BP1zhyu-axBxsjwA@mail.gmail.com>
References: <CAPUouH3zgnGdzbe=0x4M32_1D-9J-E=_y-BP1zhyu-axBxsjwA@mail.gmail.com>
Date: Sat, 5 May 2012 23:11:01 -0700
X-Google-Sender-Auth: BWwuTikksOcymGUE1W1pSQIhiB8
Message-ID: <CAFqOu6gz+Fd-NvPivMz3nfeGCYz0a563yNBOpmsAyHZS_TQybQ@mail.gmail.com>
From: Artem Belevich <art@freebsd.org>
To: Michael Richards <hackish@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS Kernel Panics with 32 and 64 bit versions of 8.3 and 9.0
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 06 May 2012 06:11:03 -0000

I believe I've ran into this issue couple three times. In all cases
the culprit was memory corruption. If were to guess, corruption
damaged critical data *before* ZFS calculated checksum and was able to
write it to disk. Once that happened, kernel would panic every time
once the pool was in use. Crashes could happen as soon as zpool import
or as late as after few days of uptime or next scheduled scrub. I even
tried importing/scrubbing the pool on opensolaris without much success
-- while solaris didn't crash outright, it failed to import the pool
with internal assertion.

On Sat, May 5, 2012 at 7:13 PM, Michael Richards <hackish@gmail.com> wrote:
> Originally I had an 8.1 server setup on a 32bit kernel. The OS is on a
> UFS filesystem and (it's a mail server) the business part of the
> operation is on ZFS.
>
> One day it crashed with an odd kernel panic. I assumed it was a memory
> issue so I had more RAM installed. I tried to get a PAE kernel working
> to use this extra ram but it was crashing every few hours.
>
> Suspecting a hardware issue all the hardware was replaced.

Bad memory could indeed do that.

> I had some difficulty trying to figure out how to mount my old ZFS
> partition but eventually did so.
...
> zpool import -f -R /altroot 10433152746165646153 olddata
> panics the kernel. Similar panic as seen in all the other kernel versions.


> Gives a bit more info about things I've tried. Whatever it is seems to
> affect a wide variety of kernels.

Kernel is just a messenger here. The root cause is that while ZFS does
go an extra mile or two in order to ensure data consistency, there's
only so much it can do if RAM is bad. Once that kind of problem
happened, it may leave the pool in a state that ZFS will not be able
to deal with out of the box.

Not everything may be lost, though.

First of all -- make a copy of your pool, if it's feasible.
Probability of screwing it up even more is rather high.

ZFS internally keeps large number of uberblocks. Each uberblock is
sort of a periodic checkpoint of the pool state after ZFS writes next
transaction group (every 10-40 sec, depending on vfs.zfs.txg.timeout
sysctl, more often if there are a lot of ongoing write activity).
Basically you need to destroy the most recent uberblock to manually
roll-back your ZFS pool. Hopefully, you'll only need to nuke few most
recent ones to restore the pool to the point before corruption ruined
it.

Now, ZFS keeps multiple copies of uberblocks. You will need to nuke
*all* instances of the most recent uberblock in order to roll pool
state backwards.

Solaris internals site seems to have a script to do that now (I wish I
knew about it back when I needed it):
http://www.solarisinternals.com/wiki/index.php/ZFS_forensics_scrollback_script

Good luck!

--Artem

From owner-freebsd-fs@FreeBSD.ORG  Sun May  6 06:21:50 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 06CA3106566C
	for <freebsd-fs@freebsd.org>; Sun,  6 May 2012 06:21:50 +0000 (UTC)
	(envelope-from skvortsov42@gmail.com)
Received: from mail-wi0-f172.google.com (mail-wi0-f172.google.com
	[209.85.212.172])
	by mx1.freebsd.org (Postfix) with ESMTP id 8D5918FC0A
	for <freebsd-fs@freebsd.org>; Sun,  6 May 2012 06:21:49 +0000 (UTC)
Received: by wibhj6 with SMTP id hj6so2063320wib.13
	for <freebsd-fs@freebsd.org>; Sat, 05 May 2012 23:21:43 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
	h=mime-version:date:message-id:subject:from:to:content-type;
	bh=jN+0c3MEpPch6G+qXL3FsLOp8agFTxxTLmAdrTM1+8s=;
	b=evMoQZOiQo2LuG30uFIZ+E0u1Bst10izPtEIgOVr6skZIod7cAy58oAw3xda57stUd
	8ivSTO5NLDiGd2rZvU2cNQMZm4mRR4HvDJI/pafSwFctd3QktqA0ITjTvKJx6mpXJ9B2
	V1tkWPfJFh8JMRbS36PUNeKG2ksZZFQki+uXVEh+SGCHm/+1vOH16Dl942isOtTt3S/D
	+tZJtbXU+cMlVU7zrtwFDDzQvDFmJvbt6huwRGgfEpUpRRSFwANW/F4UfmFv1UOtXMox
	aVvPVp6uTGrxC9KJFWYB6cw028qHuYxRpwzs2vTCUwj6yZNivnVTINDtvxtY7xrnS3Uf
	ywkw==
MIME-Version: 1.0
Received: by 10.216.198.154 with SMTP id v26mr7340909wen.74.1336285303545;
	Sat, 05 May 2012 23:21:43 -0700 (PDT)
Received: by 10.216.143.222 with HTTP; Sat, 5 May 2012 23:21:43 -0700 (PDT)
Date: Sat, 5 May 2012 23:21:43 -0700
Message-ID: <CA+Znuqw8YpOD3fV6+eoGLqH6J+pmafaw=c_iaMUNRK7TUe39+w@mail.gmail.com>
From: Chris <skvortsov42@gmail.com>
To: freebsd-fs@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
Subject: ZFS 4K drive overhead
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 06 May 2012 06:21:50 -0000

Hi all,

I'm planning on making a raidz2 with 6 2 TB drives - all 4K sectors,
all reporting as 512 bytes. I've been reading some disturbing things
about ZFS when used on 4K drives. In this discussion
(http://mail.opensolaris.org/pipermail/zfs-discuss/2011-October/049959.html),
Jim Klimov pointed out that when ZFS is used with ashift=12, the
metadata overhead for a filesystem with a lot of small files can reach
100% (http://mail.opensolaris.org/pipermail/zfs-discuss/2011-October/049960.html)!
That seems pretty bad to me. My questions are:

Does anyone on this list have experience using ZFS on 4K drives with
ashift=12? Is the overhead per file, such that having a relatively
large average filesize, say, 19 MB, would render it insignificant? Or
would the overhead be large regardless?

What is the speed penalty for using ashift=9 on the array? Is the
safety of the data on the array an issue  (due to how ZFS can't write
to a 512 byte sector but it's coded with the assumption that it can
thus making it no longer strictly copy-on-write)? Does anyone have any
experience with ashift=9 arrays on 4K drives?

Thanks in advance.

From owner-freebsd-fs@FreeBSD.ORG  Sun May  6 08:46:41 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 88E751065672
	for <freebsd-fs@freebsd.org>; Sun,  6 May 2012 08:46:41 +0000 (UTC)
	(envelope-from 000.fbsd@quip.cz)
Received: from elsa.codelab.cz (elsa.codelab.cz [94.124.105.4])
	by mx1.freebsd.org (Postfix) with ESMTP id 412BE8FC12
	for <freebsd-fs@freebsd.org>; Sun,  6 May 2012 08:46:40 +0000 (UTC)
Received: from elsa.codelab.cz (localhost [127.0.0.1])
	by elsa.codelab.cz (Postfix) with ESMTP id 0FA3B28424;
	Sun,  6 May 2012 10:46:34 +0200 (CEST)
Received: from [192.168.1.2] (static-84-242-120-26.net.upcbroadband.cz
	[84.242.120.26])
	(using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits))
	(No client certificate requested)
	by elsa.codelab.cz (Postfix) with ESMTPSA id B363A28423;
	Sun,  6 May 2012 10:46:32 +0200 (CEST)
Message-ID: <4FA63A68.3050607@quip.cz>
Date: Sun, 06 May 2012 10:46:32 +0200
From: Miroslav Lachman <000.fbsd@quip.cz>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US;
	rv:1.9.1.19) Gecko/20110420 Lightning/1.0b1 SeaMonkey/2.0.14
MIME-Version: 1.0
To: Chris <skvortsov42@gmail.com>
References: <CA+Znuqw8YpOD3fV6+eoGLqH6J+pmafaw=c_iaMUNRK7TUe39+w@mail.gmail.com>
In-Reply-To: <CA+Znuqw8YpOD3fV6+eoGLqH6J+pmafaw=c_iaMUNRK7TUe39+w@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS 4K drive overhead
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 06 May 2012 08:46:41 -0000

Chris wrote:
> Hi all,
>
> I'm planning on making a raidz2 with 6 2 TB drives - all 4K sectors,
> all reporting as 512 bytes. I've been reading some disturbing things
> about ZFS when used on 4K drives. In this discussion
> (http://mail.opensolaris.org/pipermail/zfs-discuss/2011-October/049959.html),
> Jim Klimov pointed out that when ZFS is used with ashift=12, the
> metadata overhead for a filesystem with a lot of small files can reach
> 100% (http://mail.opensolaris.org/pipermail/zfs-discuss/2011-October/049960.html)!
> That seems pretty bad to me. My questions are:
>
> Does anyone on this list have experience using ZFS on 4K drives with
> ashift=12? Is the overhead per file, such that having a relatively
> large average filesize, say, 19 MB, would render it insignificant? Or
> would the overhead be large regardless?

Average size of 19MB is much more larger than 4k (metadata), the 
overhead will be not so high as with really small files (files with size 
of few kB).

> What is the speed penalty for using ashift=9 on the array? Is the
> safety of the data on the array an issue  (due to how ZFS can't write
> to a 512 byte sector but it's coded with the assumption that it can
> thus making it no longer strictly copy-on-write)? Does anyone have any
> experience with ashift=9 arrays on 4K drives?

Even if the overhead will be larger, the speed penalty is much higher. 
You should read about it in some post on this blog:

http://blog.des.no/search/label/freebsd

There are various articles with banchmarks of 4k sectors drives and some 
of them are almost useless with unaligned writes. So I strongly 
recommend you to use 4k (ashift=12).

Use ashift=9 only if performance doesn't metter and you are concerned 
only on available space.

Miroslav Lachman

From owner-freebsd-fs@FreeBSD.ORG  Sun May  6 11:59:56 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 7A422106566B;
	Sun,  6 May 2012 11:59:56 +0000 (UTC)
	(envelope-from hackish@gmail.com)
Received: from mail-ob0-f182.google.com (mail-ob0-f182.google.com
	[209.85.214.182])
	by mx1.freebsd.org (Postfix) with ESMTP id 3236F8FC08;
	Sun,  6 May 2012 11:59:56 +0000 (UTC)
Received: by obcni5 with SMTP id ni5so8963270obc.13
	for <multiple recipients>; Sun, 06 May 2012 04:59:55 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
	h=mime-version:in-reply-to:references:date:message-id:subject:from:to
	:cc:content-type;
	bh=33bIbskxxD4Bsjv70Y9Cxc7M30hrvJKprD5mq6XmxeM=;
	b=bdN8V046l8p87uz+9nE/vcyfzQy/EO2Ogjmo8sp3umbHTLps9ZlNNHZlJSxFu0xQET
	rfeVnZx1dVHceSovColQIWfZIZzSG6g94YzY03sRCBSko4Ek11LqwjYiw8pPpGIrrsjl
	kLkHyyY+Rz7ao2dcleZmGRAirnuqZBAFq5Q95kaUFMduiUOws2n4iW5Fd9I0t3mPdbRS
	ruw/eyCqa+Ec/hartuWlDZ9AkEIgCR046LM3NKXX6MeMgR042Bddj/1FvP7Pgh+pd9Ny
	SwbUhju/2RTR/ZEVU6oqR7sAEVcpNtIfu1mN3EZBNif81HvpdwViyotMOA5M+2GdkJvT
	XEIQ==
MIME-Version: 1.0
Received: by 10.182.152.72 with SMTP id uw8mr16610179obb.73.1336305595858;
	Sun, 06 May 2012 04:59:55 -0700 (PDT)
Received: by 10.182.27.130 with HTTP; Sun, 6 May 2012 04:59:55 -0700 (PDT)
In-Reply-To: <CAFqOu6gz+Fd-NvPivMz3nfeGCYz0a563yNBOpmsAyHZS_TQybQ@mail.gmail.com>
References: <CAPUouH3zgnGdzbe=0x4M32_1D-9J-E=_y-BP1zhyu-axBxsjwA@mail.gmail.com>
	<CAFqOu6gz+Fd-NvPivMz3nfeGCYz0a563yNBOpmsAyHZS_TQybQ@mail.gmail.com>
Date: Sun, 6 May 2012 07:59:55 -0400
Message-ID: <CAPUouH3EUhrochyyWm1J6ZLS8-q-60mbWkNnOnbvmTuiVvEbDg@mail.gmail.com>
From: Michael Richards <hackish@gmail.com>
To: Artem Belevich <art@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS Kernel Panics with 32 and 64 bit versions of 8.3 and 9.0
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 06 May 2012 11:59:56 -0000

>> Suspecting a hardware issue all the hardware was replaced.
>
> Bad memory could indeed do that.

Indeed the memory was the bad hardware.

>
>> I had some difficulty trying to figure out how to mount my old ZFS
>> partition but eventually did so.
> ...
>> zpool import -f -R /altroot 10433152746165646153 olddata
>> panics the kernel. Similar panic as seen in all the other kernel versions.
>
>
>> Gives a bit more info about things I've tried. Whatever it is seems to
>> affect a wide variety of kernels.
>
> Kernel is just a messenger here. The root cause is that while ZFS does
> go an extra mile or two in order to ensure data consistency, there's
> only so much it can do if RAM is bad. Once that kind of problem
> happened, it may leave the pool in a state that ZFS will not be able
> to deal with out of the box.

I believe that the kernel should be able to handle every type of
non-hardware corruption without a panic. At present I'm in the process
of running
dd if=/dev/ad16s1g > zfsimage.dat
with hopes I can send that file home and play with it locally. I'm not
a kernel hacker but I'll see what I can do. Based on the backtrace I
suspect it is happening during scrub:
...
trap_fatal
trap_pfault
calltrap
zio_vdev_io_start
zio_execute
dsl_scan_scrub_cb

> First of all -- make a copy of your pool, if it's feasible.
> Probability of screwing it up even more is rather high.

I assume running the dd on that slice will do this. I think I read
somewhere that you can specify a file instead of a block device when
loading a zfs filesystem.

I'll see what I can do so the problem itself can be fixed.

-Michael

From owner-freebsd-fs@FreeBSD.ORG  Sun May  6 12:38:26 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 412881065672
	for <freebsd-fs@freebsd.org>; Sun,  6 May 2012 12:38:26 +0000 (UTC)
	(envelope-from simon@optinet.com)
Received: from cobra.acceleratedweb.net (cobra-gw.acceleratedweb.net
	[207.99.79.37]) by mx1.freebsd.org (Postfix) with SMTP id E1A9B8FC17
	for <freebsd-fs@freebsd.org>; Sun,  6 May 2012 12:38:25 +0000 (UTC)
Received: (qmail 12043 invoked by uid 110); 6 May 2012 12:38:18 -0000
Received: from ool-4571afe7.dyn.optonline.net (HELO desktop1)
	(simon@optinet.com@69.113.175.231)
	by cobra.acceleratedweb.net with SMTP; 6 May 2012 12:38:18 -0000
From: "Simon" <simon@optinet.com>
To: "Artem Belevich" <art@freebsd.org>
Date: Sun, 06 May 2012 08:38:18 -0400
Priority: Normal
X-Mailer: PMMail 2000 Professional (2.20.2717) For Windows 2000 (5.1.2600;3)
In-Reply-To: <CAFqOu6gz+Fd-NvPivMz3nfeGCYz0a563yNBOpmsAyHZS_TQybQ@mail.gmail.com>
MIME-Version: 1.0
Message-Id: <20120506123826.412881065672@hub.freebsd.org>
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Cc: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject: Re: ZFS Kernel Panics with 32 and 64 bit versions of 8.3 and 9.0
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 06 May 2012 12:38:26 -0000


Are you suggesting that if a disk sector goes bad or memory corrupts few blocks
of data, the entire zpool is gonna go bust? can the same occur with a ZRAID?
I thought the ZFS was designed to overcome all these issues to begin with. Is
this not the case?

-Simon

On Sat, 5 May 2012 23:11:01 -0700, Artem Belevich wrote:

>I believe I've ran into this issue couple three times. In all cases
>the culprit was memory corruption. If were to guess, corruption
>damaged critical data *before* ZFS calculated checksum and was able to
>write it to disk. Once that happened, kernel would panic every time
>once the pool was in use. Crashes could happen as soon as zpool import
>or as late as after few days of uptime or next scheduled scrub. I even
>tried importing/scrubbing the pool on opensolaris without much success
>-- while solaris didn't crash outright, it failed to import the pool
>with internal assertion.

>On Sat, May 5, 2012 at 7:13 PM, Michael Richards <hackish@gmail.com> wrote:
>> Originally I had an 8.1 server setup on a 32bit kernel. The OS is on a
>> UFS filesystem and (it's a mail server) the business part of the
>> operation is on ZFS.
>>
>> One day it crashed with an odd kernel panic. I assumed it was a memory
>> issue so I had more RAM installed. I tried to get a PAE kernel working
>> to use this extra ram but it was crashing every few hours.
>>
>> Suspecting a hardware issue all the hardware was replaced.

>Bad memory could indeed do that.

>> I had some difficulty trying to figure out how to mount my old ZFS
>> partition but eventually did so.
>...
>> zpool import -f -R /altroot 10433152746165646153 olddata
>> panics the kernel. Similar panic as seen in all the other kernel versions.


>> Gives a bit more info about things I've tried. Whatever it is seems to
>> affect a wide variety of kernels.

>Kernel is just a messenger here. The root cause is that while ZFS does
>go an extra mile or two in order to ensure data consistency, there's
>only so much it can do if RAM is bad. Once that kind of problem
>happened, it may leave the pool in a state that ZFS will not be able
>to deal with out of the box.

>Not everything may be lost, though.

>First of all -- make a copy of your pool, if it's feasible.
>Probability of screwing it up even more is rather high.

>ZFS internally keeps large number of uberblocks. Each uberblock is
>sort of a periodic checkpoint of the pool state after ZFS writes next
>transaction group (every 10-40 sec, depending on vfs.zfs.txg.timeout
>sysctl, more often if there are a lot of ongoing write activity).
>Basically you need to destroy the most recent uberblock to manually
>roll-back your ZFS pool. Hopefully, you'll only need to nuke few most
>recent ones to restore the pool to the point before corruption ruined
>it.

>Now, ZFS keeps multiple copies of uberblocks. You will need to nuke
>*all* instances of the most recent uberblock in order to roll pool
>state backwards.

>Solaris internals site seems to have a script to do that now (I wish I
>knew about it back when I needed it):
>http://www.solarisinternals.com/wiki/index.php/ZFS_forensics_scrollback_script

>Good luck!

>--Artem
>_______________________________________________
>freebsd-fs@freebsd.org mailing list
>http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"


From owner-freebsd-fs@FreeBSD.ORG  Sun May  6 13:55:32 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 9291C1065740
	for <freebsd-fs@freebsd.org>; Sun,  6 May 2012 13:55:32 +0000 (UTC)
	(envelope-from shuey@fmepnet.org)
Received: from mail-vb0-f54.google.com (mail-vb0-f54.google.com
	[209.85.212.54])
	by mx1.freebsd.org (Postfix) with ESMTP id 48F048FC18
	for <freebsd-fs@freebsd.org>; Sun,  6 May 2012 13:55:32 +0000 (UTC)
Received: by vbmv11 with SMTP id v11so3869218vbm.13
	for <freebsd-fs@freebsd.org>; Sun, 06 May 2012 06:55:31 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=google.com; s=20120113;
	h=mime-version:x-originating-ip:in-reply-to:references:date
	:message-id:subject:from:to:cc:content-type
	:content-transfer-encoding:x-gm-message-state;
	bh=FVOpXmPnkURp/Cne2dGFz8p3C73smi4qbki+Nju5OPI=;
	b=iT7k1Lp0KWyWRQ7K5sH9YnhobTUmpfUByPS9w/mpyxu83iO8z7Pw/vvQNxtbiikJJG
	GMAGHMvzUXhEtBS7hwSAiC5tbxfuvAKOvY9A1N+ANQrJyP9Cg+XNInm/1trrJWk0yZUC
	KA9jQn1A/tf+wLqcgvqcMzxqW8MMEXgSwPOATKsliqtTFrWiHKyJQsDl1rNsPXjH1CDN
	ZFCGLPR5c48pxlP5EvM6p4q1qsLjzDRgop6yaprP88YC222Nrqbpf14E0DPUXqKrbXdv
	Z2inRSRBMLwXaR3gqzFxsz9H+gSeeQ9Qz94P7yr84ry2SAz5xq3yYy4SqDax+H4D42L8
	ke3w==
MIME-Version: 1.0
Received: by 10.52.21.51 with SMTP id s19mr5219668vde.35.1336312173697; Sun,
	06 May 2012 06:49:33 -0700 (PDT)
Received: by 10.220.210.71 with HTTP; Sun, 6 May 2012 06:49:33 -0700 (PDT)
X-Originating-IP: [98.223.59.225]
In-Reply-To: <4FA63A68.3050607@quip.cz>
References: <CA+Znuqw8YpOD3fV6+eoGLqH6J+pmafaw=c_iaMUNRK7TUe39+w@mail.gmail.com>
	<4FA63A68.3050607@quip.cz>
Date: Sun, 6 May 2012 09:49:33 -0400
Message-ID: <CAELRr5kZohBbKMBoScezNeYBw5mV1k9XcLN+74W2qavUnwvYUQ@mail.gmail.com>
From: Michael Shuey <shuey@fmepnet.org>
To: Miroslav Lachman <000.fbsd@quip.cz>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
X-Gm-Message-State: ALoCoQmxFXk2McffXQE61w9q0U1c41259WKc1MZCivoSOHEUrZWk398Rah2x7pKkgfB46awd0iwp
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS 4K drive overhead
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 06 May 2012 13:55:32 -0000

A couple months back, I finished rebuilding my spools to use
ashift=3D12, after trying a 4k drive in a pool with ashift=3D9.  If you
try a 4k drive on an ashift=3D9 pool, you're going to have a bad time.

Performance for occasional IO (particularly streaming) isn't too bad
with mis-aligned sectors.  However, resilvering time is MUCH, MUCH,
MUCH higher - I saw estimates for resilver completion go up by over an
order of magnitude, and pool performance become nearly unusable while
a resilver was in operation.

ZFS will dynamically adjust block size for a file, between the
smallest block size the media supports and 128k or so (IIRC).  That
means that even if you align a partition on your 4k disk, or use the
raw disk itself (so ZFS starts on an aligned sector), after the first
small file is written you'll be doing un-aligned IOs.  Resilvering a
1.5 TB drive was estimated at over 230 hours for me; it was actually
less time to abort and rebuild the server from backups.

Given the propensity of 4k drives on the market now, and the
likelihood that they'll be the only product available in the future,
I'd highly recommend using ashift=3D12 on any new zpools.  It's time to
stop using ashift=3D9.


On Sun, May 6, 2012 at 4:46 AM, Miroslav Lachman <000.fbsd@quip.cz> wrote:
> Chris wrote:
>>
>> Hi all,
>>
>> I'm planning on making a raidz2 with 6 2 TB drives - all 4K sectors,
>> all reporting as 512 bytes. I've been reading some disturbing things
>> about ZFS when used on 4K drives. In this discussion
>>
>> (http://mail.opensolaris.org/pipermail/zfs-discuss/2011-October/049959.h=
tml),
>> Jim Klimov pointed out that when ZFS is used with ashift=3D12, the
>> metadata overhead for a filesystem with a lot of small files can reach
>> 100%
>> (http://mail.opensolaris.org/pipermail/zfs-discuss/2011-October/049960.h=
tml)!
>> That seems pretty bad to me. My questions are:
>>
>> Does anyone on this list have experience using ZFS on 4K drives with
>> ashift=3D12? Is the overhead per file, such that having a relatively
>> large average filesize, say, 19 MB, would render it insignificant? Or
>> would the overhead be large regardless?
>
>
> Average size of 19MB is much more larger than 4k (metadata), the overhead
> will be not so high as with really small files (files with size of few kB=
).
>
>
>> What is the speed penalty for using ashift=3D9 on the array? Is the
>> safety of the data on the array an issue =A0(due to how ZFS can't write
>> to a 512 byte sector but it's coded with the assumption that it can
>> thus making it no longer strictly copy-on-write)? Does anyone have any
>> experience with ashift=3D9 arrays on 4K drives?
>
>
> Even if the overhead will be larger, the speed penalty is much higher. Yo=
u
> should read about it in some post on this blog:
>
> http://blog.des.no/search/label/freebsd
>
> There are various articles with banchmarks of 4k sectors drives and some =
of
> them are almost useless with unaligned writes. So I strongly recommend yo=
u
> to use 4k (ashift=3D12).
>
> Use ashift=3D9 only if performance doesn't metter and you are concerned o=
nly
> on available space.
>
> Miroslav Lachman
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

From owner-freebsd-fs@FreeBSD.ORG  Sun May  6 15:14:13 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id EFE19106566B
	for <freebsd-fs@freebsd.org>; Sun,  6 May 2012 15:14:13 +0000 (UTC)
	(envelope-from bfriesen@simple.dallas.tx.us)
Received: from blade.simplesystems.org (blade.simplesystems.org [65.66.246.74])
	by mx1.freebsd.org (Postfix) with ESMTP id B28868FC08
	for <freebsd-fs@freebsd.org>; Sun,  6 May 2012 15:14:13 +0000 (UTC)
Received: from freddy.simplesystems.org (freddy.simplesystems.org
	[65.66.246.65])
	by blade.simplesystems.org (8.14.4+Sun/8.14.4) with ESMTP id
	q46Exd1s011128; Sun, 6 May 2012 09:59:40 -0500 (CDT)
Date: Sun, 6 May 2012 09:59:39 -0500 (CDT)
From: Bob Friesenhahn <bfriesen@simple.dallas.tx.us>
X-X-Sender: bfriesen@freddy.simplesystems.org
To: Simon <simon@optinet.com>
In-Reply-To: <20120506123826.412881065672@hub.freebsd.org>
Message-ID: <alpine.GSO.2.01.1205060955450.1678@freddy.simplesystems.org>
References: <20120506123826.412881065672@hub.freebsd.org>
User-Agent: Alpine 2.01 (GSO 1266 2009-07-14)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2
	(blade.simplesystems.org [65.66.246.90]);
	Sun, 06 May 2012 09:59:40 -0500 (CDT)
Cc: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject: Re: ZFS Kernel Panics with 32 and 64 bit versions of 8.3 and 9.0
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 06 May 2012 15:14:14 -0000

On Sun, 6 May 2012, Simon wrote:

>
> Are you suggesting that if a disk sector goes bad or memory corrupts few blocks
> of data, the entire zpool is gonna go bust? can the same occur with a ZRAID?
> I thought the ZFS was designed to overcome all these issues to begin with. Is
> this not the case?

ZFS is designed to work with failing disks, but not failing memory. 
It is recommended to use only systems with ECC memory.

The OS itself (any OS!) is succeptible to crash/corruption due to 
failing memory but without zfs's checksums, you might not be aware of 
such corruptions or the crash might be more delayed.

Bob
-- 
Bob Friesenhahn
bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

From owner-freebsd-fs@FreeBSD.ORG  Sun May  6 15:14:43 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 82E48106566B
	for <freebsd-fs@freebsd.org>; Sun,  6 May 2012 15:14:43 +0000 (UTC)
	(envelope-from simon@optinet.com)
Received: from cobra.acceleratedweb.net (cobra-gw.acceleratedweb.net
	[207.99.79.37]) by mx1.freebsd.org (Postfix) with SMTP id 2DA6E8FC08
	for <freebsd-fs@freebsd.org>; Sun,  6 May 2012 15:14:42 +0000 (UTC)
Received: (qmail 20243 invoked by uid 110); 6 May 2012 15:14:41 -0000
Received: from ool-4571afe7.dyn.optonline.net (HELO desktop1)
	(simon@optinet.com@69.113.175.231)
	by cobra.acceleratedweb.net with SMTP; 6 May 2012 15:14:41 -0000
From: "Simon" <simon@optinet.com>
To: "Bob Friesenhahn" <bfriesen@simple.dallas.tx.us>
Date: Sun, 06 May 2012 11:14:42 -0400
Priority: Normal
X-Mailer: PMMail 2000 Professional (2.20.2717) For Windows 2000 (5.1.2600;3)
In-Reply-To: <alpine.GSO.2.01.1205060955450.1678@freddy.simplesystems.org>
MIME-Version: 1.0
Message-Id: <20120506151443.82E48106566B@hub.freebsd.org>
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Cc: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject: Re: ZFS Kernel Panics with 32 and 64 bit versions of 8.3 and 9.0
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 06 May 2012 15:14:43 -0000


So if you have a 50TB ZFS filesystem and your memory goes bad, even if ECC,
your entire 50TB is gonna go bunkers? disks fail, but memory doesn't? CPUs
don't fail?

There are many things in a server that can fail and cause corruption, but that
shouldn't take down entire zpool. I'm okay with a few missing files ending up
in lost+found, but entire filesystem? That renders the entire thing useless if you
ask me.

-Simon

On Sun, 6 May 2012 09:59:39 -0500 (CDT), Bob Friesenhahn wrote:

>On Sun, 6 May 2012, Simon wrote:

>>
>> Are you suggesting that if a disk sector goes bad or memory corrupts few blocks
>> of data, the entire zpool is gonna go bust? can the same occur with a ZRAID?
>> I thought the ZFS was designed to overcome all these issues to begin with. Is
>> this not the case?

>ZFS is designed to work with failing disks, but not failing memory. 
>It is recommended to use only systems with ECC memory.

>The OS itself (any OS!) is succeptible to crash/corruption due to 
>failing memory but without zfs's checksums, you might not be aware of 
>such corruptions or the crash might be more delayed.

>Bob
>-- 
>Bob Friesenhahn
>bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
>GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/


From owner-freebsd-fs@FreeBSD.ORG  Sun May  6 15:43:22 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 0944F106566B
	for <freebsd-fs@freebsd.org>; Sun,  6 May 2012 15:43:22 +0000 (UTC)
	(envelope-from bfriesen@simple.dallas.tx.us)
Received: from blade.simplesystems.org (blade.simplesystems.org [65.66.246.74])
	by mx1.freebsd.org (Postfix) with ESMTP id BDE528FC08
	for <freebsd-fs@freebsd.org>; Sun,  6 May 2012 15:43:21 +0000 (UTC)
Received: from freddy.simplesystems.org (freddy.simplesystems.org
	[65.66.246.65])
	by blade.simplesystems.org (8.14.4+Sun/8.14.4) with ESMTP id
	q46FhLXM011346; Sun, 6 May 2012 10:43:21 -0500 (CDT)
Date: Sun, 6 May 2012 10:43:21 -0500 (CDT)
From: Bob Friesenhahn <bfriesen@simple.dallas.tx.us>
X-X-Sender: bfriesen@freddy.simplesystems.org
To: Simon <simon@optinet.com>
In-Reply-To: <201205061521.q46FLML1011267@blade.simplesystems.org>
Message-ID: <alpine.GSO.2.01.1205061034260.1678@freddy.simplesystems.org>
References: <201205061521.q46FLML1011267@blade.simplesystems.org>
User-Agent: Alpine 2.01 (GSO 1266 2009-07-14)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2
	(blade.simplesystems.org [65.66.246.90]);
	Sun, 06 May 2012 10:43:21 -0500 (CDT)
Cc: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject: Re: ZFS Kernel Panics with 32 and 64 bit versions of 8.3 and 9.0
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 06 May 2012 15:43:22 -0000

On Sun, 6 May 2012, Simon wrote:

> 
> So if you have a 50TB ZFS filesystem and your memory goes bad, even if ECC,
> your entire 50TB is gonna go bunkers? disks fail, but memory doesn't? CPUs
> don't fail?
> 
> There are many things in a server that can fail and cause corruption, but that
> shouldn't take down entire zpool. I'm okay with a few missing files ending up
> in lost+found, but entire filesystem? That renders the entire thing useless if you
> ask me.

By your definition, computers would be useless. :-)

There is no telling what might happen if a program (including kernel 
code) was to execute wrong instructions or read wrong data.  This is 
not specific to zfs.

Zfs caches large amounts of data in its in-memory ARC cache, which is 
succeptible to in-memory corruption.  If it tried to detect and 
prevent memory corruption, it would be extremely slow and likely not 
work at all if there were actual failures.

Part of the metadata structure of the pool needs to be cached in RAM 
for performance reasons.

On the zfs-discuss list we sometimes hear of zfs checksum errors which 
are due to memory errors rather than disk errors.

Zfs can be used without ECC memory, but pool reliability will suffer.

Bob
-- 
Bob Friesenhahn
bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

From owner-freebsd-fs@FreeBSD.ORG  Sun May  6 16:10:38 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 0142B106566C
	for <freebsd-fs@freebsd.org>; Sun,  6 May 2012 16:10:38 +0000 (UTC)
	(envelope-from hackish@gmail.com)
Received: from mail-ob0-f182.google.com (mail-ob0-f182.google.com
	[209.85.214.182])
	by mx1.freebsd.org (Postfix) with ESMTP id B9AE88FC08
	for <freebsd-fs@freebsd.org>; Sun,  6 May 2012 16:10:37 +0000 (UTC)
Received: by obcni5 with SMTP id ni5so9323252obc.13
	for <freebsd-fs@freebsd.org>; Sun, 06 May 2012 09:10:37 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
	h=mime-version:in-reply-to:references:date:message-id:subject:from:to
	:cc:content-type;
	bh=8y9KXQV4GF7qHdAM7JvLjffQyrDkEru4ESwm53v57tM=;
	b=Bq3tnaqrOytkqU2Ro7yquYVddufuVVDLVd+E1GEEV3+TF5N94HFk6/fL0juFHXFVEc
	isprUDv7o8XTpe8qzd8X+fCmft1iMGvisg7weLnVLXSkEkTxT9R/FOAapTdjWnnMr+ZS
	8bxKRobiw42GCm2zouYJvyxX/G2/kwS8c8idTdcaXqWrRzII94Vzshnj4IOiyehX3Etx
	A/cLV5BeVkp740bEL1FmcUiQ0c5czUPLR7JxMAniy8CyhIrpgRDMwbZGbAEKhII9kFIT
	21szhAX+ZxXjVjJszLo1zx9cOCr8Lg3x6eNfkuRKkkQJPMgAd6Ox5EWMAti2WkMKw/sK
	Vp1w==
MIME-Version: 1.0
Received: by 10.182.167.101 with SMTP id zn5mr6197992obb.13.1336320637350;
	Sun, 06 May 2012 09:10:37 -0700 (PDT)
Received: by 10.182.27.130 with HTTP; Sun, 6 May 2012 09:10:37 -0700 (PDT)
In-Reply-To: <alpine.GSO.2.01.1205060955450.1678@freddy.simplesystems.org>
References: <20120506123826.412881065672@hub.freebsd.org>
	<alpine.GSO.2.01.1205060955450.1678@freddy.simplesystems.org>
Date: Sun, 6 May 2012 12:10:37 -0400
Message-ID: <CAPUouH2Ftt4ZDtanPmfTXa9+HRCkLCL=HcGqT9VgV+BrQcYE3A@mail.gmail.com>
From: Michael Richards <hackish@gmail.com>
To: Bob Friesenhahn <bfriesen@simple.dallas.tx.us>
Content-Type: text/plain; charset=ISO-8859-1
Cc: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject: Re: ZFS Kernel Panics with 32 and 64 bit versions of 8.3 and 9.0
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 06 May 2012 16:10:38 -0000

On Sun, May 6, 2012 at 10:59 AM, Bob Friesenhahn
<bfriesen@simple.dallas.tx.us> wrote:
> On Sun, 6 May 2012, Simon wrote:
>
>>
>> Are you suggesting that if a disk sector goes bad or memory corrupts few
>> blocks
>> of data, the entire zpool is gonna go bust? can the same occur with a
>> ZRAID?
>> I thought the ZFS was designed to overcome all these issues to begin with.
>> Is
>> this not the case?
>
>
> ZFS is designed to work with failing disks, but not failing memory. It is
> recommended to use only systems with ECC memory.
>
> The OS itself (any OS!) is succeptible to crash/corruption due to failing
> memory but without zfs's checksums, you might not be aware of such
> corruptions or the crash might be more delayed.

I can accept the fact that some filesystem corruption may have
happened from the bad RAM. The issue now is recovering it. All the
hardware has been replaced but I cannot import the ZFS pool without
causing a kernel panic and that is the the problem here. To me it
matters not if the corruption occurred from RAM or the hard disk - I
don't think it's a good idea to blindly trust any filesystem data. At
minimum fail to import the pool but don't bring the entire system to a
halt. This isn't even a system drive - it's purely data.

From owner-freebsd-fs@FreeBSD.ORG  Sun May  6 17:50:08 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A565E106566B
	for <freebsd-fs@freebsd.org>; Sun,  6 May 2012 17:50:08 +0000 (UTC)
	(envelope-from bfriesen@simple.dallas.tx.us)
Received: from blade.simplesystems.org (blade.simplesystems.org [65.66.246.74])
	by mx1.freebsd.org (Postfix) with ESMTP id 5C8638FC0C
	for <freebsd-fs@freebsd.org>; Sun,  6 May 2012 17:50:08 +0000 (UTC)
Received: from freddy.simplesystems.org (freddy.simplesystems.org
	[65.66.246.65])
	by blade.simplesystems.org (8.14.4+Sun/8.14.4) with ESMTP id
	q46Ho7Fm011767; Sun, 6 May 2012 12:50:07 -0500 (CDT)
Date: Sun, 6 May 2012 12:50:07 -0500 (CDT)
From: Bob Friesenhahn <bfriesen@simple.dallas.tx.us>
X-X-Sender: bfriesen@freddy.simplesystems.org
To: Michael Richards <hackish@gmail.com>
In-Reply-To: <CAPUouH2Ftt4ZDtanPmfTXa9+HRCkLCL=HcGqT9VgV+BrQcYE3A@mail.gmail.com>
Message-ID: <alpine.GSO.2.01.1205061248070.1678@freddy.simplesystems.org>
References: <20120506123826.412881065672@hub.freebsd.org>
	<alpine.GSO.2.01.1205060955450.1678@freddy.simplesystems.org>
	<CAPUouH2Ftt4ZDtanPmfTXa9+HRCkLCL=HcGqT9VgV+BrQcYE3A@mail.gmail.com>
User-Agent: Alpine 2.01 (GSO 1266 2009-07-14)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2
	(blade.simplesystems.org [65.66.246.90]);
	Sun, 06 May 2012 12:50:07 -0500 (CDT)
Cc: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject: Re: ZFS Kernel Panics with 32 and 64 bit versions of 8.3 and 9.0
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 06 May 2012 17:50:08 -0000

On Sun, 6 May 2012, Michael Richards wrote:
>
> I can accept the fact that some filesystem corruption may have
> happened from the bad RAM. The issue now is recovering it. All the
> hardware has been replaced but I cannot import the ZFS pool without
> causing a kernel panic and that is the the problem here. To me it
> matters not if the corruption occurred from RAM or the hard disk - I
> don't think it's a good idea to blindly trust any filesystem data. At
> minimum fail to import the pool but don't bring the entire system to a
> halt. This isn't even a system drive - it's purely data.

These are sentiments that I can agree with.  If the import can be so 
dangerous, it seems that there should be a way to import the pool in a 
user-mode (outside if kernel space) so that issues can be fixed 
without panicing the kernel.

Bob
-- 
Bob Friesenhahn
bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

From owner-freebsd-fs@FreeBSD.ORG  Sun May  6 20:59:43 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 926D8106564A
	for <freebsd-fs@freebsd.org>; Sun,  6 May 2012 20:59:43 +0000 (UTC)
	(envelope-from simon@optinet.com)
Received: from cobra.acceleratedweb.net (cobra-gw.acceleratedweb.net
	[207.99.79.37]) by mx1.freebsd.org (Postfix) with SMTP id 3554C8FC12
	for <freebsd-fs@freebsd.org>; Sun,  6 May 2012 20:59:42 +0000 (UTC)
Received: (qmail 42031 invoked by uid 110); 6 May 2012 20:59:41 -0000
Received: from ool-4571afe7.dyn.optonline.net (HELO desktop1)
	(simon@optinet.com@69.113.175.231)
	by cobra.acceleratedweb.net with SMTP; 6 May 2012 20:59:41 -0000
From: "Simon" <simon@optinet.com>
To: "Bob Friesenhahn" <bfriesen@simple.dallas.tx.us>
Date: Sun, 06 May 2012 16:59:41 -0400
Priority: Normal
X-Mailer: PMMail 2000 Professional (2.20.2717) For Windows 2000 (5.1.2600;3)
In-Reply-To: <alpine.GSO.2.01.1205060955450.1678@freddy.simplesystems.org>
MIME-Version: 1.0
Message-Id: <20120506205943.926D8106564A@hub.freebsd.org>
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Cc: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject: Re: ZFS Kernel Panics with 32 and 64 bit versions of 8.3 and 9.0
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 06 May 2012 20:59:43 -0000


I fully understand the concept behind ECC memory and why it is important
to use it in a server environment, especially with a filesystem like ZFS.
However, I think my entire point was missed. It appears there are simply too
many documented cases where entire pool becomes inaccessible due to
limited corruption. Most of the data is still intact but there is no way to recover
it. And there is no way to fix the limited inconsistencies which caused the entire
pool to become unimportable to begin with. Again, I'm okay with few corrupted
or missing files. I'm not okay with entire pool becoming inaccessible due to
limited corruption due to faulty memory or otherwise. There needs to be a way
to import a corrupted zpool and at very least to be able to read the remaining
intact data.

-Simon


On Sun, 6 May 2012 09:59:39 -0500 (CDT), Bob Friesenhahn wrote:

>On Sun, 6 May 2012, Simon wrote:

>>
>> Are you suggesting that if a disk sector goes bad or memory corrupts few blocks
>> of data, the entire zpool is gonna go bust? can the same occur with a ZRAID?
>> I thought the ZFS was designed to overcome all these issues to begin with. Is
>> this not the case?

>ZFS is designed to work with failing disks, but not failing memory. 
>It is recommended to use only systems with ECC memory.

>The OS itself (any OS!) is succeptible to crash/corruption due to 
>failing memory but without zfs's checksums, you might not be aware of 
>such corruptions or the crash might be more delayed.

>Bob
>-- 
>Bob Friesenhahn
>bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
>GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/


From owner-freebsd-fs@FreeBSD.ORG  Mon May  7 11:07:11 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id D58E31065678
	for <freebsd-fs@FreeBSD.org>; Mon,  7 May 2012 11:07:11 +0000 (UTC)
	(envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id BEE118FC15
	for <freebsd-fs@FreeBSD.org>; Mon,  7 May 2012 11:07:11 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q47B7BHm072358
	for <freebsd-fs@FreeBSD.org>; Mon, 7 May 2012 11:07:11 GMT
	(envelope-from owner-bugmaster@FreeBSD.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q47B7BQN072353
	for freebsd-fs@FreeBSD.org; Mon, 7 May 2012 11:07:11 GMT
	(envelope-from owner-bugmaster@FreeBSD.org)
Date: Mon, 7 May 2012 11:07:11 GMT
Message-Id: <201205071107.q47B7BQN072353@freefall.freebsd.org>
X-Authentication-Warning: freefall.freebsd.org: gnats set sender to
	owner-bugmaster@FreeBSD.org using -f
From: FreeBSD bugmaster <bugmaster@FreeBSD.org>
To: freebsd-fs@FreeBSD.org
Cc: 
Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 07 May 2012 11:07:11 -0000

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.


S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/167467  fs         [zfs][patch] improve zdb(8) manpage and help.
o kern/167447  fs         [zfs] [patch] patch to zfs rename -f to perform force 
o kern/167370  fs         [zfs][patch] Unnecessary break point on zfs_main.c.
o kern/167272  fs         [zfs] ZFS Disks reordering causes ZFS to pick the wron
o kern/167266  fs         [zfs] [nfs] ZFS + new NFS export (sharenfs) leads to N
o kern/167260  fs         [msdosfs] msdosfs disk was mounted the second time whe
o kern/167109  fs         [zfs] [panic] zfs diff kernel panic Fatal trap 9: gene
o kern/167105  fs         [nfs] mount_nfs can not handle source exports wiht mor
o kern/167067  fs         [zfs] [panic] ZFS panics the server
o kern/167066  fs         [zfs] ZVOLs not appearing in /dev/zvol
o kern/167065  fs         [zfs] boot fails when a spare is the boot disk
o kern/167048  fs         [nfs] [patch] RELEASE-9 crash when using ZFS+NULLFS+NF
o kern/166912  fs         [ufs] [panic] Panic after converting Softupdates to jo
o kern/166851  fs         [zfs] [hang] Copying directory from the mounted UFS di
o kern/166566  fs         [zfs] zfs split renders 2 disk (MBR based) mirror unbo
o kern/166477  fs         [nfs] NFS data corruption.
o kern/165950  fs         [ffs] SU+J and fsck problem
o kern/165923  fs         [nfs] Writing to NFS-backed mmapped files fails if flu
o kern/165521  fs         [zfs] [hang] livelock on 1 Gig of RAM with zfs when 31
o kern/165392  fs         Multiple mkdir/rmdir fails with errno 31
o kern/165087  fs         [unionfs] lock violation in unionfs
o kern/164472  fs         [ufs] fsck -B panics on particular data inconsistency
o kern/164370  fs         [zfs] zfs destroy for snapshot fails on i386 and sparc
o kern/164261  fs         [nullfs] [patch] fix panic with NFS served from NULLFS
o kern/164256  fs         [zfs] device entry for volume is not created after zfs
o kern/164184  fs         [ufs] [panic] Kernel panic with ufs_makeinode
o kern/163801  fs         [md] [request] allow mfsBSD legacy installed in 'swap'
o kern/163770  fs         [zfs] [hang] LOR between zfs&syncer + vnlru leading to
o kern/163501  fs         [nfs] NFS exporting a dir and a subdir in that dir to 
o kern/162944  fs         [coda] Coda file system module looks broken in 9.0
o kern/162860  fs         [zfs] Cannot share ZFS filesystem to hosts with a hyph
o kern/162751  fs         [zfs] [panic] kernel panics during file operations
o kern/162591  fs         [nullfs] cross-filesystem nullfs does not work as expe
o kern/162519  fs         [zfs] "zpool import" relies on buggy realpath() behavi
o kern/162362  fs         [snapshots] [panic] ufs with snapshot(s) panics when g
o kern/161968  fs         [zfs] [hang] renaming snapshot with -r including a zvo
o kern/161897  fs         [zfs] [patch] zfs partition probing causing long delay
o kern/161864  fs         [ufs] removing journaling from UFS partition fails on 
o bin/161807   fs         [patch] add option for explicitly specifying metadata 
o kern/161579  fs         [smbfs] FreeBSD sometimes panics when an smb share is 
o kern/161533  fs         [zfs] [panic] zfs receive panic: system ioctl returnin
o kern/161511  fs         [unionfs] Filesystem deadlocks when using multiple uni
o kern/161438  fs         [zfs] [panic] recursed on non-recursive spa_namespace_
o kern/161424  fs         [nullfs] __getcwd() calls fail when used on nullfs mou
o kern/161280  fs         [zfs] Stack overflow in gptzfsboot
o kern/161205  fs         [nfs] [pfsync] [regression] [build] Bug report freebsd
o kern/161169  fs         [zfs] [panic] ZFS causes kernel panic in dbuf_dirty
o kern/161112  fs         [ufs] [lor] filesystem LOR in FreeBSD 9.0-BETA3
o kern/160893  fs         [zfs] [panic] 9.0-BETA2 kernel panic
o kern/160860  fs         [ufs] Random UFS root filesystem corruption with SU+J 
o kern/160801  fs         [zfs] zfsboot on 8.2-RELEASE fails to boot from root-o
o kern/160790  fs         [fusefs] [panic] VPUTX: negative ref count with FUSE
o kern/160777  fs         [zfs] [hang] RAID-Z3 causes fatal hang upon scrub/impo
o kern/160706  fs         [zfs] zfs bootloader fails when a non-root vdev exists
o kern/160591  fs         [zfs] Fail to boot on zfs root with degraded raidz2 [r
o kern/160410  fs         [smbfs] [hang] smbfs hangs when transferring large fil
o kern/160283  fs         [zfs] [patch] 'zfs list' does abort in make_dataset_ha
o kern/159930  fs         [ufs] [panic] kernel core
o kern/159402  fs         [zfs][loader] symlinks cause I/O errors
o kern/159357  fs         [zfs] ZFS MAXNAMELEN macro has confusing name (off-by-
o kern/159356  fs         [zfs] [patch] ZFS NAME_ERR_DISKLIKE check is Solaris-s
o kern/159351  fs         [nfs] [patch] - divide by zero in mountnfs()
o kern/159251  fs         [zfs] [request]: add FLETCHER4 as DEDUP hash option
o kern/159077  fs         [zfs] Can't cd .. with latest zfs version
o kern/159048  fs         [smbfs] smb mount corrupts large files
o kern/159045  fs         [zfs] [hang] ZFS scrub freezes system
o kern/158839  fs         [zfs] ZFS Bootloader Fails if there is a Dead Disk
o kern/158802  fs         amd(8) ICMP storm and unkillable process.
o kern/158231  fs         [nullfs] panic on unmounting nullfs mounted over ufs o
f kern/157929  fs         [nfs] NFS slow read
o kern/157399  fs         [zfs] trouble with: mdconfig force delete && zfs strip
o kern/157179  fs         [zfs] zfs/dbuf.c: panic: solaris assert: arc_buf_remov
o kern/156797  fs         [zfs] [panic] Double panic with FreeBSD 9-CURRENT and 
o kern/156781  fs         [zfs] zfs is losing the snapshot directory,
p kern/156545  fs         [ufs] mv could break UFS on SMP systems
o kern/156193  fs         [ufs] [hang] UFS snapshot hangs && deadlocks processes
o kern/156039  fs         [nullfs] [unionfs] nullfs + unionfs do not compose, re
o kern/155615  fs         [zfs] zfs v28 broken on sparc64 -current
o kern/155587  fs         [zfs] [panic] kernel panic with zfs
p kern/155411  fs         [regression] [8.2-release] [tmpfs]: mount: tmpfs : No 
o kern/155199  fs         [ext2fs] ext3fs mounted as ext2fs gives I/O errors
o bin/155104   fs         [zfs][patch] use /dev prefix by default when importing
o kern/154930  fs         [zfs] cannot delete/unlink file from full volume -> EN
o kern/154828  fs         [msdosfs] Unable to create directories on external USB
o kern/154491  fs         [smbfs] smb_co_lock: recursive lock for object 1
p kern/154228  fs         [md] md getting stuck in wdrain state
o kern/153996  fs         [zfs] zfs root mount error while kernel is not located
o kern/153753  fs         [zfs] ZFS v15 - grammatical error when attempting to u
o kern/153716  fs         [zfs] zpool scrub time remaining is incorrect
o kern/153695  fs         [patch] [zfs] Booting from zpool created on 4k-sector 
o kern/153680  fs         [xfs] 8.1 failing to mount XFS partitions
o kern/153520  fs         [zfs] Boot from GPT ZFS root on HP BL460c G1 unstable
o kern/153418  fs         [zfs] [panic] Kernel Panic occurred writing to zfs vol
o kern/153351  fs         [zfs] locking directories/files in ZFS
o bin/153258   fs         [patch][zfs] creating ZVOLs requires `refreservation' 
s kern/153173  fs         [zfs] booting from a gzip-compressed dataset doesn't w
o kern/153126  fs         [zfs] vdev failure, zpool=peegel type=vdev.too_small
o kern/152022  fs         [nfs] nfs service hangs with linux client [regression]
o kern/151942  fs         [zfs] panic during ls(1) zfs snapshot directory
o kern/151905  fs         [zfs] page fault under load in /sbin/zfs
o bin/151713   fs         [patch] Bug in growfs(8) with respect to 32-bit overfl
o kern/151648  fs         [zfs] disk wait bug
o kern/151629  fs         [fs] [patch] Skip empty directory entries during name 
o kern/151330  fs         [zfs] will unshare all zfs filesystem after execute a 
o kern/151326  fs         [nfs] nfs exports fail if netgroups contain duplicate 
o kern/151251  fs         [ufs] Can not create files on filesystem with heavy us
o kern/151226  fs         [zfs] can't delete zfs snapshot
o kern/151111  fs         [zfs] vnodes leakage during zfs unmount
o kern/150503  fs         [zfs] ZFS disks are UNAVAIL and corrupted after reboot
o kern/150501  fs         [zfs] ZFS vdev failure vdev.bad_label on amd64
o kern/150390  fs         [zfs] zfs deadlock when arcmsr reports drive faulted
o kern/150336  fs         [nfs] mountd/nfsd became confused; refused to reload n
o kern/149208  fs         mksnap_ffs(8) hang/deadlock
o kern/149173  fs         [patch] [zfs] make OpenSolaris <sys/nvpair.h> installa
o kern/149015  fs         [zfs] [patch] misc fixes for ZFS code to build on Glib
o kern/149014  fs         [zfs] [patch] declarations in ZFS libraries/utilities 
o kern/149013  fs         [zfs] [patch] make ZFS makefiles use the libraries fro
o kern/148504  fs         [zfs] ZFS' zpool does not allow replacing drives to be
o kern/148490  fs         [zfs]: zpool attach - resilver bidirectionally, and re
o kern/148368  fs         [zfs] ZFS hanging forever on 8.1-PRERELEASE
o kern/148138  fs         [zfs] zfs raidz pool commands freeze
o kern/147903  fs         [zfs] [panic] Kernel panics on faulty zfs device
o kern/147881  fs         [zfs] [patch] ZFS "sharenfs" doesn't allow different "
o kern/147560  fs         [zfs] [boot] Booting 8.1-PRERELEASE raidz system take 
o kern/147420  fs         [ufs] [panic] ufs_dirbad, nullfs, jail panic (corrupt 
o kern/146941  fs         [zfs] [panic] Kernel Double Fault - Happens constantly
o kern/146786  fs         [zfs] zpool import hangs with checksum errors
o kern/146708  fs         [ufs] [panic] Kernel panic in softdep_disk_write_compl
o kern/146528  fs         [zfs] Severe memory leak in ZFS on i386
o kern/146502  fs         [nfs] FreeBSD 8 NFS Client Connection to Server
s kern/145712  fs         [zfs] cannot offline two drives in a raidz2 configurat
o kern/145411  fs         [xfs] [panic] Kernel panics shortly after mounting an 
f bin/145309   fs         bsdlabel: Editing disk label invalidates the whole dev
o kern/145272  fs         [zfs] [panic] Panic during boot when accessing zfs on 
o kern/145246  fs         [ufs] dirhash in 7.3 gratuitously frees hashes when it
o kern/145238  fs         [zfs] [panic] kernel panic on zpool clear tank
o kern/145229  fs         [zfs] Vast differences in ZFS ARC behavior between 8.0
o kern/145189  fs         [nfs] nfsd performs abysmally under load
o kern/144929  fs         [ufs] [lor] vfs_bio.c + ufs_dirhash.c
p kern/144447  fs         [zfs] sharenfs fsunshare() & fsshare_main() non functi
o kern/144416  fs         [panic] Kernel panic on online filesystem optimization
s kern/144415  fs         [zfs] [panic] kernel panics on boot after zfs crash
o kern/144234  fs         [zfs] Cannot boot machine with recent gptzfsboot code 
o kern/143825  fs         [nfs] [panic] Kernel panic on NFS client
o bin/143572   fs         [zfs] zpool(1): [patch] The verbose output from iostat
o kern/143212  fs         [nfs] NFSv4 client strange work ...
o kern/143184  fs         [zfs] [lor] zfs/bufwait LOR
o kern/142878  fs         [zfs] [vfs] lock order reversal
o kern/142597  fs         [ext2fs] ext2fs does not work on filesystems with real
o kern/142489  fs         [zfs] [lor] allproc/zfs LOR
o kern/142466  fs         Update 7.2 -> 8.0 on Raid 1 ends with screwed raid [re
o kern/142306  fs         [zfs] [panic] ZFS drive (from OSX Leopard) causes two 
o kern/142068  fs         [ufs] BSD labels are got deleted spontaneously
o kern/141897  fs         [msdosfs] [panic] Kernel panic. msdofs: file name leng
o kern/141463  fs         [nfs] [panic] Frequent kernel panics after upgrade fro
o kern/141305  fs         [zfs] FreeBSD ZFS+sendfile severe performance issues (
o kern/141091  fs         [patch] [nullfs] fix panics with DIAGNOSTIC enabled
o kern/141086  fs         [nfs] [panic] panic("nfs: bioread, not dir") on FreeBS
o kern/141010  fs         [zfs] "zfs scrub" fails when backed by files in UFS2
o kern/140888  fs         [zfs] boot fail from zfs root while the pool resilveri
o kern/140661  fs         [zfs] [patch] /boot/loader fails to work on a GPT/ZFS-
o kern/140640  fs         [zfs] snapshot crash
o kern/140068  fs         [smbfs] [patch] smbfs does not allow semicolon in file
o kern/139725  fs         [zfs] zdb(1) dumps core on i386 when examining zpool c
o kern/139715  fs         [zfs] vfs.numvnodes leak on busy zfs
p bin/139651   fs         [nfs] mount(8): read-only remount of NFS volume does n
o kern/139597  fs         [patch] [tmpfs] tmpfs initializes va_gen but doesn't u
o kern/139564  fs         [zfs] [panic] 8.0-RC1 - Fatal trap 12 at end of shutdo
o kern/139407  fs         [smbfs] [panic] smb mount causes system crash if remot
o kern/138662  fs         [panic] ffs_blkfree: freeing free block
o kern/138421  fs         [ufs] [patch] remove UFS label limitations
o kern/138202  fs         mount_msdosfs(1) see only 2Gb
o kern/136968  fs         [ufs] [lor] ufs/bufwait/ufs (open)
o kern/136945  fs         [ufs] [lor] filedesc structure/ufs (poll)
o kern/136944  fs         [ffs] [lor] bufwait/snaplk (fsync)
o kern/136873  fs         [ntfs] Missing directories/files on NTFS volume
o kern/136865  fs         [nfs] [patch] NFS exports atomic and on-the-fly atomic
p kern/136470  fs         [nfs] Cannot mount / in read-only, over NFS
o kern/135546  fs         [zfs] zfs.ko module doesn't ignore zpool.cache filenam
o kern/135469  fs         [ufs] [panic] kernel crash on md operation in ufs_dirb
o kern/135050  fs         [zfs] ZFS clears/hides disk errors on reboot
o kern/134491  fs         [zfs] Hot spares are rather cold...
o kern/133676  fs         [smbfs] [panic] umount -f'ing a vnode-based memory dis
o kern/132960  fs         [ufs] [panic] panic:ffs_blkfree: freeing free frag
o kern/132397  fs         reboot causes filesystem corruption (failure to sync b
o kern/132331  fs         [ufs] [lor] LOR ufs and syncer
o kern/132237  fs         [msdosfs] msdosfs has problems to read MSDOS Floppy
o kern/132145  fs         [panic] File System Hard Crashes
o kern/131441  fs         [unionfs] [nullfs] unionfs and/or nullfs not combineab
o kern/131360  fs         [nfs] poor scaling behavior of the NFS server under lo
o kern/131342  fs         [nfs] mounting/unmounting of disks causes NFS to fail
o bin/131341   fs         makefs: error "Bad file descriptor"  on the mount poin
o kern/130920  fs         [msdosfs] cp(1) takes 100% CPU time while copying file
o kern/130210  fs         [nullfs] Error by check nullfs
o kern/129760  fs         [nfs] after 'umount -f' of a stale NFS share FreeBSD l
o kern/129488  fs         [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c: 
o kern/129231  fs         [ufs] [patch] New UFS mount (norandom) option - mostly
o kern/129152  fs         [panic] non-userfriendly panic when trying to mount(8)
o kern/127787  fs         [lor] [ufs] Three LORs: vfslock/devfs/vfslock, ufs/vfs
o bin/127270   fs         fsck_msdosfs(8) may crash if BytesPerSec is zero
o kern/127029  fs         [panic] mount(8): trying to mount a write protected zi
o kern/126287  fs         [ufs] [panic] Kernel panics while mounting an UFS file
o kern/125895  fs         [ffs] [panic] kernel: panic: ffs_blkfree: freeing free
s kern/125738  fs         [zfs] [request] SHA256 acceleration in ZFS
o kern/123939  fs         [msdosfs] corrupts new files
o kern/122380  fs         [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash
o bin/122172   fs         [fs]: amd(8) automount daemon dies on 6.3-STABLE i386,
o bin/121898   fs         [nullfs] pwd(1)/getcwd(2) fails with Permission denied
o bin/121072   fs         [smbfs] mount_smbfs(8) cannot normally convert the cha
o kern/120483  fs         [ntfs] [patch] NTFS filesystem locking changes
o kern/120482  fs         [ntfs] [patch] Sync style changes between NetBSD and F
o kern/118912  fs         [2tb] disk sizing/geometry problem with large array
o kern/118713  fs         [minidump] [patch] Display media size required for a k
o kern/118318  fs         [nfs] NFS server hangs under special circumstances
o bin/118249   fs         [ufs] mv(1): moving a directory changes its mtime
o kern/118126  fs         [nfs] [patch] Poor NFS server write performance
o kern/118107  fs         [ntfs] [panic] Kernel panic when accessing a file at N
o kern/117954  fs         [ufs] dirhash on very large directories blocks the mac
o bin/117315   fs         [smbfs] mount_smbfs(8) and related options can't mount
o kern/117158  fs         [zfs] zpool scrub causes panic if geli vdevs detach on
o bin/116980   fs         [msdosfs] [patch] mount_msdosfs(8) resets some flags f
o conf/116931  fs         lack of fsck_cd9660 prevents mounting iso images with 
o kern/116583  fs         [ffs] [hang] System freezes for short time when using 
o bin/115361   fs         [zfs] mount(8) gets into a state where it won't set/un
o kern/114955  fs         [cd9660] [patch] [request] support for mask,dirmask,ui
o kern/114847  fs         [ntfs] [patch] [request] dirmask support for NTFS ala 
o kern/114676  fs         [ufs] snapshot creation panics: snapacct_ufs2: bad blo
o bin/114468   fs         [patch] [request] add -d option to umount(8) to detach
o kern/113852  fs         [smbfs] smbfs does not properly implement DFS referral
o bin/113838   fs         [patch] [request] mount(8): add support for relative p
o bin/113049   fs         [patch] [request] make quot(8) use getopt(3) and show 
o kern/112658  fs         [smbfs] [patch] smbfs and caching problems (resolves b
o kern/111843  fs         [msdosfs] Long Names of files are incorrectly created 
o kern/111782  fs         [ufs] dump(8) fails horribly for large filesystems
s bin/111146   fs         [2tb] fsck(8) fails on 6T filesystem
o bin/107829   fs         [2TB] fdisk(8): invalid boundary checking in fdisk / w
o kern/106107  fs         [ufs] left-over fsck_snapshot after unfinished backgro
o kern/104406  fs         [ufs] Processes get stuck in "ufs" state under persist
o kern/104133  fs         [ext2fs] EXT2FS module corrupts EXT2/3 filesystems
o kern/103035  fs         [ntfs] Directories in NTFS mounted disc images appear 
o kern/101324  fs         [smbfs] smbfs sometimes not case sensitive when it's s
o kern/99290   fs         [ntfs] mount_ntfs ignorant of cluster sizes
s bin/97498    fs         [request] newfs(8) has no option to clear the first 12
o kern/97377   fs         [ntfs] [patch] syntax cleanup for ntfs_ihash.c
o kern/95222   fs         [cd9660] File sections on ISO9660 level 3 CDs ignored
o kern/94849   fs         [ufs] rename on UFS filesystem is not atomic
o bin/94810    fs         fsck(8) incorrectly reports 'file system marked clean'
o kern/94769   fs         [ufs] Multiple file deletions on multi-snapshotted fil
o kern/94733   fs         [smbfs] smbfs may cause double unlock
o kern/93942   fs         [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D
o kern/92272   fs         [ffs] [hang] Filling a filesystem while creating a sna
o kern/91134   fs         [smbfs] [patch] Preserve access and modification time 
a kern/90815   fs         [smbfs] [patch] SMBFS with character conversions somet
o kern/88657   fs         [smbfs] windows client hang when browsing a samba shar
o kern/88555   fs         [panic] ffs_blkfree: freeing free frag on AMD 64
o kern/88266   fs         [smbfs] smbfs does not implement UIO_NOCOPY and sendfi
o bin/87966    fs         [patch] newfs(8): introduce -A flag for newfs to enabl
o kern/87859   fs         [smbfs] System reboot while umount smbfs.
o kern/86587   fs         [msdosfs] rm -r /PATH fails with lots of small files
o bin/85494    fs         fsck_ffs: unchecked use of cg_inosused macro etc.
o kern/80088   fs         [smbfs] Incorrect file time setting on NTFS mounted vi
o bin/74779    fs         Background-fsck checks one filesystem twice and omits 
o kern/73484   fs         [ntfs] Kernel panic when doing `ls` from the client si
o bin/73019    fs         [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino
o kern/71774   fs         [ntfs] NTFS cannot "see" files on a WinXP filesystem
o bin/70600    fs         fsck(8) throws files away when it can't grow lost+foun
o kern/68978   fs         [panic] [ufs] crashes with failing hard disk, loose po
o kern/65920   fs         [nwfs] Mounted Netware filesystem behaves strange
o kern/65901   fs         [smbfs] [patch] smbfs fails fsx write/truncate-down/tr
o kern/61503   fs         [smbfs] mount_smbfs does not work as non-root
o kern/55617   fs         [smbfs] Accessing an nsmb-mounted drive via a smb expo
o kern/51685   fs         [hang] Unbounded inode allocation causes kernel to loc
o kern/36566   fs         [smbfs] System reboot with dead smb mount and umount
o bin/27687    fs         fsck(8) wrapper is not properly passing options to fsc
o kern/18874   fs         [2TB] 32bit NFS servers export wrong negative values t

275 problems total.


From owner-freebsd-fs@FreeBSD.ORG  Mon May  7 11:48:01 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 59741106564A;
	Mon,  7 May 2012 11:48:01 +0000 (UTC)
	(envelope-from vermaden@interia.pl)
Received: from smtpo.poczta.interia.pl (smtpo.poczta.interia.pl
	[217.74.65.208])
	by mx1.freebsd.org (Postfix) with ESMTP id 0C32F8FC08;
	Mon,  7 May 2012 11:48:01 +0000 (UTC)
Date: Mon, 07 May 2012 13:47:53 +0200
From: vermaden <vermaden@interia.pl>
To: "Randal L. Schwartz" <merlyn@stonehenge.com>
X-Mailer: interia.pl/pf09
In-Reply-To: <86d36iycca.fsf@red.stonehenge.com>
References: <pfwjphodwzuumbfqsxjn@xcww> <86ipgbg2p6.fsf@red.stonehenge.com>
	<ijlvdasdjkuxpbatycve@qzwb> <86d36jzk16.fsf@red.stonehenge.com>
	<867gwrzjwc.fsf@red.stonehenge.com> <86397fzjgi.fsf@red.stonehenge.com>
	<86y5p7y478.fsf@red.stonehenge.com> <mvldhyocegbxysykainf@hamg>
	<86d36iycca.fsf@red.stonehenge.com>
Message-Id: <biyukpzzddplzqbnlgyw@bbpf>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=interia.pl;
	s=biztos; t=1336391273;
	bh=K70/k8WpehJJZ6DmAZRTkkaPG8cEIqYRKCxb7LyWhwQ=;
	h=Date:From:Subject:To:Cc:X-Mailer:In-Reply-To:References:
	Message-Id:MIME-Version:Content-Type:Content-Transfer-Encoding;
	b=HCLjCGEH4jseiaquJHWquV9J9vJYvDm52agLgY9OrfUlR0iF0FdZHzuFXWL++QS0B
	sC3Cp8WKSm+ZUVTrVrYFP4qBJictjns5e/GxP2jd4zSDOOjxIqaHXaaR4lc4YAKW9X
	N0jRdwwGTryUhkkw8XkHmtO29QZKBxPv0+XV4t9U=
Cc: freebsd-fs@FreeBSD.org, freebsd-questions@freebsd.org
Subject: Re: HOWTO: FreeBSD ZFS Madness (Boot Environments)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 07 May 2012 11:48:01 -0000

> Good to see you've finally been burned.
> You'll never make that mistake again. :)

I liked that syntax:

ASD && {
  asd
} || {
  bsd
}

mostly because of syntax highlighting, to be precise highlighting
of the second bracket of a pair at editors, nor VIM neither GEANY
highlight if/then/elif/else/fi unfortunately, seems that I will have
to live with that ;p

> OK, I'll give that a try. Thanks for being persistent with me.

Did it worked?

Regards,
vermaden
--


...

From owner-freebsd-fs@FreeBSD.ORG  Mon May  7 13:03:29 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 142F11065672;
	Mon,  7 May 2012 13:03:29 +0000 (UTC)
	(envelope-from merlyn@stonehenge.com)
Received: from gw15.lax01.mailroute.net (lax-gw15.mailroute.net [199.89.0.115])
	by mx1.freebsd.org (Postfix) with ESMTP id E1DA68FC0A;
	Mon,  7 May 2012 13:03:28 +0000 (UTC)
Received: from localhost (localhost.localdomain [127.0.0.1])
	by gw15.lax01.mailroute.net (Postfix) with ESMTP id D5A41E36368;
	Mon,  7 May 2012 13:03:22 +0000 (GMT)
X-Virus-Scanned: by MailRoute
Received: from gw15.lax01.mailroute.net ([199.89.0.115])
	by localhost (gw15.lax01.mailroute.net.mailroute.net [127.0.0.1])
	(mroute_mailscanner, port 10026)
	with LMTP id TQEIdq212u-e; Mon,  7 May 2012 13:03:17 +0000 (GMT)
Received: from red.stonehenge.com (red.stonehenge.com [208.79.95.2])
	by gw15.lax01.mailroute.net (Postfix) with ESMTP id CB2ACE363B4;
	Mon,  7 May 2012 13:03:17 +0000 (GMT)
Received: by red.stonehenge.com (Postfix, from userid 1001)
	id C39831803; Mon,  7 May 2012 06:03:17 -0700 (PDT)
From: merlyn@stonehenge.com (Randal L. Schwartz)
To: vermaden <vermaden@interia.pl>
References: <pfwjphodwzuumbfqsxjn@xcww> <86ipgbg2p6.fsf@red.stonehenge.com>
	<ijlvdasdjkuxpbatycve@qzwb> <86d36jzk16.fsf@red.stonehenge.com>
	<867gwrzjwc.fsf@red.stonehenge.com>
	<86397fzjgi.fsf@red.stonehenge.com>
	<86y5p7y478.fsf@red.stonehenge.com> <mvldhyocegbxysykainf@hamg>
	<86d36iycca.fsf@red.stonehenge.com> <biyukpzzddplzqbnlgyw@bbpf>
x-mayan-date: Long count = 12.19.19.6.12; tzolkin = 10 Eb; haab = 15 Uo
Date: Mon, 07 May 2012 06:03:17 -0700
In-Reply-To: <biyukpzzddplzqbnlgyw@bbpf> (vermaden@interia.pl's message of
	"Mon, 07 May 2012 13:47:53 +0200")
Message-ID: <86ehqwb0tm.fsf@red.stonehenge.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.4 (berkeley-unix)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: freebsd-fs@FreeBSD.org, freebsd-questions@freebsd.org
Subject: Re: HOWTO: FreeBSD ZFS Madness (Boot Environments)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 07 May 2012 13:03:29 -0000

>>>>> "vermaden" == vermaden  <vermaden@interia.pl> writes:

>> Good to see you've finally been burned.
>> You'll never make that mistake again. :)

vermaden> I liked that syntax:

vermaden> ASD && {
vermaden>   asd
vermaden> } || {
vermaden>   bsd
vermaden> }

vermaden> mostly because of syntax highlighting, to be precise highlighting
vermaden> of the second bracket of a pair at editors, nor VIM neither GEANY
vermaden> highlight if/then/elif/else/fi unfortunately, seems that I will have
vermaden> to live with that ;p

Emacs indents it nicely, and colorizes the keywords so that it stands out.

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.posterous.com/ for Smalltalk discussion

From owner-freebsd-fs@FreeBSD.ORG  Mon May  7 14:05:30 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 84E46106564A;
	Mon,  7 May 2012 14:05:30 +0000 (UTC) (envelope-from jhb@freebsd.org)
Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net
	[IPv6:2001:470:1f10:75::2])
	by mx1.freebsd.org (Postfix) with ESMTP id 589578FC0C;
	Mon,  7 May 2012 14:05:30 +0000 (UTC)
Received: from jhbbsd.localnet (unknown [209.249.190.124])
	by bigwig.baldwin.cx (Postfix) with ESMTPSA id C1E01B95B;
	Mon,  7 May 2012 10:05:29 -0400 (EDT)
From: John Baldwin <jhb@freebsd.org>
To: freebsd-fs@freebsd.org
Date: Mon, 7 May 2012 09:53:03 -0400
User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p13; KDE/4.5.5; amd64; ; )
References: <4F8999D2.1080902@FreeBSD.org> <4FA4F36A.6030903@FreeBSD.org>
	<4FA4F883.2060008@FreeBSD.org>
In-Reply-To: <4FA4F883.2060008@FreeBSD.org>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <201205070953.04032.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7
	(bigwig.baldwin.cx); Mon, 07 May 2012 10:05:29 -0400 (EDT)
Cc: freebsd-hackers@freebsd.org, Andriy Gapon <avg@freebsd.org>
Subject: Re: [review request] zfsboot/zfsloader: support accessing
	filesystems within a pool
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 07 May 2012 14:05:30 -0000

On Saturday, May 05, 2012 5:53:07 am Andriy Gapon wrote:
> on 05/05/2012 12:31 Andriy Gapon said the following:
> > on 04/05/2012 18:25 John Baldwin said the following:
> >> On Thursday, May 03, 2012 11:23:51 am Andriy Gapon wrote:
> >>> on 03/05/2012 18:02 Andriy Gapon said the following:
> >>>>
> >>>> Here's the latest version of the patches:
> >>>> http://people.freebsd.org/~avg/zfsboot.patches.4.diff
> >>>
> >>> I've found a couple of problems in the previous version, so here's another one:
> >>> http://people.freebsd.org/~avg/zfsboot.patches.5.diff
> >>> The important change is in the first patch (__exec args).
> >>
> >> A few comments/suggestions on the args bits:
> > 
> > John,
> > 
> > these are excellent suggestions!  Thank you!
> 
> The new patchset: http://people.freebsd.org/~avg/zfsboot.patches.7.diff

Looks great, thanks!  A few replies below:

> >> - Add a CTASSERT() in loader/main.c that BI_SIZE == sizeof(struct bootinfo)
> > 
> > I have added a definition of CTASSERT to boostrap.h as it was not available for
> > sys/boot and there were two local definitions of the macro in individual files.
> > 
> > However the assertion would fail right now.
> > The backward-compatible value of BI_SIZE (72 == 0x48) covers only part of the
> > fields in struct bootinfo, those up to the following comment:
> > 	/* Items below only from advanced bootloader */
> > 
> > I am a little bit hesitant: should I increase BI_SIZE to cover the whole struct
> > bootinfo or should I compare BI_SIZE to offsetof bi_kernend?

Actually, we should probably be reading the 'bi_size' field and not using a BI_SIZE
constant at all?

Looks like only the non-functional EFI boot loader doesn't set bi_size (and it should
just be fixed to do so since it needs to pass new fields in anyway).

> > I've decided to define ARGADJ in the new common header, then I've had to rename
> > btxcsu.s to .S, so that the preprocessing is executed for it.

Ok.  Maybe add one comment to the bootargs.h head to explain that the 'bootargs'
struct starts at ARGOFF and can grow up, while struct bootinfo is copied such that
it's end is at the top of the argument area and grows down.

Also, at some point we could use a genassym.c file ala the kernel builds to generate
some of the constants in bootargs.h instead (e.g. the offsets of fields within
structures, and BA_SIZE, though we probably want to ensure that BA_SIZE never
changes).

-- 
John Baldwin

From owner-freebsd-fs@FreeBSD.ORG  Mon May  7 14:35:56 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 667BF106564A;
	Mon,  7 May 2012 14:35:56 +0000 (UTC) (envelope-from avg@FreeBSD.org)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
	by mx1.freebsd.org (Postfix) with ESMTP id 0C2B08FC0C;
	Mon,  7 May 2012 14:35:54 +0000 (UTC)
Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua
	[212.40.38.101])
	by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id RAA06392;
	Mon, 07 May 2012 17:34:37 +0300 (EEST)
	(envelope-from avg@FreeBSD.org)
Message-ID: <4FA7DD7C.4070703@FreeBSD.org>
Date: Mon, 07 May 2012 17:34:36 +0300
From: Andriy Gapon <avg@FreeBSD.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
	rv:12.0) Gecko/20120503 Thunderbird/12.0.1
MIME-Version: 1.0
To: Bruce Evans <brde@optusnet.com.au>
References: <4F8999D2.1080902@FreeBSD.org> <4FA29E1B.7040005@FreeBSD.org>
	<4FA2A307.2090108@FreeBSD.org> <201205041125.15155.jhb@freebsd.org>
	<4FA4F36A.6030903@FreeBSD.org>
	<20120505194459.D1295@besplex.bde.org>
In-Reply-To: <20120505194459.D1295@besplex.bde.org>
X-Enigmail-Version: 1.5pre
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@FreeBSD.org, freebsd-hackers@FreeBSD.org,
	John Baldwin <jhb@FreeBSD.org>
Subject: Re: [review request] zfsboot/zfsloader: support accessing
 filesystems within a pool
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 07 May 2012 14:35:56 -0000

on 05/05/2012 13:49 Bruce Evans said the following:
> On Sat, 5 May 2012, Andriy Gapon wrote:
> 
>> on 04/05/2012 18:25 John Baldwin said the following:
>>> On Thursday, May 03, 2012 11:23:51 am Andriy Gapon wrote:
>>>> on 03/05/2012 18:02 Andriy Gapon said the following:
>>>>>
>>>>> Here's the latest version of the patches:
>>>>> http://people.freebsd.org/~avg/zfsboot.patches.4.diff
>>>>
>>>> I've found a couple of problems in the previous version, so here's another one:
>>>> http://people.freebsd.org/~avg/zfsboot.patches.5.diff
>>>> The important change is in the first patch (__exec args).
>>>
>>> A few comments/suggestions on the args bits:
>>
>> John,
>>
>> these are excellent suggestions!  Thank you!
>> Some comments:
>>> - Add #ifndef LOCORE guards to the new header around the structure so
>>>   it can be used in assembly as well as C.
>>
>> Done.  I have had to go into a few btx makefiles and add a necessary include
>> path and -DLOCORE to make the header usable from asm.

Bruce,

first a note that the change that we discussed affects (should affect) only BTX
code and as such only boot1/2 -> loader interface.

> Ugh, why not use genassym, as is done for all old uses of this header in
> locore.s, at least on i386 (5% of the i386 genassym.c is for this).

Can not parse 'this header' in this context.  We were talking about a new header
file, so there could not be any old uses of it :-)
Probably you meant sys/i386/include/bootinfo.h ?
But, as you say later, it's probably not easy to use genassym with sys/boot
code.  Not sure if it would be worth while going this path given the possible
alternatives.

>>> - Move BI_SIZE and ARGOFF into the header as constants.
>>
>> Done.
>>
>>> - Add a CTASSERT() in loader/main.c that BI_SIZE == sizeof(struct bootinfo)
> 
> Ugh, BI_SIZE was already used in locore.s.

OK, but this is "the other" BI_SIZE.  Maybe the name clash is not nice indeed,
though.

> It wasn't the size of the struct,
> but was the offset of the field that gives the size.  No CTASSERT() was
> needed -- the size is whatever it is, as given by sizeof() on the struct
> at the time of compilation of the utility that initializes the struct.
> It was a feature that sizeof() and offsetof() can't be used in asm so they
> must be translated in genassym and no macros are needed in the header (the
> size was fully dynamic, so the asm code only needs the offsetof() values).
> Of course, you could use CTASSERT()s to check that the struct layout didn't
> get broken.  The old code just assumes that the struct is packed by the
> programmer and that the arch's struct packing conventions don't change,
> so that for example BI_SIZE = offsetof(struct bootinfo, bi_size) never
> changes.

It seems that boot1/2 -> kernel interface and boo1/2 -> {btxldr, btx} -> loader
interfaces are quite independent and a bit different.

> genassym is hard to use in boot programs, but the old design was that
> boot programs shouldn't use bootinfo in asm and should just use the
> target bootinfo.h at compile time (whatever time the target is compiled).

I am not sure if it is worthwhile adapting genassym to sys/boot...
BTX code needs to know only "some size" of bootinfo.  Although it doesn't look
like boot1/2 passes anything really useful to loader via bootinfo except for
bi_bios_dev.  For that matter it looks like maybe only two fields from the whole
(x86) bootinfo are useful to (x86) kernel either...

> Anyway, LOCORE means "for use in locore.[sS]", so other uses of it, e.g.
> in boot programs, are bogus.

That's a good point.  Maybe we should use some more generic name.  Maybe there
is even some macro that is always set for .S files that we can check.  Oh, thank
google, is __ASSEMBLER__ it?
It seems like couple of non-x86 headers already use this macro.

>> I have added a definition of CTASSERT to boostrap.h as it was not available for
>> sys/boot and there were two local definitions of the macro in individual files.
>>
>> However the assertion would fail right now.
>> The backward-compatible value of BI_SIZE (72 == 0x48) covers only part of the
> 
> This isn't backwards compatible.  BI_SIZE was decimal 48 (covers everything
> up to the bi_size field).

I meant backward compatible with the BTX code that I was changing, of course.

>> fields in struct bootinfo, those up to the following comment:
>>     /* Items below only from advanced bootloader */
>> I am a little bit hesitant: should I increase BI_SIZE to cover the whole struct
>> bootinfo or should I compare BI_SIZE to offsetof bi_kernend?
> 
> Neither.  BI_SIZE shouldn't exist.  It defeats the bi_size field.

Using the bi_size field may be the proper solution indeed.  Even if no data
beyond certain offset is ever used by loader.  The planned changes to BTX code
should make using bi_size easier.

>>> Maybe
>>>   create a 'struct kargs_ext' that looks like this:
>>>
>>> struct kargs_ext {
>>>     uint32_t size;
>>>     char data[0];
>>> };
>>
>> I've decided to skip on this.
> 
> Use KNF indentation and KNF field prefixes (ka_) if you add it :-).  Generic
> field names like `size' and `data' need prefixes more than mos.
> 
> The old struct was:
> 
> % #define    N_BIOS_GEOM        8
> % ...
> % /*
> %  * A zero bootinfo field often means that there is no info available.
> %  * Flags are used to indicate the validity of fields where zero is a
> %  * normal value.
> %  */
> % struct bootinfo {
> %     u_int32_t    bi_version;
> %     u_int32_t    bi_kernelname;        /* represents a char * */
> %     u_int32_t    bi_nfs_diskless;    /* struct nfs_diskless * */
> %                 /* End of fields that are always present. */
> 
> The original size was apparently 12.
> 
> % #define    bi_endcommon    bi_n_bios_used
> 
> Another style difference.  The magic 12 is essentially given by this macro.
> This macro is a pseudo-field, like the ones for the copyable and zeroable
> regions in struct proc.  Its name is in lower case.
> 
> %     u_int32_t    bi_n_bios_used;
> %     u_int32_t    bi_bios_geom[N_BIOS_GEOM];
> 
> The struct was broken in 1994 by adding the above 2 fields without providing
> any way to distinguish it from the old struct.
> 
> %     u_int32_t    bi_size;
> %     u_int8_t    bi_memsizes_valid;
> %     u_int8_t    bi_bios_dev;        /* bootdev BIOS unit number */
> %     u_int8_t    bi_pad[2];
> %     u_int32_t    bi_basemem;
> %     u_int32_t    bi_extmem;
> %     u_int32_t    bi_symtab;        /* struct symtab * */
> %     u_int32_t    bi_esymtab;        /* struct symtab * */
> 
> The above 8 fields were added in 1995 (together with fixing style bugs
> like no prefixes for the old field names).  Now the struct is determined
> by its size according to the bi_size field, and the bi_version field is
> not really used (it's much easier to add stuff to the end than to support
> multiple versions).  This gives a range of old sizes/versions:
> 
> 12: ~1993 (FreeBSD-1)
> 48: ~1994 (FreeBSD-1 and/or 2)
> 0x48: FreeBSD-2 post-1995
> 
> But these old sizes are uninteresting since only boot loaders from before
> 1993-1995 support only the above fields, and these loaders can't boot current
> kernels.
> 
> %                 /* Items below only from advanced bootloader */
> %     u_int32_t    bi_kernend;        /* end of kernel space */
> %     u_int32_t    bi_envp;        /* environment */
> %     u_int32_t    bi_modulep;        /* preloaded modules */
> 
> Added in 1998.  Still uninteresting, since boot loaders newer than that
> are needed to boot current kernels (mainly for elf).
> 
> %     uint64_t    bi_hcdp;        /* DIG64 HCDP table */
> %     uint64_t    bi_fpswa;        /* FPSWA interface */
> %     uint64_t    bi_systab;        /* pa of EFI system table */
> %     uint64_t    bi_memmap;        /* pa of EFI memory map */
> %     uint64_t    bi_memmap_size;        /* size of EFI memory map */
> %     uint64_t    bi_memdesc_size;    /* sizeof EFI memory desc */
> %     uint32_t    bi_memdesc_version;    /* EFI memory desc version */
> 
> Added in 2010.  Are all of these uint64_t types correct?  The padding seems
> to be broken, so that these fields would not work for amd64: we're at offset
> 0x48 for bi_kernend.  The 3 uint32_t's added in 1998 reach 0x54.  Then all
> the uint64_t fields are misaligned on i386, and on amd64 there is unnamed
> padding before the first of them to align them.  But and64 doesn't use
> bootinfo.h in the kernel, so I think only the i386 version is used on amd64
> (in the boot loader), so the misaligned case isn't used.

Interesting observations.  It looks like these newest fields were ported from
IA64 for EFI support, but it doesn't look like that support is actually in x86 yet.

> The struct declaration is also broken at the end.  The last field is 32 bits,
> so there is unnamed padding after it on amd64 only.  This padding should be
> explicit, like the padding before the uint64_t fields, or just put the 32-bit
> field before the 64-bit fields.
> 
> % };
> 
> So apart you could hard-code the size to the 1998 value of 0x54 without
> losing anything except the buggy 2010 fields.  But it shouldn't be
> hard-coded.

I am inclined to agree.

Thank you again.

P.S.  Actually I feel like arguing if the genassym approach is totally
correct/safe for BI_SIZE.  One could easily insert a field before bi_size and
thus change BI_SIZE and thus break compatibility with binaries compiled before
the change.  And all that without getting any hint during compilation.
OTOH, if BI_SIZE is explicitly defined to constant and there is a CTASSERT to
assert that BI_SIZE ==  offsetof(..., bi_size), then the chances of unwittingly
breaking things are smaller.
Of course, something like this would never happen in reality.

-- 
Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG  Mon May  7 14:47:09 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id D8898106567A;
	Mon,  7 May 2012 14:47:09 +0000 (UTC) (envelope-from avg@FreeBSD.org)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
	by mx1.freebsd.org (Postfix) with ESMTP id C12CE8FC14;
	Mon,  7 May 2012 14:47:08 +0000 (UTC)
Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua
	[212.40.38.101])
	by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id RAA06503;
	Mon, 07 May 2012 17:47:07 +0300 (EEST)
	(envelope-from avg@FreeBSD.org)
Message-ID: <4FA7E069.8020208@FreeBSD.org>
Date: Mon, 07 May 2012 17:47:05 +0300
From: Andriy Gapon <avg@FreeBSD.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
	rv:12.0) Gecko/20120503 Thunderbird/12.0.1
MIME-Version: 1.0
To: John Baldwin <jhb@FreeBSD.org>
References: <4F8999D2.1080902@FreeBSD.org> <4FA4F36A.6030903@FreeBSD.org>
	<4FA4F883.2060008@FreeBSD.org> <201205070953.04032.jhb@freebsd.org>
In-Reply-To: <201205070953.04032.jhb@freebsd.org>
X-Enigmail-Version: 1.5pre
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@FreeBSD.org, freebsd-hackers@FreeBSD.org
Subject: Re: [review request] zfsboot/zfsloader: support accessing
 filesystems within a pool
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 07 May 2012 14:47:10 -0000

on 07/05/2012 16:53 John Baldwin said the following:
> On Saturday, May 05, 2012 5:53:07 am Andriy Gapon wrote:
[snip]
>> The new patchset: http://people.freebsd.org/~avg/zfsboot.patches.7.diff
> 
> Looks great, thanks!  A few replies below:

Here's a followup patch for the suggestions:
http://people.freebsd.org/~avg/bootargs.followup.diff
I will merge it into the main patch.

What do you think about the -LOCORE- change that Bruce inspired?

>>>> - Add a CTASSERT() in loader/main.c that BI_SIZE == sizeof(struct bootinfo)
>>>
>>> I have added a definition of CTASSERT to boostrap.h as it was not available for
>>> sys/boot and there were two local definitions of the macro in individual files.
>>>
>>> However the assertion would fail right now.
>>> The backward-compatible value of BI_SIZE (72 == 0x48) covers only part of the
>>> fields in struct bootinfo, those up to the following comment:
>>> 	/* Items below only from advanced bootloader */
>>>
>>> I am a little bit hesitant: should I increase BI_SIZE to cover the whole struct
>>> bootinfo or should I compare BI_SIZE to offsetof bi_kernend?
> 
> Actually, we should probably be reading the 'bi_size' field and not using a BI_SIZE
> constant at all?

Done in the above patch.

> Looks like only the non-functional EFI boot loader doesn't set bi_size (and it should
> just be fixed to do so since it needs to pass new fields in anyway).
> 
>>> I've decided to define ARGADJ in the new common header, then I've had to rename
>>> btxcsu.s to .S, so that the preprocessing is executed for it.
> 
> Ok.  Maybe add one comment to the bootargs.h head to explain that the 'bootargs'
> struct starts at ARGOFF and can grow up, while struct bootinfo is copied such that
> it's end is at the top of the argument area and grows down.

Will do.

> Also, at some point we could use a genassym.c file ala the kernel builds to generate
> some of the constants in bootargs.h instead (e.g. the offsets of fields within
> structures, and BA_SIZE, though we probably want to ensure that BA_SIZE never
> changes).

The genassym approach sounds good, but, indeed - later :)

Thank you.

-- 
Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG  Mon May  7 15:15:57 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id ABA39106566C;
	Mon,  7 May 2012 15:15:57 +0000 (UTC) (envelope-from avg@FreeBSD.org)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
	by mx1.freebsd.org (Postfix) with ESMTP id 92CE08FC16;
	Mon,  7 May 2012 15:15:56 +0000 (UTC)
Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua
	[212.40.38.101])
	by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id SAA06720;
	Mon, 07 May 2012 18:15:54 +0300 (EEST)
	(envelope-from avg@FreeBSD.org)
Message-ID: <4FA7E729.4000308@FreeBSD.org>
Date: Mon, 07 May 2012 18:15:53 +0300
From: Andriy Gapon <avg@FreeBSD.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
	rv:12.0) Gecko/20120503 Thunderbird/12.0.1
MIME-Version: 1.0
To: John Baldwin <jhb@FreeBSD.org>
References: <4F8999D2.1080902@FreeBSD.org> <4FA4F36A.6030903@FreeBSD.org>
	<4FA4F883.2060008@FreeBSD.org> <201205070953.04032.jhb@freebsd.org>
	<4FA7E069.8020208@FreeBSD.org>
In-Reply-To: <4FA7E069.8020208@FreeBSD.org>
X-Enigmail-Version: 1.5pre
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@FreeBSD.org, freebsd-hackers@FreeBSD.org
Subject: Re: [review request] zfsboot/zfsloader: support accessing
 filesystems within a pool
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 07 May 2012 15:15:57 -0000

on 07/05/2012 17:47 Andriy Gapon said the following:
> on 07/05/2012 16:53 John Baldwin said the following:
>> Ok.  Maybe add one comment to the bootargs.h head to explain that the 'bootargs'
>> struct starts at ARGOFF and can grow up, while struct bootinfo is copied such that
>> it's end is at the top of the argument area and grows down.
> 
> Will do.

Could you please check the wording and correct it or suggest alternatives?
Thank you.

diff --git a/sys/boot/i386/common/bootargs.h b/sys/boot/i386/common/bootargs.h
index 510efdd..8bc1b32 100644
--- a/sys/boot/i386/common/bootargs.h
+++ b/sys/boot/i386/common/bootargs.h
@@ -29,6 +29,15 @@
 #define	BF_OFF		8	/* offsetof(struct bootargs, bootflags) */
 #define	BI_OFF		20	/* offsetof(struct bootargs, bootinfo) */

+/*
+ * We reserve some space above BTX allocated stack for the arguments
+ * and certain data that could hang off them.  Currently only struct bootinfo
+ * is supported in that category.  The bootinfo is placed at the top
+ * of the arguments area and the actual arguments are placed at ARGOFF offset
+ * from the top and grow towards the top.  Hopefully we have enough space
+ * for bootinfo and the arguments to not run into each other.
+ * Arguments area below ARGOFF is reserved for future use.
+ */
 #define	ARGSPACE	0x1000	/* total size of the BTX args area */
 #define	ARGOFF		0x800	/* actual args offset within the args area */
 #define	ARGADJ		(ARGSPACE - ARGOFF)


-- 
Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG  Mon May  7 17:46:12 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 2A83D1065670;
	Mon,  7 May 2012 17:46:12 +0000 (UTC) (envelope-from jhb@freebsd.org)
Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net
	[IPv6:2001:470:1f10:75::2])
	by mx1.freebsd.org (Postfix) with ESMTP id F217C8FC14;
	Mon,  7 May 2012 17:46:11 +0000 (UTC)
Received: from jhbbsd.localnet (unknown [209.249.190.124])
	by bigwig.baldwin.cx (Postfix) with ESMTPSA id 6F03EB95D;
	Mon,  7 May 2012 13:46:11 -0400 (EDT)
From: John Baldwin <jhb@freebsd.org>
To: Andriy Gapon <avg@freebsd.org>
Date: Mon, 7 May 2012 13:38:12 -0400
User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p13; KDE/4.5.5; amd64; ; )
References: <4F8999D2.1080902@FreeBSD.org> <4FA7E069.8020208@FreeBSD.org>
	<4FA7E729.4000308@FreeBSD.org>
In-Reply-To: <4FA7E729.4000308@FreeBSD.org>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <201205071338.12807.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7
	(bigwig.baldwin.cx); Mon, 07 May 2012 13:46:11 -0400 (EDT)
Cc: freebsd-fs@freebsd.org, freebsd-hackers@freebsd.org
Subject: Re: [review request] zfsboot/zfsloader: support accessing
	filesystems within a pool
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 07 May 2012 17:46:12 -0000

On Monday, May 07, 2012 11:15:53 am Andriy Gapon wrote:
> on 07/05/2012 17:47 Andriy Gapon said the following:
> > on 07/05/2012 16:53 John Baldwin said the following:
> >> Ok.  Maybe add one comment to the bootargs.h head to explain that the 'bootargs'
> >> struct starts at ARGOFF and can grow up, while struct bootinfo is copied such that
> >> it's end is at the top of the argument area and grows down.
> > 
> > Will do.
> 
> Could you please check the wording and correct it or suggest alternatives?
> Thank you.
> 
> diff --git a/sys/boot/i386/common/bootargs.h b/sys/boot/i386/common/bootargs.h
> index 510efdd..8bc1b32 100644
> --- a/sys/boot/i386/common/bootargs.h
> +++ b/sys/boot/i386/common/bootargs.h
> @@ -29,6 +29,15 @@
>  #define	BF_OFF		8	/* offsetof(struct bootargs, bootflags) */
>  #define	BI_OFF		20	/* offsetof(struct bootargs, bootinfo) */
> 
> +/*
> + * We reserve some space above BTX allocated stack for the arguments
> + * and certain data that could hang off them.  Currently only struct bootinfo
> + * is supported in that category.  The bootinfo is placed at the top
> + * of the arguments area and the actual arguments are placed at ARGOFF offset
> + * from the top and grow towards the top.  Hopefully we have enough space
> + * for bootinfo and the arguments to not run into each other.
> + * Arguments area below ARGOFF is reserved for future use.
> + */
>  #define	ARGSPACE	0x1000	/* total size of the BTX args area */
>  #define	ARGOFF		0x800	/* actual args offset within the args area */
>  #define	ARGADJ		(ARGSPACE - ARGOFF)

I think this is good, thanks!

-- 
John Baldwin

From owner-freebsd-fs@FreeBSD.ORG  Mon May  7 17:46:12 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id ED4A5106566B;
	Mon,  7 May 2012 17:46:12 +0000 (UTC) (envelope-from jhb@freebsd.org)
Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net
	[IPv6:2001:470:1f10:75::2])
	by mx1.freebsd.org (Postfix) with ESMTP id C18788FC15;
	Mon,  7 May 2012 17:46:12 +0000 (UTC)
Received: from jhbbsd.localnet (unknown [209.249.190.124])
	by bigwig.baldwin.cx (Postfix) with ESMTPSA id 27BB1B978;
	Mon,  7 May 2012 13:46:12 -0400 (EDT)
From: John Baldwin <jhb@freebsd.org>
To: Andriy Gapon <avg@freebsd.org>
Date: Mon, 7 May 2012 13:43:45 -0400
User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p13; KDE/4.5.5; amd64; ; )
References: <4F8999D2.1080902@FreeBSD.org> <201205070953.04032.jhb@freebsd.org>
	<4FA7E069.8020208@FreeBSD.org>
In-Reply-To: <4FA7E069.8020208@FreeBSD.org>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <201205071343.45955.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7
	(bigwig.baldwin.cx); Mon, 07 May 2012 13:46:12 -0400 (EDT)
Cc: freebsd-fs@freebsd.org, freebsd-hackers@freebsd.org
Subject: Re: [review request] zfsboot/zfsloader: support accessing
	filesystems within a pool
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 07 May 2012 17:46:13 -0000

On Monday, May 07, 2012 10:47:05 am Andriy Gapon wrote:
> on 07/05/2012 16:53 John Baldwin said the following:
> > On Saturday, May 05, 2012 5:53:07 am Andriy Gapon wrote:
> [snip]
> >> The new patchset: http://people.freebsd.org/~avg/zfsboot.patches.7.diff
> > 
> > Looks great, thanks!  A few replies below:
> 
> Here's a followup patch for the suggestions:
> http://people.freebsd.org/~avg/bootargs.followup.diff
> I will merge it into the main patch.
> 
> What do you think about the -LOCORE- change that Bruce inspired?

In general I think this looks good.  I have only one suggestion.  In other
code (e.g. the genassym constants in the kernel) where we define constants
for field offsets, we make the constant be the uppercase name of the field
itself (e.g. TD_PCB for offsetof(struct thread, td_pcb)).  I would rather
do that here as well.  In this case the field names do not have a prefix,
but let's just use a BA_ prefix for members of 'bootargs'.  BI_SIZE is
already correct, but this would mean renaming HT_OFF to BA_HOWTO, BF_OFF to
BA_BOOTFLAGS, and BI_OFF to BA_BOOTINFO.  I think you can probably leave
BA_SIZE as-is.

> > Also, at some point we could use a genassym.c file ala the kernel builds to generate
> > some of the constants in bootargs.h instead (e.g. the offsets of fields within
> > structures, and BA_SIZE, though we probably want to ensure that BA_SIZE never
> > changes).
> 
> The genassym approach sounds good, but, indeed - later :)

Yes, that can wait.  I think it would not be very hard to do however.  All
you really need is access to sys/kern/genassym.sh and nm.

-- 
John Baldwin

From owner-freebsd-fs@FreeBSD.ORG  Mon May  7 17:48:22 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 3A138106566C
	for <freebsd-fs@freebsd.org>; Mon,  7 May 2012 17:48:22 +0000 (UTC)
	(envelope-from simon@comsys.ntu-kpi.kiev.ua)
Received: from comsys.kpi.ua (comsys.kpi.ua [77.47.192.42])
	by mx1.freebsd.org (Postfix) with ESMTP id A640D8FC1A
	for <freebsd-fs@freebsd.org>; Mon,  7 May 2012 17:48:21 +0000 (UTC)
Received: from pm513-1.comsys.kpi.ua ([10.18.52.101]
	helo=pm513-1.comsys.ntu-kpi.kiev.ua)
	by comsys.kpi.ua with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.63)
	(envelope-from <simon@comsys.ntu-kpi.kiev.ua>)
	id 1SRS2f-0006I1-Qx; Mon, 07 May 2012 20:48:13 +0300
Received: by pm513-1.comsys.ntu-kpi.kiev.ua (Postfix, from userid 1001)
	id B0C111CC31; Mon,  7 May 2012 20:48:13 +0300 (EEST)
Date: Mon, 7 May 2012 20:48:13 +0300
From: Andrey Simonenko <simon@comsys.ntu-kpi.kiev.ua>
To: Rick Macklem <rmacklem@uoguelph.ca>
Message-ID: <20120507174813.GA5927@pm513-1.comsys.ntu-kpi.kiev.ua>
References: <CA+QLa9B4Xxc-4pCo8y4pgU1BBoBvC2xG4vA3Kydr-Q2dXWRpNw@mail.gmail.com>
	<1494135294.103829.1335731763653.JavaMail.root@erie.cs.uoguelph.ca>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1494135294.103829.1335731763653.JavaMail.root@erie.cs.uoguelph.ca>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Authenticated-User: simon@comsys.ntu-kpi.kiev.ua
X-Authenticator: plain
X-Sender-Verify: SUCCEEDED (sender exists & accepts mail)
X-Exim-Version: 4.63 (build at 28-Apr-2011 07:11:12)
X-Date: 2012-05-07 20:48:13
X-Connected-IP: 10.18.52.101:29819
X-Message-Linecount: 62
X-Body-Linecount: 46
X-Message-Size: 2847
X-Body-Size: 2070
Cc: freebsd-fs@freebsd.org
Subject: Re: NFSv4 Questions
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 07 May 2012 17:48:22 -0000

On Sun, Apr 29, 2012 at 04:36:03PM -0400, Rick Macklem wrote:
> 
> Also, be sure to check "man nfsv4" and maybe reference it (it is currently
> in the See Also list, but that might not be strong enough).

There is another question not explained in documentation (I could not
find the answer at least).  Currently NFSv3 client uses reserved port
for NFS mounts and uses non reserved port if "noresvport" is specified.
NFSv4 client always uses non reserved port, ignoring the "resvport"
option in the mount_nfs command.

Such behaviour of NFS client was introduced in 1.18 version of
fs/nfsclient/nfs_clvfsops.c [1], where the "resvport" flag is cleared
for NFSv4 mounts.

Why does "reserved port logic" differ in NFSv3 and NFSv4 clients?

> > If I may, perhaps a switch in /etc/rc.conf:
> > nfsv4_only="YES"
> > 
> I might call it nfsv4_server_only, but sounds like a good suggestion.
> 
> > This would set the -nfsv4-only switch you mention for mountd, and it
> > would set vfs.nfsd.server_min_nfsvers=4
> > 
> It could also be used by /etc/rc.d/mountd to indicate "don't force rpcbind".
> 

I'm sure that you know all these, but let me add some comments.

1. Using server_min_nfsvers and server_max_nfsvers are global settings
   and do not allow to make one file system NFSv2/3/4 exported and another
   one NFSv4 exported only for example.

2. MOUNT protocol is not used only for MNT/UMNT/UMNTALL requests from
   NFSv2/3 clients.  As I know some automounters use MOUNT EXPORT requests
   to get information about exported file systems.  So, MOUNT protocol
   can be usefull for somebody who uses NFSv4 only.

Both items have something common.  There should be options that enable/
disables NFSv2, NFSv3 and/or NFSv4 per address specification and/or per
file system.  And there should be the option that allows to disable the
MOUNT protocol entirely, some of its versions or some of its netconfigs
(some of visible netconfigs that can be used by the MOUNT protocol).

[1] http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/fs/nfsclient/nfs_clvfsops.c.diff?r1=1.17;r2=1.18

From owner-freebsd-fs@FreeBSD.ORG  Mon May  7 18:49:21 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 0233B106566B;
	Mon,  7 May 2012 18:49:21 +0000 (UTC)
	(envelope-from linimon@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id CAB5F8FC0C;
	Mon,  7 May 2012 18:49:20 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q47InKlr008282;
	Mon, 7 May 2012 18:49:20 GMT
	(envelope-from linimon@freefall.freebsd.org)
Received: (from linimon@localhost)
	by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q47InKbk008278;
	Mon, 7 May 2012 18:49:20 GMT (envelope-from linimon)
Date: Mon, 7 May 2012 18:49:20 GMT
Message-Id: <201205071849.q47InKbk008278@freefall.freebsd.org>
To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org
From: linimon@FreeBSD.org
Cc: 
Subject: Re: kern/167685: [zfs] ZFS on USB drive prevents shutdown / reboot
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 07 May 2012 18:49:21 -0000

Old Synopsis: ZFS on USB drive prevents shutdown / reboot
New Synopsis: [zfs] ZFS on USB drive prevents shutdown / reboot

Responsible-Changed-From-To: freebsd-bugs->freebsd-fs
Responsible-Changed-By: linimon
Responsible-Changed-When: Mon May 7 18:49:08 UTC 2012
Responsible-Changed-Why: 
Over to maintainer(s).

http://www.freebsd.org/cgi/query-pr.cgi?pr=167685

From owner-freebsd-fs@FreeBSD.ORG  Mon May  7 18:49:45 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id E9B48106566C;
	Mon,  7 May 2012 18:49:45 +0000 (UTC)
	(envelope-from linimon@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id BE1648FC08;
	Mon,  7 May 2012 18:49:45 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q47InjUH008368;
	Mon, 7 May 2012 18:49:45 GMT
	(envelope-from linimon@freefall.freebsd.org)
Received: (from linimon@localhost)
	by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q47InjHP008364;
	Mon, 7 May 2012 18:49:45 GMT (envelope-from linimon)
Date: Mon, 7 May 2012 18:49:45 GMT
Message-Id: <201205071849.q47InjHP008364@freefall.freebsd.org>
To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org
From: linimon@FreeBSD.org
Cc: 
Subject: Re: kern/167688: [fusefs] Incorrect signal handling with direct_io
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 07 May 2012 18:49:46 -0000

Old Synopsis: fusefs. Incorrect signal handling with direct_io
New Synopsis: [fusefs] Incorrect signal handling with direct_io

Responsible-Changed-From-To: freebsd-bugs->freebsd-fs
Responsible-Changed-By: linimon
Responsible-Changed-When: Mon May 7 18:49:28 UTC 2012
Responsible-Changed-Why: 
Over to maintainer(s).

http://www.freebsd.org/cgi/query-pr.cgi?pr=167688

From owner-freebsd-fs@FreeBSD.ORG  Mon May  7 18:55:20 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 999381065679;
	Mon,  7 May 2012 18:55:20 +0000 (UTC)
	(envelope-from linimon@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 6C8418FC17;
	Mon,  7 May 2012 18:55:20 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q47ItFft017488;
	Mon, 7 May 2012 18:55:15 GMT
	(envelope-from linimon@freefall.freebsd.org)
Received: (from linimon@localhost)
	by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q47ItFfJ017484;
	Mon, 7 May 2012 18:55:15 GMT (envelope-from linimon)
Date: Mon, 7 May 2012 18:55:15 GMT
Message-Id: <201205071855.q47ItFfJ017484@freefall.freebsd.org>
To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org
From: linimon@FreeBSD.org
Cc: 
Subject: Re: kern/167612: [portalfs] The portal file system gets stuck
	inside portal_open(). ("1 extra fds")
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 07 May 2012 18:55:20 -0000

Old Synopsis: The portal file system gets stuck inside portal_open(). ("1 extra fds")
New Synopsis: [portalfs] The portal file system gets stuck inside portal_open(). ("1 extra fds")

Responsible-Changed-From-To: freebsd-bugs->freebsd-fs
Responsible-Changed-By: linimon
Responsible-Changed-When: Mon May 7 18:55:01 UTC 2012
Responsible-Changed-Why: 
Over to maintainer(s).

http://www.freebsd.org/cgi/query-pr.cgi?pr=167612

From owner-freebsd-fs@FreeBSD.ORG  Mon May  7 21:57:53 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id EC1C91065670;
	Mon,  7 May 2012 21:57:53 +0000 (UTC)
	(envelope-from vermaden@interia.pl)
Received: from smtpo.poczta.interia.pl (smtpo.poczta.interia.pl
	[217.74.65.208])
	by mx1.freebsd.org (Postfix) with ESMTP id 9FAA28FC08;
	Mon,  7 May 2012 21:57:53 +0000 (UTC)
Date: Mon, 07 May 2012 23:57:52 +0200
From: vermaden <vermaden@interia.pl>
To: "Randal L. Schwartz" <merlyn@stonehenge.com>
X-Mailer: interia.pl/pf09
In-Reply-To: <86ehqwb0tm.fsf@red.stonehenge.com>
References: <pfwjphodwzuumbfqsxjn@xcww> <86ipgbg2p6.fsf@red.stonehenge.com>
	<ijlvdasdjkuxpbatycve@qzwb> <86d36jzk16.fsf@red.stonehenge.com>
	<867gwrzjwc.fsf@red.stonehenge.com> <86397fzjgi.fsf@red.stonehenge.com>
	<86y5p7y478.fsf@red.stonehenge.com> <mvldhyocegbxysykainf@hamg>
	<86d36iycca.fsf@red.stonehenge.com> <biyukpzzddplzqbnlgyw@bbpf>
	<86ehqwb0tm.fsf@red.stonehenge.com>
Message-Id: <cmpuwpuumkewkvzpaknh@wvlx>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=interia.pl;
	s=biztos; t=1336427872;
	bh=bJUG9nepGkMwbHtCHmdnoMdYGI8hjURrYTV5Pto3tD0=;
	h=Date:From:Subject:To:Cc:X-Mailer:In-Reply-To:References:
	Message-Id:MIME-Version:Content-Type:Content-Transfer-Encoding;
	b=tU5rAowv7pbgQf51z/ON58Xzqva0F3iw9yOBWd4RbhicAqfdy3hWFvD5K/C3P6Sux
	q4WykTAgpAVqveFLHyRaYbMNHWHWFI8FH3fSN3ifsouK8jg4HvU/5Ou0xVS+hgxFmt
	g29XNxnAAV884hCNzchqnYI4nmBygG/U9Pl8qtbc=
Cc: freebsd-fs@FreeBSD.org, freebsd-questions@freebsd.org
Subject: Re: HOWTO: FreeBSD ZFS Madness (Boot Environments)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 07 May 2012 21:57:54 -0000

> Emacs indents it nicely, and colorizes the
> keywords so that it stands out.

Indentification is not a problem, it work both
in geany and vim.

Probably I haven't made clear what I meant ;)

Take a look at this picture:
http://ompldr.org/vZG50bQ

The brackets in that specific section (asd) are
highlighted, other are not, its not possible with
if/then/fi, only the keywords are highlighted,
but they are highlighted for the whole script so ... ;)

With { } I can also (un)fold the section/function,
its not possible with if/then/fi.

Regards,
vermaden
--=20


...

From owner-freebsd-fs@FreeBSD.ORG  Mon May  7 23:40:25 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id A0ADE106564A
	for <freebsd-fs@freebsd.org>; Mon,  7 May 2012 23:40:25 +0000 (UTC)
	(envelope-from rmacklem@uoguelph.ca)
Received: from esa-annu.mail.uoguelph.ca (esa-annu.mail.uoguelph.ca
	[131.104.91.36])
	by mx1.freebsd.org (Postfix) with ESMTP id 5A73B8FC0A
	for <freebsd-fs@freebsd.org>; Mon,  7 May 2012 23:40:25 +0000 (UTC)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: Ap4EAEdcqE+DaFvO/2dsb2JhbABEhXKuPYIMAQEEASNWGw4KAgINGQJZBhyIAAULqA6Se4EviVCEcYEYBJV+kEKDBQ
X-IronPort-AV: E=Sophos;i="4.75,546,1330923600"; d="scan'208";a="168310392"
Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca)
	([131.104.91.206])
	by esa-annu-pri.mail.uoguelph.ca with ESMTP; 07 May 2012 19:40:18 -0400
Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1])
	by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 215D1B4031;
	Mon,  7 May 2012 19:40:18 -0400 (EDT)
Date: Mon, 7 May 2012 19:40:18 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Andrey Simonenko <simon@comsys.ntu-kpi.kiev.ua>
Message-ID: <1357768784.50127.1336434018113.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <20120507174813.GA5927@pm513-1.comsys.ntu-kpi.kiev.ua>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Originating-IP: [172.17.91.201]
X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692)
Cc: freebsd-fs@freebsd.org
Subject: Re: NFSv4 Questions
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 07 May 2012 23:40:25 -0000

Andrey Simonenko wrote:
> On Sun, Apr 29, 2012 at 04:36:03PM -0400, Rick Macklem wrote:
> >
> > Also, be sure to check "man nfsv4" and maybe reference it (it is
> > currently
> > in the See Also list, but that might not be strong enough).
> 
> There is another question not explained in documentation (I could not
> find the answer at least). Currently NFSv3 client uses reserved port
> for NFS mounts and uses non reserved port if "noresvport" is
> specified.
> NFSv4 client always uses non reserved port, ignoring the "resvport"
> option in the mount_nfs command.
> 
> Such behaviour of NFS client was introduced in 1.18 version of
> fs/nfsclient/nfs_clvfsops.c [1], where the "resvport" flag is cleared
> for NFSv4 mounts.
> 
> Why does "reserved port logic" differ in NFSv3 and NFSv4 clients?
> 
It is my understanding that NFSv4 servers are not supposed to require
a "reserved" port#. However, at a quick glance, I can't find that stated
in RFC 3530. (It may be implied by the fact that NFSv4 uses a "user" based
security model and not a "host" based one.)

As such, the client should never need to "waste" a reserved port# on a NFSv4
connection.

rick

> > > If I may, perhaps a switch in /etc/rc.conf:
> > > nfsv4_only="YES"
> > >
> > I might call it nfsv4_server_only, but sounds like a good
> > suggestion.
> >
> > > This would set the -nfsv4-only switch you mention for mountd, and
> > > it
> > > would set vfs.nfsd.server_min_nfsvers=4
> > >
> > It could also be used by /etc/rc.d/mountd to indicate "don't force
> > rpcbind".
> >
> 
> I'm sure that you know all these, but let me add some comments.
> 
> 1. Using server_min_nfsvers and server_max_nfsvers are global settings
> and do not allow to make one file system NFSv2/3/4 exported and
> another
> one NFSv4 exported only for example.
> 
> 2. MOUNT protocol is not used only for MNT/UMNT/UMNTALL requests from
> NFSv2/3 clients. As I know some automounters use MOUNT EXPORT requests
> to get information about exported file systems. So, MOUNT protocol
> can be usefull for somebody who uses NFSv4 only.
> 
> Both items have something common. There should be options that enable/
> disables NFSv2, NFSv3 and/or NFSv4 per address specification and/or
> per
> file system. And there should be the option that allows to disable the
> MOUNT protocol entirely, some of its versions or some of its
> netconfigs
> (some of visible netconfigs that can be used by the MOUNT protocol).
> 
> [1]
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/fs/nfsclient/nfs_clvfsops.c.diff?r1=1.17;r2=1.18

From owner-freebsd-fs@FreeBSD.ORG  Tue May  8 01:35:34 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 1B9A81065670
	for <freebsd-fs@freebsd.org>; Tue,  8 May 2012 01:35:33 +0000 (UTC)
	(envelope-from bfriesen@simple.dallas.tx.us)
Received: from blade.simplesystems.org (blade.simplesystems.org [65.66.246.74])
	by mx1.freebsd.org (Postfix) with ESMTP id 6E49E8FC0C
	for <freebsd-fs@freebsd.org>; Tue,  8 May 2012 01:35:33 +0000 (UTC)
Received: from freddy.simplesystems.org (freddy.simplesystems.org
	[65.66.246.65])
	by blade.simplesystems.org (8.14.4+Sun/8.14.4) with ESMTP id
	q481ZPbK023829; Mon, 7 May 2012 20:35:26 -0500 (CDT)
Date: Mon, 7 May 2012 20:35:25 -0500 (CDT)
From: Bob Friesenhahn <bfriesen@simple.dallas.tx.us>
X-X-Sender: bfriesen@freddy.simplesystems.org
To: Rick Macklem <rmacklem@uoguelph.ca>
In-Reply-To: <1357768784.50127.1336434018113.JavaMail.root@erie.cs.uoguelph.ca>
Message-ID: <alpine.GSO.2.01.1205072034320.1678@freddy.simplesystems.org>
References: <1357768784.50127.1336434018113.JavaMail.root@erie.cs.uoguelph.ca>
User-Agent: Alpine 2.01 (GSO 1266 2009-07-14)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2
	(blade.simplesystems.org [65.66.246.90]);
	Mon, 07 May 2012 20:35:26 -0500 (CDT)
Cc: freebsd-fs@freebsd.org
Subject: Re: NFSv4 Questions
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 08 May 2012 01:35:34 -0000

On Mon, 7 May 2012, Rick Macklem wrote:
>>
> It is my understanding that NFSv4 servers are not supposed to require
> a "reserved" port#. However, at a quick glance, I can't find that stated
> in RFC 3530. (It may be implied by the fact that NFSv4 uses a "user" based
> security model and not a "host" based one.)
>
> As such, the client should never need to "waste" a reserved port# on a NFSv4
> connection.

Firewalls might use the reserved port as part of a filtering 
algorithm.

Bob
-- 
Bob Friesenhahn
bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

From owner-freebsd-fs@FreeBSD.ORG  Tue May  8 07:14:48 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 92C36106564A;
	Tue,  8 May 2012 07:14:48 +0000 (UTC) (envelope-from avg@FreeBSD.org)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
	by mx1.freebsd.org (Postfix) with ESMTP id 756D58FC0C;
	Tue,  8 May 2012 07:14:47 +0000 (UTC)
Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua
	[212.40.38.100])
	by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id KAA12978;
	Tue, 08 May 2012 10:14:45 +0300 (EEST)
	(envelope-from avg@FreeBSD.org)
Received: from localhost ([127.0.0.1])
	by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD))
	id 1SRedA-000K6X-JS; Tue, 08 May 2012 10:14:44 +0300
Message-ID: <4FA8C7E3.8070006@FreeBSD.org>
Date: Tue, 08 May 2012 10:14:43 +0300
From: Andriy Gapon <avg@FreeBSD.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
	rv:12.0) Gecko/20120503 Thunderbird/12.0.1
MIME-Version: 1.0
To: John Baldwin <jhb@FreeBSD.org>
References: <4F8999D2.1080902@FreeBSD.org> <201205070953.04032.jhb@freebsd.org>
	<4FA7E069.8020208@FreeBSD.org> <201205071343.45955.jhb@freebsd.org>
In-Reply-To: <201205071343.45955.jhb@freebsd.org>
X-Enigmail-Version: 1.5pre
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@FreeBSD.org, freebsd-hackers@FreeBSD.org
Subject: Re: [review request] zfsboot/zfsloader: support accessing
 filesystems within a pool
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 08 May 2012 07:14:48 -0000

on 07/05/2012 20:43 John Baldwin said the following:
> On Monday, May 07, 2012 10:47:05 am Andriy Gapon wrote:
>> on 07/05/2012 16:53 John Baldwin said the following:
[snip]
>> What do you think about the -LOCORE- change that Bruce inspired?
> 
> In general I think this looks good.  I have only one suggestion.  In other
> code (e.g. the genassym constants in the kernel) where we define constants
> for field offsets, we make the constant be the uppercase name of the field
> itself (e.g. TD_PCB for offsetof(struct thread, td_pcb)).  I would rather
> do that here as well.  In this case the field names do not have a prefix,
> but let's just use a BA_ prefix for members of 'bootargs'.  BI_SIZE is
> already correct, but this would mean renaming HT_OFF to BA_HOWTO, BF_OFF to
> BA_BOOTFLAGS, and BI_OFF to BA_BOOTINFO.

OK, doing this.

> I think you can probably leave BA_SIZE as-is.

I see that i386 genassym has a few different styles for sizeof constants:
ABBRSIZE
FULL_NAME_SIZE
ABBR_SIZEOF

FULL_NAME_SIZE looked the most appealing to me (and seems to be the most
common), so I decided to change BA_SIZE to BOOTARGS_SIZE.
I hope that this makes sense and I am not starting a bikeshed :-)

>>> Also, at some point we could use a genassym.c file ala the kernel builds to generate
>>> some of the constants in bootargs.h instead (e.g. the offsets of fields within
>>> structures, and BA_SIZE, though we probably want to ensure that BA_SIZE never
>>> changes).
>>
>> The genassym approach sounds good, but, indeed - later :)
> 
> Yes, that can wait.  I think it would not be very hard to do however.  All
> you really need is access to sys/kern/genassym.sh and nm.
> 


-- 
Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG  Tue May  8 11:13:48 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 7455F106566B
	for <freebsd-fs@freebsd.org>; Tue,  8 May 2012 11:13:48 +0000 (UTC)
	(envelope-from rmacklem@uoguelph.ca)
Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca
	[131.104.91.44])
	by mx1.freebsd.org (Postfix) with ESMTP id 2E51E8FC16
	for <freebsd-fs@freebsd.org>; Tue,  8 May 2012 11:13:48 +0000 (UTC)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: Ap4EAKP+qE+DaFvO/2dsb2JhbABEhXKuOoIMAQEEASNWBRYOCgICDRkCWQYTCYgABQundJMggS+JVIRxgRgElX6QQoMF
X-IronPort-AV: E=Sophos;i="4.75,550,1330923600"; d="scan'208";a="171082730"
Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca)
	([131.104.91.206])
	by esa-jnhn-pri.mail.uoguelph.ca with ESMTP; 08 May 2012 07:13:47 -0400
Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1])
	by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 12CCAB3EFE;
	Tue,  8 May 2012 07:13:47 -0400 (EDT)
Date: Tue, 8 May 2012 07:13:47 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Bob Friesenhahn <bfriesen@simple.dallas.tx.us>
Message-ID: <1387389132.59565.1336475627040.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <alpine.GSO.2.01.1205072034320.1678@freddy.simplesystems.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Originating-IP: [172.17.91.202]
X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692)
Cc: freebsd-fs@freebsd.org
Subject: Re: NFSv4 Questions
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 08 May 2012 11:13:48 -0000

Bob Friesenhahn wrote:
> On Mon, 7 May 2012, Rick Macklem wrote:
> >>
> > It is my understanding that NFSv4 servers are not supposed to
> > require
> > a "reserved" port#. However, at a quick glance, I can't find that
> > stated
> > in RFC 3530. (It may be implied by the fact that NFSv4 uses a "user"
> > based
> > security model and not a "host" based one.)
> >
> > As such, the client should never need to "waste" a reserved port# on
> > a NFSv4
> > connection.
> 
> Firewalls might use the reserved port as part of a filtering
> algorithm.
> 
Hmm, since the IETF working group was determined to "get rid of this
bunk w.r.t. reserved port #s being used to enhance security", I might
argue that said firewalls were misconfigured/broken.

However, I can see an argument that, instead of silently ignoring the
option, it should be obeyed, but with a note in the man page that it
shouldn't be used for NFSv4.

rick

> Bob
> --
> Bob Friesenhahn
> bfriesen@simple.dallas.tx.us,
> http://www.simplesystems.org/users/bfriesen/
> GraphicsMagick Maintainer, http://www.GraphicsMagick.org/

From owner-freebsd-fs@FreeBSD.ORG  Tue May  8 14:15:38 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 17371106566B;
	Tue,  8 May 2012 14:15:38 +0000 (UTC) (envelope-from jhb@FreeBSD.org)
Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net
	[IPv6:2001:470:1f10:75::2])
	by mx1.freebsd.org (Postfix) with ESMTP id DD9FE8FC08;
	Tue,  8 May 2012 14:15:37 +0000 (UTC)
Received: from John-Baldwins-MacBook-Air.local
	(c-68-39-198-164.hsd1.de.comcast.net [68.39.198.164])
	by bigwig.baldwin.cx (Postfix) with ESMTPSA id 3BE00B93B;
	Tue,  8 May 2012 10:15:37 -0400 (EDT)
Message-ID: <4FA92A88.2030000@FreeBSD.org>
Date: Tue, 08 May 2012 10:15:36 -0400
From: John Baldwin <jhb@FreeBSD.org>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7;
	rv:12.0) Gecko/20120428 Thunderbird/12.0.1
MIME-Version: 1.0
To: Andriy Gapon <avg@FreeBSD.org>
References: <4F8999D2.1080902@FreeBSD.org> <201205070953.04032.jhb@freebsd.org>
	<4FA7E069.8020208@FreeBSD.org> <201205071343.45955.jhb@freebsd.org>
	<4FA8C7E3.8070006@FreeBSD.org>
In-Reply-To: <4FA8C7E3.8070006@FreeBSD.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7
	(bigwig.baldwin.cx); Tue, 08 May 2012 10:15:37 -0400 (EDT)
Cc: freebsd-fs@FreeBSD.org, freebsd-hackers@FreeBSD.org
Subject: Re: [review request] zfsboot/zfsloader: support accessing
 filesystems within a pool
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 08 May 2012 14:15:38 -0000

On 5/8/12 3:14 AM, Andriy Gapon wrote:
> on 07/05/2012 20:43 John Baldwin said the following:
>> On Monday, May 07, 2012 10:47:05 am Andriy Gapon wrote:
>>> on 07/05/2012 16:53 John Baldwin said the following:
> [snip]
>>> What do you think about the -LOCORE- change that Bruce inspired?
>>
>> In general I think this looks good.  I have only one suggestion.  In other
>> code (e.g. the genassym constants in the kernel) where we define constants
>> for field offsets, we make the constant be the uppercase name of the field
>> itself (e.g. TD_PCB for offsetof(struct thread, td_pcb)).  I would rather
>> do that here as well.  In this case the field names do not have a prefix,
>> but let's just use a BA_ prefix for members of 'bootargs'.  BI_SIZE is
>> already correct, but this would mean renaming HT_OFF to BA_HOWTO, BF_OFF to
>> BA_BOOTFLAGS, and BI_OFF to BA_BOOTINFO.
> 
> OK, doing this.
> 
>> I think you can probably leave BA_SIZE as-is.
> 
> I see that i386 genassym has a few different styles for sizeof constants:
> ABBRSIZE
> FULL_NAME_SIZE
> ABBR_SIZEOF
> 
> FULL_NAME_SIZE looked the most appealing to me (and seems to be the most
> common), so I decided to change BA_SIZE to BOOTARGS_SIZE.
> I hope that this makes sense and I am not starting a bikeshed :-)

Yeah, given the inconsistency in sizeof() constants in genassym.c, just
about anything is fine, which is why I hesitated to suggest any change.
 BOOTARGS_SIZE is fine.  I probably slightly prefer that because it is
less ambiguous (in case the structure has a foo_size member such as
bi_size in bootinfo).

Bruce might even suggest adding a ba_ prefix to all the members of
struct bootargs btw.  I would not be opposed, but you've already done
a fair bit of work on this patch.

-- 
John Baldwin

From owner-freebsd-fs@FreeBSD.ORG  Tue May  8 14:40:00 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id B3564106566B
	for <freebsd-fs@freebsd.org>; Tue,  8 May 2012 14:40:00 +0000 (UTC)
	(envelope-from freebsd@grem.de)
Received: from mail.grem.de (outcast.grem.de [213.239.217.27])
	by mx1.freebsd.org (Postfix) with SMTP id D767C8FC15
	for <freebsd-fs@freebsd.org>; Tue,  8 May 2012 14:39:59 +0000 (UTC)
Received: (qmail 92372 invoked by uid 89); 8 May 2012 14:33:16 -0000
Received: from unknown (HELO ?172.20.10.3?) (mg@grem.de@109.43.0.73)
	by mail.grem.de with ESMTPA; 8 May 2012 14:33:16 -0000
From: Michael Gmelin <freebsd@grem.de>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: quoted-printable
Date: Tue, 8 May 2012 16:33:14 +0200
Message-Id: <73F8D020-04F3-44B2-97D4-F08E3B253C32@grem.de>
To: freebsd-fs@freebsd.org
Mime-Version: 1.0 (Apple Message framework v1084)
X-Mailer: Apple Mail (2.1084)
Subject: ZFS resilvering strangles IO
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 08 May 2012 14:40:00 -0000

Hello,

I know I'm not the first one to ask this, but I couldn't find a =
definitive answers in previous threads.

I'm running a FreeBSD 9.0 RELEASE-p1 amd64 system, 8 x 1TB SATA2 drives =
(not SAS) and an LSI SAS 9211 controller in IT mode (HBAs, da0-da7). =
Zpool version 28, raidz2 container. Machine has 4GB of RAM, therefore =
ZFS prefetch is disabled. No manual tuning of ZFS options. Pool contains =
about 1TB of data right now (so about 25% full). In normal operations =
the pool shows excellent performance. Yesterday I had to replace a =
drive, so resilvering started. The resilver process took about 15 hours =
- which seems a little bit slow to me, but whatever - what really struck =
me was that during resilvering the pool performance got really bad. Read =
performance was acceptable, but write performance got down to 500kb/s =
(for almost all of the 15 hours). After resilvering finished, system =
performance returned to normal.

Fortunately this is a backup server and no full backups were scheduled, =
so no drama, but I really don't want to have to replace a drive in a =
database (or other high IO) server this way (I would have been forced to =
offline the drive somehow and migrate data to another server).

So the question is, is there anything I can do to improve the situation? =
Is this because of memory constraints? Are there any other knobs to =
adjust? As far as I know zfs_resilver_delay can't be changed in FreeBSD =
yet.

I have more drives around, so I could replace another one in the server, =
just to replicate the exact situation.

Cheers,
Michael

Disk layout:

daXp1128 boot
daXp2 16G frebsd-swap
daXp3 915G freebsd-zfs


Zpool status during resilvering:

[root@backup /tmp]# zpool status -v
  pool: tank
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool =
will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scan: resilver in progress since Mon May  7 20:18:34 2012
    249G scanned out of 908G at 18.2M/s, 10h17m to go
    31.2G resilvered, 27.46% done
config:

        NAME                        STATE     READ WRITE CKSUM
        tank                        DEGRADED     0     0     0
          raidz2-0                  DEGRADED     0     0     0
            replacing-0             REMOVED      0     0     0
              15364271088212071398  REMOVED      0     0     0  was
/dev/da0p3/old
              da0p3                 ONLINE       0     0     0
(resilvering)
            da1p3                   ONLINE       0     0     0
            da2p3                   ONLINE       0     0     0
            da3p3                   ONLINE       0     0     0
            da4p3                   ONLINE       0     0     0
            da5p3                   ONLINE       0     0     0
            da6p3                   ONLINE       0     0     0
            da7p3                   ONLINE       0     0     0

errors: No known data errors

Zpool status later in the process:
root@backup /tmp]# zpool status
  pool: tank
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool =
will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scan: resilver in progress since Mon May  7 20:18:34 2012
    833G scanned out of 908G at 19.1M/s, 1h7m to go
    104G resilvered, 91.70% done
config:

        NAME                        STATE     READ WRITE CKSUM
        tank                        DEGRADED     0     0     0
          raidz2-0                  DEGRADED     0     0     0
            replacing-0             REMOVED      0     0     0
              15364271088212071398  REMOVED      0     0     0  was
/dev/da0p3/old
              da0p3                 ONLINE       0     0     0
(resilvering)
            da1p3                   ONLINE       0     0     0
            da2p3                   ONLINE       0     0     0
            da3p3                   ONLINE       0     0     0
            da4p3                   ONLINE       0     0     0
            da5p3                   ONLINE       0     0     0
            da6p3                   ONLINE       0     0     0
            da7p3                   ONLINE       0     0     0

errors: No known data errors


Zpool status after resilvering finished:
root@backup /]# zpool status
  pool: tank
 state: ONLINE
 scan: resilvered 113G in 14h54m with 0 errors on Tue May  8 11:13:31 =
2012
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz2-0  ONLINE       0     0     0
            da0p3   ONLINE       0     0     0
            da1p3   ONLINE       0     0     0
            da2p3   ONLINE       0     0     0
            da3p3   ONLINE       0     0     0
            da4p3   ONLINE       0     0     0
            da5p3   ONLINE       0     0     0
            da6p3   ONLINE       0     0     0
            da7p3   ONLINE       0     0     0

errors: No known data errors


From owner-freebsd-fs@FreeBSD.ORG  Tue May  8 14:58:34 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id E25B1106564A
	for <freebsd-fs@freebsd.org>; Tue,  8 May 2012 14:58:34 +0000 (UTC)
	(envelope-from tevans.uk@googlemail.com)
Received: from mail-qa0-f47.google.com (mail-qa0-f47.google.com
	[209.85.216.47])
	by mx1.freebsd.org (Postfix) with ESMTP id 9A7098FC16
	for <freebsd-fs@freebsd.org>; Tue,  8 May 2012 14:58:34 +0000 (UTC)
Received: by qabg1 with SMTP id g1so629457qab.13
	for <freebsd-fs@freebsd.org>; Tue, 08 May 2012 07:58:33 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=googlemail.com; s=20120113;
	h=mime-version:in-reply-to:references:date:message-id:subject:from:to
	:cc:content-type;
	bh=M/KJKOHzKPTwW9o7tCJov0qBFeKC8BbkbOk8z7sGi7Y=;
	b=tIVF1vpw84Ehh72mcbO7kjPx5psF5Xf0Aen7A09jzVUyPwtYtckW79T9qCKsod7WTn
	6Pq5gYfwtwYR8PsKetnWo9mvKToRU8cqK316+CLHzFWWezIJcRZESWdjeeHsYaypC2ek
	jUjysewnz/fQYhydbta8EoHmL4+CXdSOUEY6aO/Ya7F0ua+Es1EVPgUrf3yuyJamcW1O
	eXT8J54EUDubJxytYCgpkxy9n2mo1qwCssfV5yxUX8iaHcWewi/4OEldSlVd0kJe/FbA
	WJmlHFLDcE+N7vx+ndcAqLZrO/ElhJLyO8w0+vgtN9SrJQ/T4NNyh/DrCbHlmIi4NV8Z
	XsGQ==
MIME-Version: 1.0
Received: by 10.220.150.12 with SMTP id w12mr7919764vcv.39.1336489113763; Tue,
	08 May 2012 07:58:33 -0700 (PDT)
Received: by 10.52.28.240 with HTTP; Tue, 8 May 2012 07:58:33 -0700 (PDT)
In-Reply-To: <73F8D020-04F3-44B2-97D4-F08E3B253C32@grem.de>
References: <73F8D020-04F3-44B2-97D4-F08E3B253C32@grem.de>
Date: Tue, 8 May 2012 15:58:33 +0100
Message-ID: <CAFHbX1K0--P-Sh0QdLszEs0V1ocWoe6Jp_SY9H+VJd1AQw2XKA@mail.gmail.com>
From: Tom Evans <tevans.uk@googlemail.com>
To: Michael Gmelin <freebsd@grem.de>
Content-Type: text/plain; charset=UTF-8
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS resilvering strangles IO
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 08 May 2012 14:58:35 -0000

On Tue, May 8, 2012 at 3:33 PM, Michael Gmelin <freebsd@grem.de> wrote:
> So the question is, is there anything I can do to improve the situation?
> Is this because of memory constraints? Are there any other knobs to
> adjust? As far as I know zfs_resilver_delay can't be changed in FreeBSD yet.
>
> I have more drives around, so I could replace another one in the server,
> just to replicate the exact situation.
>

In general, raidz is pretty fast, but when it's resilvering it is just
too busy. The first thing I would do to speed up writes is to add a
log device, preferably a SSD. Having a log device will allow the pool
to buffer writes to the pool much more effectively than normally
during a resilver.
Having lots of small writes will kill read speed during the resilver,
which is the critical thing.

If your workload would benefit, you could split the SSD down the
middle, use half for a log device, and half for a cache device to
accelerate reads.

I've never tried using a regular disk as a log device, I wonder if
that would speed up resilvering?

Cheers

Tom

From owner-freebsd-fs@FreeBSD.ORG  Tue May  8 17:23:57 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 966B1106564A
	for <freebsd-fs@freebsd.org>; Tue,  8 May 2012 17:23:57 +0000 (UTC)
	(envelope-from fjwcash@gmail.com)
Received: from mail-qa0-f49.google.com (mail-qa0-f49.google.com
	[209.85.216.49])
	by mx1.freebsd.org (Postfix) with ESMTP id 550B08FC17
	for <freebsd-fs@freebsd.org>; Tue,  8 May 2012 17:23:57 +0000 (UTC)
Received: by qabj40 with SMTP id j40so927946qab.15
	for <freebsd-fs@freebsd.org>; Tue, 08 May 2012 10:23:56 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
	h=mime-version:date:message-id:subject:from:to:content-type;
	bh=Izgdx5+DGlDSot6hV9UlqxX4rvEXJABTL61JDjXdtq0=;
	b=CUwtGxtva+g1VBaKZ0vWJusapIcwBeKXi/q7lKgeThB+EM+lm6gy147jZUIjSbDlR4
	RXNv6GBvEQZKDV61Q5k0yc/D2f+PRzGa2i/1bJIRLoARFzoOln1beaMtR8Iqrf1nNQpy
	qXNucSfhEpLDc8QV+9tBydXppOs4/21Te5bXPcYki0Mbh27zrtWmy9CGHfb/7+fMAfGF
	zVK8wvo/XKcUDjCkDXW+bjzpi0KatsbtWc0OhvIVqxrlJYRsKZLlAZ+1HAB/oCLOaKnh
	Uyv7kE0HBG8zj9yIAkQnXVCg//gyxOIm7gUwcM577Iw9Se5sDrryOv2DJMhwcr0Rlrs+
	qtVA==
MIME-Version: 1.0
Received: by 10.224.109.65 with SMTP id i1mr26705140qap.39.1336497836185; Tue,
	08 May 2012 10:23:56 -0700 (PDT)
Received: by 10.229.224.147 with HTTP; Tue, 8 May 2012 10:23:56 -0700 (PDT)
Date: Tue, 8 May 2012 10:23:56 -0700
Message-ID: <CAOjFWZ5OVG-ByhS_NDkF2-gbjUivGBxqLZZQaqMuLS-Q5ivDqA@mail.gmail.com>
From: Freddie Cash <fjwcash@gmail.com>
To: FreeBSD Filesystems <freebsd-fs@freebsd.org>
Content-Type: text/plain; charset=UTF-8
Subject: Broken ZFS filesystem
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 08 May 2012 17:23:57 -0000

I have an interesting issue with one single ZFS filesystem in a pool.
All the other filesystems are fine, and can be mounted, snapshoted,
destroyed, etc.  But this one filesystem, if I try to do any operation
on it (zfs mount, zfs snapshot, zfs destroy, zfs set <anything>), it
spins the system until all RAM is used up (wired), and then hangs the
box.  The zfs process sits in tx -> tx_sync_done_cv state until the
box locks up.  CTRL+T of the process only ever shows this:
    load: 0.46  cmd: zfs 3115 [tx->tx_sync_done_cv)] 36.63r 0.00u 0.00s 0% 2440k

Anyone come across anything similar?  And found a way to fix it, or to
destroy the filesystem?  Any suggestions on how to go about debugging
this?  Any magical zdb commands to use?

The filesystem only has 5 MB of data in it (log files), compressed via
LZJB for a compressratio of ~6x.  There are no snapshots for this
filesystem.

Dedupe is enabled on the pool and all filesystems.

System is running 64-bit 9-RELEASE:
FreeBSD alphadrive.sd73.bc.ca 9.0-RELEASE FreeBSD 9.0-RELEASE #0
r229803: Sun Jan  8 00:43:00 PST 2012
root@alphadrive.sd73.bc.ca:/usr/obj/usr/src/sys/ZFSHOST90  amd64

Hardware is fairly generic:
  - SuperMicro H8DGi-F motherboard
  - AMD Opteron 6128 CPU (8 cores)
  - 24 GB of DDR3 RAM
  - 3x SuperMicro AOC-USAS-L8i SATA controllers
  - 24x harddrives ranging from 500 GB to 2.0 TB (6 of each kind in
raidz2 vdevs)
  - 64 GB SSD partitioned for OS, swap, with 32 GB for L2ARC

Filesystem properties:
# zfs get all storage/logs/rsync
NAME                PROPERTY              VALUE                  SOURCE
storage/logs/rsync  type                  filesystem             -
storage/logs/rsync  creation              Tue May 10  9:55 2011  -
storage/logs/rsync  used                  5.48M                  -
storage/logs/rsync  available             4.61T                  -
storage/logs/rsync  referenced            5.48M                  -
storage/logs/rsync  compressratio         5.93x                  -
storage/logs/rsync  mounted               no                     -
storage/logs/rsync  quota                 none                   default
storage/logs/rsync  reservation           none                   default
storage/logs/rsync  recordsize            128K                   default
storage/logs/rsync  mountpoint            /var/log/rsync         local
storage/logs/rsync  sharenfs              off                    default
storage/logs/rsync  checksum              sha256
inherited from storage
storage/logs/rsync  compression           lzjb
inherited from storage
storage/logs/rsync  atime                 off
inherited from storage
storage/logs/rsync  devices               on                     default
storage/logs/rsync  exec                  on                     default
storage/logs/rsync  setuid                on                     default
storage/logs/rsync  readonly              off                    default
storage/logs/rsync  jailed                off                    default
storage/logs/rsync  snapdir               visible
inherited from storage
storage/logs/rsync  aclmode               discard                default
storage/logs/rsync  aclinherit            restricted             default
storage/logs/rsync  canmount              on                     default
storage/logs/rsync  xattr                 on                     default
storage/logs/rsync  copies                1                      default
storage/logs/rsync  version               5                      -
storage/logs/rsync  utf8only              off                    -
storage/logs/rsync  normalization         none                   -
storage/logs/rsync  casesensitivity       sensitive              -
storage/logs/rsync  vscan                 off                    default
storage/logs/rsync  nbmand                off                    default
storage/logs/rsync  sharesmb              off                    default
storage/logs/rsync  refquota              none                   default
storage/logs/rsync  refreservation        none                   default
storage/logs/rsync  primarycache          all
inherited from storage
storage/logs/rsync  secondarycache        metadata
inherited from storage
storage/logs/rsync  usedbysnapshots       0                      -
storage/logs/rsync  usedbydataset         5.48M                  -
storage/logs/rsync  usedbychildren        0                      -
storage/logs/rsync  usedbyrefreservation  0                      -
storage/logs/rsync  logbias               latency                default
storage/logs/rsync  dedup                 sha256
inherited from storage
storage/logs/rsync  mlslabel                                     -
storage/logs/rsync  sync                  standard               default
storage/logs/rsync  refcompressratio      5.93x

-- 
Freddie Cash
fjwcash@gmail.com

From owner-freebsd-fs@FreeBSD.ORG  Tue May  8 18:34:54 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 18DAD106566B;
	Tue,  8 May 2012 18:34:54 +0000 (UTC) (envelope-from avg@FreeBSD.org)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
	by mx1.freebsd.org (Postfix) with ESMTP id 01BA08FC14;
	Tue,  8 May 2012 18:34:52 +0000 (UTC)
Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua
	[212.40.38.100])
	by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id VAA17715;
	Tue, 08 May 2012 21:34:50 +0300 (EEST)
	(envelope-from avg@FreeBSD.org)
Received: from localhost ([127.0.0.1])
	by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD))
	id 1SRpFJ-000KeQ-MO; Tue, 08 May 2012 21:34:49 +0300
Message-ID: <4FA96747.3060106@FreeBSD.org>
Date: Tue, 08 May 2012 21:34:47 +0300
From: Andriy Gapon <avg@FreeBSD.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
	rv:12.0) Gecko/20120503 Thunderbird/12.0.1
MIME-Version: 1.0
To: John Baldwin <jhb@FreeBSD.org>
References: <4F8999D2.1080902@FreeBSD.org> <201205070953.04032.jhb@freebsd.org>
	<4FA7E069.8020208@FreeBSD.org> <201205071343.45955.jhb@freebsd.org>
	<4FA8C7E3.8070006@FreeBSD.org> <4FA92A88.2030000@FreeBSD.org>
In-Reply-To: <4FA92A88.2030000@FreeBSD.org>
X-Enigmail-Version: 1.5pre
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@FreeBSD.org, freebsd-hackers@FreeBSD.org
Subject: Re: [review request] zfsboot/zfsloader: support accessing
 filesystems within a pool
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 08 May 2012 18:34:54 -0000

on 08/05/2012 17:15 John Baldwin said the following:
> Bruce might even suggest adding a ba_ prefix to all the members of
> struct bootargs btw.  I would not be opposed, but you've already done
> a fair bit of work on this patch.

Thank you for sparing me :-)
So I hope to get busy committing this stuff soon.

-- 
Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG  Tue May  8 19:48:51 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 266B51065672
	for <freebsd-fs@freebsd.org>; Tue,  8 May 2012 19:48:51 +0000 (UTC)
	(envelope-from fjwcash@gmail.com)
Received: from mail-qa0-f47.google.com (mail-qa0-f47.google.com
	[209.85.216.47])
	by mx1.freebsd.org (Postfix) with ESMTP id CFE7B8FC1C
	for <freebsd-fs@freebsd.org>; Tue,  8 May 2012 19:48:50 +0000 (UTC)
Received: by qabg1 with SMTP id g1so1039569qab.13
	for <freebsd-fs@freebsd.org>; Tue, 08 May 2012 12:48:49 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
	h=mime-version:in-reply-to:references:date:message-id:subject:from:to
	:content-type:content-transfer-encoding;
	bh=KUrRoDLZSrl13CIwO7rk7VD7Z6w0lDhdLW1hcxxlLk8=;
	b=FjxacbXySHdP+5tvSEk0m5EdZj2JW7HXwsrP7MLyQlwCZxJTjVSW8Xmg29wvFEQdHz
	AK68g54ecPunp7FKALlhASNXHFGdADzFjifoXoU+65YuLIG+zR7jO1nNh5gczMeHqMiM
	QPSCyxvLf/Sqm3EsnlFjASdA5xBdoIAuuV6m4eMaUz+T1GbTktMNSx88H31VadjH8JAa
	8HCd8WwH7ruBYT1Pm4C/s+JVwBT4FF5YGzils6PZ4P9B6doBBd8bei/rN3FSIKivOZZi
	jbuKzGihV/S52wWJXtEP07FGi6Hr2E/Q7/U6MfZLXjyLIoVvAChw/Ay0UiikyRYkCQzo
	IsYw==
MIME-Version: 1.0
Received: by 10.224.73.1 with SMTP id o1mr619844qaj.43.1336506529316; Tue, 08
	May 2012 12:48:49 -0700 (PDT)
Received: by 10.229.224.147 with HTTP; Tue, 8 May 2012 12:48:49 -0700 (PDT)
In-Reply-To: <CAOjFWZ5OVG-ByhS_NDkF2-gbjUivGBxqLZZQaqMuLS-Q5ivDqA@mail.gmail.com>
References: <CAOjFWZ5OVG-ByhS_NDkF2-gbjUivGBxqLZZQaqMuLS-Q5ivDqA@mail.gmail.com>
Date: Tue, 8 May 2012 12:48:49 -0700
Message-ID: <CAOjFWZ4yvv3ffRXBKQec+nCjKfoWo6z2K8n+ARFKdaTgF62z_Q@mail.gmail.com>
From: Freddie Cash <fjwcash@gmail.com>
To: FreeBSD Filesystems <freebsd-fs@freebsd.org>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Subject: Re: Broken ZFS filesystem
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 08 May 2012 19:48:51 -0000

On Tue, May 8, 2012 at 10:23 AM, Freddie Cash <fjwcash@gmail.com> wrote:
> I have an interesting issue with one single ZFS filesystem in a pool.
> All the other filesystems are fine, and can be mounted, snapshoted,
> destroyed, etc. =C2=A0But this one filesystem, if I try to do any operati=
on
> on it (zfs mount, zfs snapshot, zfs destroy, zfs set <anything>), it
> spins the system until all RAM is used up (wired), and then hangs the
> box. =C2=A0The zfs process sits in tx -> tx_sync_done_cv state until the
> box locks up. =C2=A0CTRL+T of the process only ever shows this:
> =C2=A0 =C2=A0load: 0.46 =C2=A0cmd: zfs 3115 [tx->tx_sync_done_cv)] 36.63r=
 0.00u 0.00s 0% 2440k
>
> Anyone come across anything similar? =C2=A0And found a way to fix it, or =
to
> destroy the filesystem? =C2=A0Any suggestions on how to go about debuggin=
g
> this? =C2=A0Any magical zdb commands to use?
>
> The filesystem only has 5 MB of data in it (log files), compressed via
> LZJB for a compressratio of ~6x. =C2=A0There are no snapshots for this
> filesystem.
>
> Dedupe is enabled on the pool and all filesystems.
After more fiddling, testing, and experimenting, it all came down to
not enough RAM in the box to mount the 5 MB filesystem.  After
installing an extra 8 GB of RAM (32 GB total), everything mounted
correctly.  Took 27 GB of wired kernel memory (guessing ARC space) to
do it.

Unmount, mount, export, import, change properties all completed
successfully.  And the box is running correctly with 24 GB of RAM
again.

We'll be ordering more RAM for our ZFS boxes, now.  :)

--=20
Freddie Cash
fjwcash@gmail.com

From owner-freebsd-fs@FreeBSD.ORG  Tue May  8 20:02:32 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id DE7931065673
	for <freebsd-fs@freebsd.org>; Tue,  8 May 2012 20:02:32 +0000 (UTC)
	(envelope-from freebsd@grem.de)
Received: from mail.grem.de (outcast.grem.de [213.239.217.27])
	by mx1.freebsd.org (Postfix) with SMTP id 423FC8FC14
	for <freebsd-fs@freebsd.org>; Tue,  8 May 2012 20:02:31 +0000 (UTC)
Received: (qmail 96300 invoked by uid 89); 8 May 2012 20:02:29 -0000
Received: from unknown (HELO ?192.168.250.164?) (mg@grem.de@80.137.83.22)
	by mail.grem.de with ESMTPA; 8 May 2012 20:02:29 -0000
Mime-Version: 1.0 (Apple Message framework v1084)
Content-Type: text/plain; charset=us-ascii
From: Michael Gmelin <freebsd@grem.de>
In-Reply-To: <CAFHbX1K0--P-Sh0QdLszEs0V1ocWoe6Jp_SY9H+VJd1AQw2XKA@mail.gmail.com>
Date: Tue, 8 May 2012 22:02:29 +0200
Content-Transfer-Encoding: quoted-printable
Message-Id: <180B72CE-B285-4702-B16D-0714AA07022C@grem.de>
References: <73F8D020-04F3-44B2-97D4-F08E3B253C32@grem.de>
	<CAFHbX1K0--P-Sh0QdLszEs0V1ocWoe6Jp_SY9H+VJd1AQw2XKA@mail.gmail.com>
To: freebsd-fs@freebsd.org
X-Mailer: Apple Mail (2.1084)
Cc: Tom Evans <tevans.uk@googlemail.com>
Subject: Re: ZFS resilvering strangles IO
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 08 May 2012 20:02:32 -0000

On May 8, 2012, at 16:58, Tom Evans wrote:

> On Tue, May 8, 2012 at 3:33 PM, Michael Gmelin <freebsd@grem.de> =
wrote:
>> So the question is, is there anything I can do to improve the =
situation?
>> Is this because of memory constraints? Are there any other knobs to
>> adjust? As far as I know zfs_resilver_delay can't be changed in =
FreeBSD yet.
>>=20
>> I have more drives around, so I could replace another one in the =
server,
>> just to replicate the exact situation.
>>=20
>=20
> In general, raidz is pretty fast, but when it's resilvering it is just
> too busy. The first thing I would do to speed up writes is to add a
> log device, preferably a SSD. Having a log device will allow the pool
> to buffer writes to the pool much more effectively than normally
> during a resilver.
> Having lots of small writes will kill read speed during the resilver,
> which is the critical thing.
>=20
> If your workload would benefit, you could split the SSD down the
> middle, use half for a log device, and half for a cache device to
> accelerate reads.
>=20
> I've never tried using a regular disk as a log device, I wonder if
> that would speed up resilvering?
>=20
> Cheers
>=20
> Tom

Thanks for your constructive feedback. It would be interesting to see if =
adding an SSD could actually help in this case (it definitely would =
benefit the machine also during normal operation). Unfortunately it's =
not an option (the server is maxed out, there is simply no room to add a =
log device at the moment).

The general question remains - is there a way to make ZFS perform better =
during resilvering - has anybody experience tuning zfs_resilver_delay on =
Solaris and if this makes a difference (the variable is in the FreeBSD =
source code, but I couldn't find a way to change without touching the =
source)? - or is there something I missed that's specific about my =
setup. Especially in configurations using raidz2 and raidz3, that can =
withstand the loss of 2 or even 3 drives, having a longer resilver =
period shouldn't be an issue, as long as system performance is no =
degraded - or only degraded to a certain degree (I could see up to 50% =
more or less tolerable, in my case read performace was OKish, but write =
performance was reduced by more than 90%, so the machine was almost =
unusable).

Do you think it would make sense to try to play with zfs_resilver_delay =
directly in the ZFS kernel module?=20

(We have about 20 servers that could run ZFS around here, which =
currently run various combinations of UFS2+SU (no SUJ, since snapshots =
are broken currently), either on hardware RAID1 or some gmirror setup. I =
would like to standardize these setups to use ZFS, but I can't add =
logging devices to all of the for obvious reasons.)

I somehow feel that simulating this in a virtual machine is probably =
pointless :)

Cheers,
Michael


From owner-freebsd-fs@FreeBSD.ORG  Tue May  8 21:31:30 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 9EAD21065670
	for <freebsd-fs@freebsd.org>; Tue,  8 May 2012 21:31:30 +0000 (UTC)
	(envelope-from bfriesen@simple.dallas.tx.us)
Received: from blade.simplesystems.org (blade.simplesystems.org [65.66.246.74])
	by mx1.freebsd.org (Postfix) with ESMTP id 558D28FC16
	for <freebsd-fs@freebsd.org>; Tue,  8 May 2012 21:31:30 +0000 (UTC)
Received: from freddy.simplesystems.org (freddy.simplesystems.org
	[65.66.246.65])
	by blade.simplesystems.org (8.14.4+Sun/8.14.4) with ESMTP id
	q48LVS8W028606; Tue, 8 May 2012 16:31:29 -0500 (CDT)
Date: Tue, 8 May 2012 16:31:28 -0500 (CDT)
From: Bob Friesenhahn <bfriesen@simple.dallas.tx.us>
X-X-Sender: bfriesen@freddy.simplesystems.org
To: Michael Gmelin <freebsd@grem.de>
In-Reply-To: <180B72CE-B285-4702-B16D-0714AA07022C@grem.de>
Message-ID: <alpine.GSO.2.01.1205081625470.9406@freddy.simplesystems.org>
References: <73F8D020-04F3-44B2-97D4-F08E3B253C32@grem.de>
	<CAFHbX1K0--P-Sh0QdLszEs0V1ocWoe6Jp_SY9H+VJd1AQw2XKA@mail.gmail.com>
	<180B72CE-B285-4702-B16D-0714AA07022C@grem.de>
User-Agent: Alpine 2.01 (GSO 1266 2009-07-14)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2
	(blade.simplesystems.org [65.66.246.90]);
	Tue, 08 May 2012 16:31:29 -0500 (CDT)
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS resilvering strangles IO
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 08 May 2012 21:31:30 -0000

On Tue, 8 May 2012, Michael Gmelin wrote:
>
> Do you think it would make sense to try to play with zfs_resilver_delay directly in the ZFS kernel module?

This may be the wrong approach if the issue is really that there are 
too many I/Os queued for the device.  Finding a tunable which reduces 
the maximum number of I/Os queued for a disk device may help reduce 
write latencies by limiting the backlog.

On my Solaris 10 system, I accomplished this via a tunable in 
/etc/system:
set zfs:zfs_vdev_max_pending = 5

What is the equivalent for FreeBSD?

Bob
-- 
Bob Friesenhahn
bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

From owner-freebsd-fs@FreeBSD.ORG  Tue May  8 21:33:24 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id F1684106564A
	for <freebsd-fs@freebsd.org>; Tue,  8 May 2012 21:33:24 +0000 (UTC)
	(envelope-from fjwcash@gmail.com)
Received: from mail-qa0-f47.google.com (mail-qa0-f47.google.com
	[209.85.216.47])
	by mx1.freebsd.org (Postfix) with ESMTP id A8D5C8FC08
	for <freebsd-fs@freebsd.org>; Tue,  8 May 2012 21:33:24 +0000 (UTC)
Received: by qabg1 with SMTP id g1so1160854qab.13
	for <freebsd-fs@freebsd.org>; Tue, 08 May 2012 14:33:24 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
	h=mime-version:in-reply-to:references:date:message-id:subject:from:to
	:cc:content-type:content-transfer-encoding;
	bh=5k1em8eZqrWYkKiV39ucmQEHlrRO1LxzE/7T00pGiBI=;
	b=GWJEM3ebZfcZOqdP0xJnRmOVMp//i82WkU7ETSgvh6dybgM4uhMwaz9q7mveybRsOG
	X/4t7AX0pYDdIefM6h8e+uwIj36Io+CwSAab5Rnqtiskk8QmWN1fxet3oSupXXj2+pur
	zi/kxxwDoemBYKa2AYMBJwFoX2Rh5f5F/4jgqi3Vhh1ZXSgftxzrTybY0aNikCD1s2Kt
	c2RIRlasu6m6KS93vCmIkQ7S39Vqhk+RuamhUmKPdR8Dyb97ZR38HUHKvANioTjB0cuU
	rkBc7ZpZPBOnhzKwI/tfHn9fEUpWC+xTrA43FZ4if/5wEHppp2DMh8ROq44XopDjHL/F
	nK5g==
MIME-Version: 1.0
Received: by 10.224.178.9 with SMTP id bk9mr932141qab.98.1336512804006; Tue,
	08 May 2012 14:33:24 -0700 (PDT)
Received: by 10.229.224.147 with HTTP; Tue, 8 May 2012 14:33:23 -0700 (PDT)
In-Reply-To: <alpine.GSO.2.01.1205081625470.9406@freddy.simplesystems.org>
References: <73F8D020-04F3-44B2-97D4-F08E3B253C32@grem.de>
	<CAFHbX1K0--P-Sh0QdLszEs0V1ocWoe6Jp_SY9H+VJd1AQw2XKA@mail.gmail.com>
	<180B72CE-B285-4702-B16D-0714AA07022C@grem.de>
	<alpine.GSO.2.01.1205081625470.9406@freddy.simplesystems.org>
Date: Tue, 8 May 2012 14:33:23 -0700
Message-ID: <CAOjFWZ7ik_sUmUaw4im729dc-2Toq2j_z_oxiqUpzc4x_TOujQ@mail.gmail.com>
From: Freddie Cash <fjwcash@gmail.com>
To: Bob Friesenhahn <bfriesen@simple.dallas.tx.us>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Cc: freebsd-fs@freebsd.org, Michael Gmelin <freebsd@grem.de>
Subject: Re: ZFS resilvering strangles IO
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 08 May 2012 21:33:25 -0000

On Tue, May 8, 2012 at 2:31 PM, Bob Friesenhahn
<bfriesen@simple.dallas.tx.us> wrote:
> On Tue, 8 May 2012, Michael Gmelin wrote:
>>
>> Do you think it would make sense to try to play with zfs_resilver_delay
>> directly in the ZFS kernel module?
>
> This may be the wrong approach if the issue is really that there are too
> many I/Os queued for the device. =C2=A0Finding a tunable which reduces th=
e
> maximum number of I/Os queued for a disk device may help reduce write
> latencies by limiting the backlog.
>
> On my Solaris 10 system, I accomplished this via a tunable in /etc/system=
:
> set zfs:zfs_vdev_max_pending =3D 5
>
> What is the equivalent for FreeBSD?

Setting vfs.zfs.vdev_max_pending=3D"4" in /boot/loader.conf (or whatever
value you want).  The default is 10.

> Bob
> --
> Bob Friesenhahn
> bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen=
/
> GraphicsMagick Maintainer, =C2=A0 =C2=A0http://www.GraphicsMagick.org/
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"


--=20
Freddie Cash
fjwcash@gmail.com

From owner-freebsd-fs@FreeBSD.ORG  Tue May  8 22:06:27 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id EC82D106566B
	for <freebsd-fs@freebsd.org>; Tue,  8 May 2012 22:06:27 +0000 (UTC)
	(envelope-from artemb@gmail.com)
Received: from mail-lpp01m010-f54.google.com (mail-lpp01m010-f54.google.com
	[209.85.215.54])
	by mx1.freebsd.org (Postfix) with ESMTP id 643228FC0C
	for <freebsd-fs@freebsd.org>; Tue,  8 May 2012 22:06:27 +0000 (UTC)
Received: by lagv3 with SMTP id v3so6321915lag.13
	for <freebsd-fs@freebsd.org>; Tue, 08 May 2012 15:06:26 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
	h=mime-version:sender:in-reply-to:references:date
	:x-google-sender-auth:message-id:subject:from:to:cc:content-type
	:content-transfer-encoding;
	bh=w1F8BTxAYJxw69oxrPuziXBpWnSPa3wiBEW78Qs3Bso=;
	b=Y45u1wrfOPcqT7rIXL4OkPDldQBVzAHLSc+SMULWEx7akWdLk5+EznUsYoF3BVV36l
	Jyr/D2LFCNk+OpuWdjTNFY+N9LMa/CxHKUo68BzwUHR0E8yYpLyXjImxuTsZ+MK/RvVB
	3RgW/sPKs4odTAeCv34xLYda5xHwulcqvoIaHnUQYxYvq/J3Dze8dA2eaWWWMqpYlEs0
	RLKiVERTlgGWheyKj9GcWYDu7ty9j9vah9WBiZNkq8P/CGpa6fZMgyNlkZ3KoOo/SfPs
	3W5hrtFNUsSYww98/zR8L4B2eAK91UGSDlKz6a3m5psHaBZYR6XbmmEiMm5vDM51j+UF
	C8Vw==
MIME-Version: 1.0
Received: by 10.112.44.129 with SMTP id e1mr2406510lbm.44.1336514786032; Tue,
	08 May 2012 15:06:26 -0700 (PDT)
Sender: artemb@gmail.com
Received: by 10.112.2.5 with HTTP; Tue, 8 May 2012 15:06:25 -0700 (PDT)
In-Reply-To: <CAOjFWZ7ik_sUmUaw4im729dc-2Toq2j_z_oxiqUpzc4x_TOujQ@mail.gmail.com>
References: <73F8D020-04F3-44B2-97D4-F08E3B253C32@grem.de>
	<CAFHbX1K0--P-Sh0QdLszEs0V1ocWoe6Jp_SY9H+VJd1AQw2XKA@mail.gmail.com>
	<180B72CE-B285-4702-B16D-0714AA07022C@grem.de>
	<alpine.GSO.2.01.1205081625470.9406@freddy.simplesystems.org>
	<CAOjFWZ7ik_sUmUaw4im729dc-2Toq2j_z_oxiqUpzc4x_TOujQ@mail.gmail.com>
Date: Tue, 8 May 2012 15:06:25 -0700
X-Google-Sender-Auth: MVSkhVa7_2YNJMurZZ8hTH_-SsQ
Message-ID: <CAFqOu6hxww5a1CLwYOZZcZNkJVhwH2eUXmtJKNwm6ohNmcqP0Q@mail.gmail.com>
From: Artem Belevich <art@freebsd.org>
To: Freddie Cash <fjwcash@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Cc: freebsd-fs@freebsd.org, Michael Gmelin <freebsd@grem.de>
Subject: Re: ZFS resilvering strangles IO
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 08 May 2012 22:06:28 -0000

On Tue, May 8, 2012 at 2:33 PM, Freddie Cash <fjwcash@gmail.com> wrote:
> On Tue, May 8, 2012 at 2:31 PM, Bob Friesenhahn
> <bfriesen@simple.dallas.tx.us> wrote:
>> On Tue, 8 May 2012, Michael Gmelin wrote:
>>>
>>> Do you think it would make sense to try to play with zfs_resilver_delay
>>> directly in the ZFS kernel module?
>>
>> This may be the wrong approach if the issue is really that there are too
>> many I/Os queued for the device. =A0Finding a tunable which reduces the
>> maximum number of I/Os queued for a disk device may help reduce write
>> latencies by limiting the backlog.
>>
>> On my Solaris 10 system, I accomplished this via a tunable in /etc/syste=
m:
>> set zfs:zfs_vdev_max_pending =3D 5
>>
>> What is the equivalent for FreeBSD?
>
> Setting vfs.zfs.vdev_max_pending=3D"4" in /boot/loader.conf (or whatever
> value you want). =A0The default is 10.

You may also want to look at vfs.zfs.scrub_limit sysctl. According to
description it's "Maximum scrub/resilver I/O queue" which sounds like
something that may help in this case.

--Artem

From owner-freebsd-fs@FreeBSD.ORG  Tue May  8 22:15:36 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 7E2911065732
	for <freebsd-fs@freebsd.org>; Tue,  8 May 2012 22:15:36 +0000 (UTC)
	(envelope-from freebsd@grem.de)
Received: from mail.grem.de (outcast.grem.de [213.239.217.27])
	by mx1.freebsd.org (Postfix) with SMTP id D14F28FC21
	for <freebsd-fs@freebsd.org>; Tue,  8 May 2012 22:15:35 +0000 (UTC)
Received: (qmail 97884 invoked by uid 89); 8 May 2012 22:15:34 -0000
Received: from unknown (HELO ?192.168.250.164?) (mg@grem.de@80.137.83.22)
	by mail.grem.de with ESMTPA; 8 May 2012 22:15:34 -0000
Mime-Version: 1.0 (Apple Message framework v1084)
Content-Type: text/plain; charset=us-ascii
From: Michael Gmelin <freebsd@grem.de>
In-Reply-To: <CAFqOu6hxww5a1CLwYOZZcZNkJVhwH2eUXmtJKNwm6ohNmcqP0Q@mail.gmail.com>
Date: Wed, 9 May 2012 00:15:32 +0200
Content-Transfer-Encoding: quoted-printable
Message-Id: <44759017-6FAC-4982-B382-CE17DED83262@grem.de>
References: <73F8D020-04F3-44B2-97D4-F08E3B253C32@grem.de>
	<CAFHbX1K0--P-Sh0QdLszEs0V1ocWoe6Jp_SY9H+VJd1AQw2XKA@mail.gmail.com>
	<180B72CE-B285-4702-B16D-0714AA07022C@grem.de>
	<alpine.GSO.2.01.1205081625470.9406@freddy.simplesystems.org>
	<CAOjFWZ7ik_sUmUaw4im729dc-2Toq2j_z_oxiqUpzc4x_TOujQ@mail.gmail.com>
	<CAFqOu6hxww5a1CLwYOZZcZNkJVhwH2eUXmtJKNwm6ohNmcqP0Q@mail.gmail.com>
To: freebsd-fs@freebsd.org
X-Mailer: Apple Mail (2.1084)
Cc: 
Subject: Re: ZFS resilvering strangles IO
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 08 May 2012 22:15:36 -0000

On May 9, 2012, at 00:06, Artem Belevich wrote:

> On Tue, May 8, 2012 at 2:33 PM, Freddie Cash <fjwcash@gmail.com> =
wrote:
>> On Tue, May 8, 2012 at 2:31 PM, Bob Friesenhahn
>> <bfriesen@simple.dallas.tx.us> wrote:
>>> On Tue, 8 May 2012, Michael Gmelin wrote:
>>>>=20
>>>> Do you think it would make sense to try to play with =
zfs_resilver_delay
>>>> directly in the ZFS kernel module?
>>>=20
>>> This may be the wrong approach if the issue is really that there are =
too
>>> many I/Os queued for the device.  Finding a tunable which reduces =
the
>>> maximum number of I/Os queued for a disk device may help reduce =
write
>>> latencies by limiting the backlog.
>>>=20
>>> On my Solaris 10 system, I accomplished this via a tunable in =
/etc/system:
>>> set zfs:zfs_vdev_max_pending =3D 5
>>>=20
>>> What is the equivalent for FreeBSD?
>>=20
>> Setting vfs.zfs.vdev_max_pending=3D"4" in /boot/loader.conf (or =
whatever
>> value you want).  The default is 10.
>=20

Do you think this will actually make a difference. As far as I
understand my primary problem is not latency but throughput. Simple
example is dd if=3D/dev/zero of=3Dfilename bs=3D1m, which gave me =
500kb/s.
Latency might be an additional problem (or am I mislead and a shorter
queue would raise the processes chance to get data through?).

> You may also want to look at vfs.zfs.scrub_limit sysctl. According to
> description it's "Maximum scrub/resilver I/O queue" which sounds like
> something that may help in this case.
>=20
> --Artem

Very good point, thank you. I also found this entry in the FreeBSD
forums indicating that this might ease the pain (even though he's also
talking about scrub, not resilver, hopefully the tunable does both as
indicated in the comments):

http://forums.freebsd.org/showthread.php?t=3D31628

/* maximum scrub/resilver I/O queue per leaf vdev */ int
zfs_scrub_limit =3D 10;

TUNABLE_INT("vfs.zfs.scrub_limit", &zfs_scrub_limit);
SYSCTL_INT(_vfs_zfs, OID_AUTO, scrub_limit, CTLFLAG_RDTUN,
&zfs_scrub_limit, 0, "Maximum scrub/resilver I/O queue");   =20

I will try lowering the value zfs_scrub_limit to 6 in loader.conf
and replace the drive once more later this month.

--=20
Michael


From owner-freebsd-fs@FreeBSD.ORG  Tue May  8 22:42:15 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 81B72106566C
	for <freebsd-fs@freebsd.org>; Tue,  8 May 2012 22:42:15 +0000 (UTC)
	(envelope-from bfriesen@simple.dallas.tx.us)
Received: from blade.simplesystems.org (blade.simplesystems.org [65.66.246.74])
	by mx1.freebsd.org (Postfix) with ESMTP id 326548FC0C
	for <freebsd-fs@freebsd.org>; Tue,  8 May 2012 22:42:15 +0000 (UTC)
Received: from freddy.simplesystems.org (freddy.simplesystems.org
	[65.66.246.65])
	by blade.simplesystems.org (8.14.4+Sun/8.14.4) with ESMTP id
	q48MgDL5028777; Tue, 8 May 2012 17:42:14 -0500 (CDT)
Date: Tue, 8 May 2012 17:42:13 -0500 (CDT)
From: Bob Friesenhahn <bfriesen@simple.dallas.tx.us>
X-X-Sender: bfriesen@freddy.simplesystems.org
To: Michael Gmelin <freebsd@grem.de>
In-Reply-To: <44759017-6FAC-4982-B382-CE17DED83262@grem.de>
Message-ID: <alpine.GSO.2.01.1205081732090.9406@freddy.simplesystems.org>
References: <73F8D020-04F3-44B2-97D4-F08E3B253C32@grem.de>
	<CAFHbX1K0--P-Sh0QdLszEs0V1ocWoe6Jp_SY9H+VJd1AQw2XKA@mail.gmail.com>
	<180B72CE-B285-4702-B16D-0714AA07022C@grem.de>
	<alpine.GSO.2.01.1205081625470.9406@freddy.simplesystems.org>
	<CAOjFWZ7ik_sUmUaw4im729dc-2Toq2j_z_oxiqUpzc4x_TOujQ@mail.gmail.com>
	<CAFqOu6hxww5a1CLwYOZZcZNkJVhwH2eUXmtJKNwm6ohNmcqP0Q@mail.gmail.com>
	<44759017-6FAC-4982-B382-CE17DED83262@grem.de>
User-Agent: Alpine 2.01 (GSO 1266 2009-07-14)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2
	(blade.simplesystems.org [65.66.246.90]);
	Tue, 08 May 2012 17:42:14 -0500 (CDT)
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS resilvering strangles IO
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 08 May 2012 22:42:15 -0000

On Wed, 9 May 2012, Michael Gmelin wrote:
>>>
>>> Setting vfs.zfs.vdev_max_pending="4" in /boot/loader.conf (or whatever
>>> value you want).  The default is 10.
>
> Do you think this will actually make a difference. As far as I
> understand my primary problem is not latency but throughput. Simple
> example is dd if=/dev/zero of=filename bs=1m, which gave me 500kb/s.
> Latency might be an additional problem (or am I mislead and a shorter
> queue would raise the processes chance to get data through?).

The effect may be observed in real-time on a running system.  Latency 
and throughput go hand in hand.  The 'dd' command is not threaded and 
is sequential.  It waits for the current I/O to return before it 
starts the next one.  If the wait is shorter (fewer pending requests 
in line), then throughput does increase. System total throughput 
(which includes the resilver operations) may not increase but the 
throughput observed by an individual waiter may increase.

The default for vdev_max_pending on Solaris was/is 32.  If FreeBSD 
uses a default of 10 then reducing from the default may be less 
dramatic.

Bob
-- 
Bob Friesenhahn
bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

From owner-freebsd-fs@FreeBSD.ORG  Tue May  8 22:48:25 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id B5E031065680
	for <freebsd-fs@freebsd.org>; Tue,  8 May 2012 22:48:25 +0000 (UTC)
	(envelope-from freebsd@grem.de)
Received: from mail.grem.de (outcast.grem.de [213.239.217.27])
	by mx1.freebsd.org (Postfix) with SMTP id 1486F8FC16
	for <freebsd-fs@freebsd.org>; Tue,  8 May 2012 22:48:24 +0000 (UTC)
Received: (qmail 98385 invoked by uid 89); 8 May 2012 22:48:23 -0000
Received: from unknown (HELO ?192.168.250.164?) (mg@grem.de@80.137.83.22)
	by mail.grem.de with ESMTPA; 8 May 2012 22:48:23 -0000
Mime-Version: 1.0 (Apple Message framework v1084)
Content-Type: text/plain; charset=us-ascii
From: Michael Gmelin <freebsd@grem.de>
In-Reply-To: <alpine.GSO.2.01.1205081732090.9406@freddy.simplesystems.org>
Date: Wed, 9 May 2012 00:48:22 +0200
Content-Transfer-Encoding: quoted-printable
Message-Id: <1CEFF50A-4CD5-4947-8A38-2EEAE3311E67@grem.de>
References: <73F8D020-04F3-44B2-97D4-F08E3B253C32@grem.de>
	<CAFHbX1K0--P-Sh0QdLszEs0V1ocWoe6Jp_SY9H+VJd1AQw2XKA@mail.gmail.com>
	<180B72CE-B285-4702-B16D-0714AA07022C@grem.de>
	<alpine.GSO.2.01.1205081625470.9406@freddy.simplesystems.org>
	<CAOjFWZ7ik_sUmUaw4im729dc-2Toq2j_z_oxiqUpzc4x_TOujQ@mail.gmail.com>
	<CAFqOu6hxww5a1CLwYOZZcZNkJVhwH2eUXmtJKNwm6ohNmcqP0Q@mail.gmail.com>
	<44759017-6FAC-4982-B382-CE17DED83262@grem.de>
	<alpine.GSO.2.01.1205081732090.9406@freddy.simplesystems.org>
To: freebsd-fs@freebsd.org
X-Mailer: Apple Mail (2.1084)
Cc: 
Subject: Re: ZFS resilvering strangles IO
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 08 May 2012 22:48:25 -0000

On May 9, 2012, at 00:42, Bob Friesenhahn wrote:

> On Wed, 9 May 2012, Michael Gmelin wrote:
>>>>=20
>>>> Setting vfs.zfs.vdev_max_pending=3D"4" in /boot/loader.conf (or =
whatever
>>>> value you want).  The default is 10.
>>=20
>> Do you think this will actually make a difference. As far as I
>> understand my primary problem is not latency but throughput. Simple
>> example is dd if=3D/dev/zero of=3Dfilename bs=3D1m, which gave me =
500kb/s.
>> Latency might be an additional problem (or am I mislead and a shorter
>> queue would raise the processes chance to get data through?).
>=20
> The effect may be observed in real-time on a running system.  Latency =
and throughput go hand in hand.  The 'dd' command is not threaded and is =
sequential.  It waits for the current I/O to return before it starts the =
next one.  If the wait is shorter (fewer pending requests in line), then =
throughput does increase. System total throughput (which includes the =
resilver operations) may not increase but the throughput observed by an =
individual waiter may increase.
>=20
> The default for vdev_max_pending on Solaris was/is 32.  If FreeBSD =
uses a default of 10 then reducing from the default may be less =
dramatic.
>=20

That makes sense.

I will run more sophisticated I/O tests next time to get a more
complete picture.

--=20
Michael

> Bob
> --=20
> Bob Friesenhahn
> bfriesen@simple.dallas.tx.us, =
http://www.simplesystems.org/users/bfriesen/
> GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/


From owner-freebsd-fs@FreeBSD.ORG  Wed May  9 06:55:27 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id D70B8106566C
	for <freebsd-fs@freebsd.org>; Wed,  9 May 2012 06:55:27 +0000 (UTC)
	(envelope-from peter.maloney@brockmann-consult.de)
Received: from moutng.kundenserver.de (moutng.kundenserver.de
	[212.227.126.187])
	by mx1.freebsd.org (Postfix) with ESMTP id 809228FC0A
	for <freebsd-fs@freebsd.org>; Wed,  9 May 2012 06:55:27 +0000 (UTC)
Received: from [10.3.0.26] ([141.4.215.32])
	by mrelayeu.kundenserver.de (node=mreu3) with ESMTP (Nemesis)
	id 0M5mBh-1SD5NZ2JyL-00xmMP; Wed, 09 May 2012 08:55:20 +0200
Message-ID: <4FAA14D6.8060302@brockmann-consult.de>
Date: Wed, 09 May 2012 08:55:18 +0200
From: Peter Maloney <peter.maloney@brockmann-consult.de>
User-Agent: Mozilla/5.0 (X11; Linux x86_64;
	rv:11.0) Gecko/20120312 Thunderbird/11.0
MIME-Version: 1.0
To: freebsd-fs@freebsd.org
References: <73F8D020-04F3-44B2-97D4-F08E3B253C32@grem.de>
In-Reply-To: <73F8D020-04F3-44B2-97D4-F08E3B253C32@grem.de>
X-Enigmail-Version: 1.4.1
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Provags-ID: V02:K0:QNorriCnYF1BJDt2p63oj2R2nLGavWxLJayIVvg37lG
	eGGpCELqNz8+vNe2YPkmYYri8iAA/HyCDlURhkWA5JFSMSg4QK
	+8mpxnpJ/qynYYo4e+RfkuqXtsdYJ8RlslT+GzOG5B7umFH/pF
	7x0+ck5l+zBF9+dj4cRMK9POoKfKQj+GOgVpY08wEhqQdgVUQ3
	Jp3JDMBDSUtwBlXUrh/wawGXjDL4fMk83v6r5Q8E9LhPQfJ9wm
	oatc8CrsSsFou7heYAh4yUY8vdkOtwTgKu9EbUTyR5Dkt0cmMZ
	tAms1K8xu6AQofehyjw0Lg5/fsoOLu6FuupzygZqGeWrgWuU/U
	Wv6Hy6unCRDv4cYHiJF6YCZWIfTs2GKJaH6gPq60r
Subject: Re: ZFS resilvering strangles IO
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 09 May 2012 06:55:27 -0000

About the slow performance during resilver,

Are they consumer disks? If so, one guess is you have a bad disk. Check
by looking at load and ms per x on disks. If one is high and others are
low, then it's probably bad. If a single 'good' disk is bad, the whole
thing will run very slow. Bad consumer disks run very slow trying over
and over to read the not-yet-bad sectors where enterprise disks would
throw errors and fail.

My other guess is that this is because FreeBSD, unlike Linux and
Solaris, lacks IO scheduling. So there is no way for the zfs code to
truly put the resilver on lower priority than the regular production
applications. I've read that IO scheduling was developed for 8.2, but
never officially adopted. I would love to see it in FreeBSD... I use
"ionice" on Linux all the time (for copying, backups, zipping,
installing a a huge batch of packages [noticeable >300 MB], etc. while I
work on other things), so I miss it. IO scheduling on Solaris also helps
with dedup performance.

Does anyone know if there is a movement to add the IO scheduling code
into the base system?


On 05/08/2012 04:33 PM, Michael Gmelin wrote:
> Hello,
>
> I know I'm not the first one to ask this, but I couldn't find a definitive answers in previous threads.
>
> I'm running a FreeBSD 9.0 RELEASE-p1 amd64 system, 8 x 1TB SATA2 drives (not SAS) and an LSI SAS 9211 controller in IT mode (HBAs, da0-da7). Zpool version 28, raidz2 container. Machine has 4GB of RAM, therefore ZFS prefetch is disabled. No manual tuning of ZFS options. Pool contains about 1TB of data right now (so about 25% full). In normal operations the pool shows excellent performance. Yesterday I had to replace a drive, so resilvering started. The resilver process took about 15 hours - which seems a little bit slow to me, but whatever - what really struck me was that during resilvering the pool performance got really bad. Read performance was acceptable, but write performance got down to 500kb/s (for almost all of the 15 hours). After resilvering finished, system performance returned to normal.
>
> Fortunately this is a backup server and no full backups were scheduled, so no drama, but I really don't want to have to replace a drive in a database (or other high IO) server this way (I would have been forced to offline the drive somehow and migrate data to another server).
>
> So the question is, is there anything I can do to improve the situation? Is this because of memory constraints? Are there any other knobs to adjust? As far as I know zfs_resilver_delay can't be changed in FreeBSD yet.
>
> I have more drives around, so I could replace another one in the server, just to replicate the exact situation.
>
> Cheers,
> Michael
>
> Disk layout:
>
> daXp1128 boot
> daXp2 16G frebsd-swap
> daXp3 915G freebsd-zfs
>
>
> Zpool status during resilvering:
>
> [root@backup /tmp]# zpool status -v
>   pool: tank
>  state: DEGRADED
> status: One or more devices is currently being resilvered.  The pool will
>         continue to function, possibly in a degraded state.
> action: Wait for the resilver to complete.
>  scan: resilver in progress since Mon May  7 20:18:34 2012
>     249G scanned out of 908G at 18.2M/s, 10h17m to go
>     31.2G resilvered, 27.46% done
> config:
>
>         NAME                        STATE     READ WRITE CKSUM
>         tank                        DEGRADED     0     0     0
>           raidz2-0                  DEGRADED     0     0     0
>             replacing-0             REMOVED      0     0     0
>               15364271088212071398  REMOVED      0     0     0  was
> /dev/da0p3/old
>               da0p3                 ONLINE       0     0     0
> (resilvering)
>             da1p3                   ONLINE       0     0     0
>             da2p3                   ONLINE       0     0     0
>             da3p3                   ONLINE       0     0     0
>             da4p3                   ONLINE       0     0     0
>             da5p3                   ONLINE       0     0     0
>             da6p3                   ONLINE       0     0     0
>             da7p3                   ONLINE       0     0     0
>
> errors: No known data errors
>
> Zpool status later in the process:
> root@backup /tmp]# zpool status
>   pool: tank
>  state: DEGRADED
> status: One or more devices is currently being resilvered.  The pool will
>         continue to function, possibly in a degraded state.
> action: Wait for the resilver to complete.
>  scan: resilver in progress since Mon May  7 20:18:34 2012
>     833G scanned out of 908G at 19.1M/s, 1h7m to go
>     104G resilvered, 91.70% done
> config:
>
>         NAME                        STATE     READ WRITE CKSUM
>         tank                        DEGRADED     0     0     0
>           raidz2-0                  DEGRADED     0     0     0
>             replacing-0             REMOVED      0     0     0
>               15364271088212071398  REMOVED      0     0     0  was
> /dev/da0p3/old
>               da0p3                 ONLINE       0     0     0
> (resilvering)
>             da1p3                   ONLINE       0     0     0
>             da2p3                   ONLINE       0     0     0
>             da3p3                   ONLINE       0     0     0
>             da4p3                   ONLINE       0     0     0
>             da5p3                   ONLINE       0     0     0
>             da6p3                   ONLINE       0     0     0
>             da7p3                   ONLINE       0     0     0
>
> errors: No known data errors
>
>
> Zpool status after resilvering finished:
> root@backup /]# zpool status
>   pool: tank
>  state: ONLINE
>  scan: resilvered 113G in 14h54m with 0 errors on Tue May  8 11:13:31 2012
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         tank        ONLINE       0     0     0
>           raidz2-0  ONLINE       0     0     0
>             da0p3   ONLINE       0     0     0
>             da1p3   ONLINE       0     0     0
>             da2p3   ONLINE       0     0     0
>             da3p3   ONLINE       0     0     0
>             da4p3   ONLINE       0     0     0
>             da5p3   ONLINE       0     0     0
>             da6p3   ONLINE       0     0     0
>             da7p3   ONLINE       0     0     0
>
> errors: No known data errors
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"


-- 

--------------------------------------------
Peter Maloney
Brockmann Consult
Max-Planck-Str. 2
21502 Geesthacht
Germany
Tel: +49 4152 889 300
Fax: +49 4152 889 333
E-mail: peter.maloney@brockmann-consult.de
Internet: http://www.brockmann-consult.de
--------------------------------------------


From owner-freebsd-fs@FreeBSD.ORG  Wed May  9 14:05:09 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id ED825106564A
	for <freebsd-fs@freebsd.org>; Wed,  9 May 2012 14:05:09 +0000 (UTC)
	(envelope-from freebsd-fs@m.gmane.org)
Received: from plane.gmane.org (plane.gmane.org [80.91.229.3])
	by mx1.freebsd.org (Postfix) with ESMTP id 93BC28FC15
	for <freebsd-fs@freebsd.org>; Wed,  9 May 2012 14:05:09 +0000 (UTC)
Received: from list by plane.gmane.org with local (Exim 4.69)
	(envelope-from <freebsd-fs@m.gmane.org>) id 1SS7Vm-00049f-Tf
	for freebsd-fs@freebsd.org; Wed, 09 May 2012 16:05:02 +0200
Received: from dyn1219-111.wlan.ic.ac.uk ([129.31.219.111])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-fs@freebsd.org>; Wed, 09 May 2012 16:05:02 +0200
Received: from johannes by dyn1219-111.wlan.ic.ac.uk with local (Gmexim 0.1
	(Debian)) id 1AlnuQ-0007hv-00
	for <freebsd-fs@freebsd.org>; Wed, 09 May 2012 16:05:02 +0200
X-Injected-Via-Gmane: http://gmane.org/
To: freebsd-fs@freebsd.org
From: Johannes Totz <johannes@jo-t.de>
Date: Wed, 09 May 2012 15:04:52 +0100
Lines: 141
Message-ID: <jodti4$dhv$1@dough.gmane.org>
References: <73F8D020-04F3-44B2-97D4-F08E3B253C32@grem.de>
	<4FAA14D6.8060302@brockmann-consult.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Complaints-To: usenet@dough.gmane.org
X-Gmane-NNTP-Posting-Host: dyn1219-111.wlan.ic.ac.uk
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
	rv:12.0) Gecko/20120428 Thunderbird/12.0.1
In-Reply-To: <4FAA14D6.8060302@brockmann-consult.de>
Subject: Re: ZFS resilvering strangles IO
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 09 May 2012 14:05:10 -0000

On 09/05/2012 07:55, Peter Maloney wrote:
> About the slow performance during resilver,
> 
> Are they consumer disks? If so, one guess is you have a bad disk. Check
> by looking at load and ms per x on disks. If one is high and others are
> low, then it's probably bad. If a single 'good' disk is bad, the whole
> thing will run very slow. Bad consumer disks run very slow trying over
> and over to read the not-yet-bad sectors where enterprise disks would
> throw errors and fail.
> 
> My other guess is that this is because FreeBSD, unlike Linux and
> Solaris, lacks IO scheduling. So there is no way for the zfs code to
> truly put the resilver on lower priority than the regular production
> applications. I've read that IO scheduling was developed for 8.2, but
> never officially adopted. I would love to see it in FreeBSD... I use
> "ionice" on Linux all the time (for copying, backups, zipping,
> installing a a huge batch of packages [noticeable >300 MB], etc. while I
> work on other things), so I miss it. IO scheduling on Solaris also helps
> with dedup performance.
> 
> Does anyone know if there is a movement to add the IO scheduling code
> into the base system?

There was a geom module for io scheduling: gsched(8)
But I've never used it and don't know what the state of it is...


> On 05/08/2012 04:33 PM, Michael Gmelin wrote:
>> Hello,
>>
>> I know I'm not the first one to ask this, but I couldn't find a definitive answers in previous threads.
>>
>> I'm running a FreeBSD 9.0 RELEASE-p1 amd64 system, 8 x 1TB SATA2 drives (not SAS) and an LSI SAS 9211 controller in IT mode (HBAs, da0-da7). Zpool version 28, raidz2 container. Machine has 4GB of RAM, therefore ZFS prefetch is disabled. No manual tuning of ZFS options. Pool contains about 1TB of data right now (so about 25% full). In normal operations the pool shows excellent performance. Yesterday I had to replace a drive, so resilvering started. The resilver process took about 15 hours - which seems a little bit slow to me, but whatever - what really struck me was that during resilvering the pool performance got really bad. Read performance was acceptable, but write performance got down to 500kb/s (for almost all of the 15 hours). After resilvering finished, system performance returned
>   to normal.
>>
>> Fortunately this is a backup server and no full backups were scheduled, so no drama, but I really don't want to have to replace a drive in a database (or other high IO) server this way (I would have been forced to offline the drive somehow and migrate data to another server).
>>
>> So the question is, is there anything I can do to improve the situation? Is this because of memory constraints? Are there any other knobs to adjust? As far as I know zfs_resilver_delay can't be changed in FreeBSD yet.
>>
>> I have more drives around, so I could replace another one in the server, just to replicate the exact situation.
>>
>> Cheers,
>> Michael
>>
>> Disk layout:
>>
>> daXp1128 boot
>> daXp2 16G frebsd-swap
>> daXp3 915G freebsd-zfs
>>
>>
>> Zpool status during resilvering:
>>
>> [root@backup /tmp]# zpool status -v
>>   pool: tank
>>  state: DEGRADED
>> status: One or more devices is currently being resilvered.  The pool will
>>         continue to function, possibly in a degraded state.
>> action: Wait for the resilver to complete.
>>  scan: resilver in progress since Mon May  7 20:18:34 2012
>>     249G scanned out of 908G at 18.2M/s, 10h17m to go
>>     31.2G resilvered, 27.46% done
>> config:
>>
>>         NAME                        STATE     READ WRITE CKSUM
>>         tank                        DEGRADED     0     0     0
>>           raidz2-0                  DEGRADED     0     0     0
>>             replacing-0             REMOVED      0     0     0
>>               15364271088212071398  REMOVED      0     0     0  was
>> /dev/da0p3/old
>>               da0p3                 ONLINE       0     0     0
>> (resilvering)
>>             da1p3                   ONLINE       0     0     0
>>             da2p3                   ONLINE       0     0     0
>>             da3p3                   ONLINE       0     0     0
>>             da4p3                   ONLINE       0     0     0
>>             da5p3                   ONLINE       0     0     0
>>             da6p3                   ONLINE       0     0     0
>>             da7p3                   ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>> Zpool status later in the process:
>> root@backup /tmp]# zpool status
>>   pool: tank
>>  state: DEGRADED
>> status: One or more devices is currently being resilvered.  The pool will
>>         continue to function, possibly in a degraded state.
>> action: Wait for the resilver to complete.
>>  scan: resilver in progress since Mon May  7 20:18:34 2012
>>     833G scanned out of 908G at 19.1M/s, 1h7m to go
>>     104G resilvered, 91.70% done
>> config:
>>
>>         NAME                        STATE     READ WRITE CKSUM
>>         tank                        DEGRADED     0     0     0
>>           raidz2-0                  DEGRADED     0     0     0
>>             replacing-0             REMOVED      0     0     0
>>               15364271088212071398  REMOVED      0     0     0  was
>> /dev/da0p3/old
>>               da0p3                 ONLINE       0     0     0
>> (resilvering)
>>             da1p3                   ONLINE       0     0     0
>>             da2p3                   ONLINE       0     0     0
>>             da3p3                   ONLINE       0     0     0
>>             da4p3                   ONLINE       0     0     0
>>             da5p3                   ONLINE       0     0     0
>>             da6p3                   ONLINE       0     0     0
>>             da7p3                   ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>>
>> Zpool status after resilvering finished:
>> root@backup /]# zpool status
>>   pool: tank
>>  state: ONLINE
>>  scan: resilvered 113G in 14h54m with 0 errors on Tue May  8 11:13:31 2012
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         tank        ONLINE       0     0     0
>>           raidz2-0  ONLINE       0     0     0
>>             da0p3   ONLINE       0     0     0
>>             da1p3   ONLINE       0     0     0
>>             da2p3   ONLINE       0     0     0
>>             da3p3   ONLINE       0     0     0
>>             da4p3   ONLINE       0     0     0
>>             da5p3   ONLINE       0     0     0
>>             da6p3   ONLINE       0     0     0
>>             da7p3   ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>> _______________________________________________
>> freebsd-fs@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
> 
> 


From owner-freebsd-fs@FreeBSD.ORG  Wed May  9 22:04:14 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 34EF7106567A
	for <freebsd-fs@freebsd.org>; Wed,  9 May 2012 22:04:14 +0000 (UTC)
	(envelope-from lists@hurricane-ridge.com)
Received: from mail-vb0-f54.google.com (mail-vb0-f54.google.com
	[209.85.212.54])
	by mx1.freebsd.org (Postfix) with ESMTP id E43EA8FC18
	for <freebsd-fs@freebsd.org>; Wed,  9 May 2012 22:04:13 +0000 (UTC)
Received: by vbmv11 with SMTP id v11so1136797vbm.13
	for <freebsd-fs@freebsd.org>; Wed, 09 May 2012 15:04:13 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=google.com; s=20120113;
	h=mime-version:x-originating-ip:date:message-id:subject:from:to
	:content-type:x-gm-message-state;
	bh=UtyDRBuCnoOfD09bl97pkolHm7gwOd8JQ+XFWrXUEn0=;
	b=MbLiJGnJf6WC3LqeZCqoQ51GnBFV8fleeJLA6ghDl7CVrIZ7ENQnkifkQ54lQo6MCg
	QSwl8aorgC6Ny55d0HQhHxgl8qMgVHvQUMkHN7q4gsWf9CmSKmFdGBrbnW6wkjp7Fpjb
	YNatKNNzvU8ZMASEBLaT+rq3lY3thsd42Omv9KR+R1eFGWPwkcZ0YAk6pyuNF/bOIsf8
	aFVXt6KqCsZp65Z3TpiAoCjoFLB3Fiqqfcg2S/saAzKRW7DPcm8rJ88iUchBLyqfJCwz
	+pyJR3rvCudGYuRZjNmfaHa8Cn5v+XM6GgZXRPKjlwoKskdbDjiO48K6uqUJpFmMb+OK
	eZxA==
MIME-Version: 1.0
Received: by 10.52.100.67 with SMTP id ew3mr874556vdb.36.1336601053287; Wed,
	09 May 2012 15:04:13 -0700 (PDT)
Received: by 10.220.22.199 with HTTP; Wed, 9 May 2012 15:04:13 -0700 (PDT)
X-Originating-IP: [98.247.224.125]
Date: Wed, 9 May 2012 15:04:13 -0700
Message-ID: <CADUQDp9ytTTUqRvqzySBfugkqL56okEgZOOs_vvbKmOYi=mL0Q@mail.gmail.com>
From: Andrew Leonard <lists@hurricane-ridge.com>
To: freebsd-fs@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
X-Gm-Message-State: ALoCoQm2Oa11BZev8vdWQGPxusZARaciS3SEWCTiU0MysX7eFrG+nOhFdyizd2gHjKN0OR+jSr5m
Subject: Unable to set ACLs on ZFS file system over NFSv4?
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 09 May 2012 22:04:14 -0000

I have a ZFS file system on which I can successfully manipulate ACLs
locally, but am unable to do so when it is mounted remotely using
NFSv4 on both FreeBSD and Linux (CentOS 5) clients.

The system in question is running 8-STABLE:

FreeBSD zfs07.example.com 8.2-STABLE FreeBSD 8.2-STABLE #0: Thu Nov 17
17:46:00 PST 2011
root@zfs07.example.com:/usr/obj/usr/src/sys/GENERIC  amd64

ACLs can be successfully manipulated locally; e.g. the following
returns no error and works as expected:

> setfacl -m g:group2:rwxpDaRWcs:fd:allow /tank01/ngs/test.dir

The file system is exported as follows in /etc/exports:

/tank01/ngs -sec=sys
V4: /tank01 -sec=sys

On the FreeBSD client, it is mounted using NFSv4, and behaves as
follows under the same user (sanitized to "user1", who is in
"group1"):

> whoami
user1
> groups
group1 [...]
> mount | grep /mnt
zfs07b:/ngs on /mnt (newnfs, nfsv4acls)
>  getfacl /mnt/test2.dir
# file: /mnt/test2.dir
# owner: user1
# group: group1
   group:group1:rwxpDdaARWcCo-:fd----:allow
            owner@:rwxp--aARWcCo-:------:allow
            group@:r-x---a-R-c---:------:allow
         everyone@:r-x---a-R-c---:------:allow
> setfacl -m g:group2:rwxpDaRWcs:fd:allow /mnt/test2.dir
setfacl: /mnt/test2.dir: acl_set_file() failed: Input/output error

In all other respects, ACLs appear to be honored over NFSv4 - the user
can access, create, modify and delete files as expected, and ACLs are
appropriately inherited - the ACLs just cannot be manipulated.

Linux client behavior is functionally identical:

> mount | grep /mnt
zfs07b:/ngs on /mnt type nfs4 (rw,addr=192.168.x.y)
> nfs4_setfacl -a A:gfd:group2:rwxaDdtnNcy test2.dir
Failed setxattr operation: Input/output error

Is this a misconfiguration on my part, a known limitation, or a bug?

More details:

> zfs get version tank01/ngs
NAME        PROPERTY  VALUE    SOURCE
tank01/ngs  version   5        -
> zpool get version tank01
NAME    PROPERTY  VALUE    SOURCE
tank01  version   28       default
> zfs get all tank01/ngs
NAME        PROPERTY              VALUE                  SOURCE
tank01/ngs  type                  filesystem             -
tank01/ngs  creation              Tue May  1 16:15 2012  -
tank01/ngs  used                  61.6G                  -
tank01/ngs  available             4.47T                  -
tank01/ngs  referenced            33.8G                  -
tank01/ngs  compressratio         4.23x                  -
tank01/ngs  mounted               yes                    -
tank01/ngs  quota                 none                   default
tank01/ngs  reservation           none                   default
tank01/ngs  recordsize            128K                   default
tank01/ngs  mountpoint            /tank01/ngs            default
tank01/ngs  sharenfs              off                    default
tank01/ngs  checksum              on                     default
tank01/ngs  compression           gzip                   local
tank01/ngs  atime                 on                     default
tank01/ngs  devices               on                     default
tank01/ngs  exec                  on                     default
tank01/ngs  setuid                off                    inherited from tank01
tank01/ngs  readonly              off                    default
tank01/ngs  jailed                off                    default
tank01/ngs  snapdir               hidden                 default
tank01/ngs  aclmode               passthrough            local
tank01/ngs  aclinherit            passthrough-x          local
tank01/ngs  canmount              on                     default
tank01/ngs  xattr                 off                    temporary
tank01/ngs  copies                1                      default
tank01/ngs  version               5                      -
tank01/ngs  utf8only              off                    -
tank01/ngs  normalization         none                   -
tank01/ngs  casesensitivity       sensitive              -
tank01/ngs  vscan                 off                    default
tank01/ngs  nbmand                off                    default
tank01/ngs  sharesmb              off                    default
tank01/ngs  refquota              none                   default
tank01/ngs  refreservation        none                   default
tank01/ngs  primarycache          all                    default
tank01/ngs  secondarycache        all                    default
tank01/ngs  usedbysnapshots       27.8G                  -
tank01/ngs  usedbydataset         33.8G                  -
tank01/ngs  usedbychildren        0                      -
tank01/ngs  usedbyrefreservation  0                      -
tank01/ngs  logbias               latency                default
tank01/ngs  dedup                 off                    default
tank01/ngs  mlslabel                                     -
tank01/ngs  sync                  standard               default
tank01/ngs  refcompressratio      4.14x                  -
> egrep 'nfs|zfs' /etc/rc.conf.local
nfscbd_enable="YES"
nfs_client_enable="YES"
nfsuserd_enable="YES"
nfsv4_server_enable="YES"
nfs_server_enable="YES"
zfs_enable="YES"

Thanks,
Andy

From owner-freebsd-fs@FreeBSD.ORG  Thu May 10 04:13:17 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 068F71065670;
	Thu, 10 May 2012 04:13:17 +0000 (UTC) (envelope-from mm@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id CF5908FC0A;
	Thu, 10 May 2012 04:13:16 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q4A4DGnW018090;
	Thu, 10 May 2012 04:13:16 GMT (envelope-from mm@freefall.freebsd.org)
Received: (from mm@localhost)
	by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q4A4DGwa018086;
	Thu, 10 May 2012 04:13:16 GMT (envelope-from mm)
Date: Thu, 10 May 2012 04:13:16 GMT
Message-Id: <201205100413.q4A4DGwa018086@freefall.freebsd.org>
To: mm@FreeBSD.org, freebsd-fs@FreeBSD.org, mm@FreeBSD.org
From: mm@FreeBSD.org
Cc: 
Subject: Re: kern/167467: [zfs][patch] improve zdb(8) manpage and help.
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 10 May 2012 04:13:17 -0000

Synopsis: [zfs][patch] improve zdb(8) manpage and help.

Responsible-Changed-From-To: freebsd-fs->mm
Responsible-Changed-By: mm
Responsible-Changed-When: Thu May 10 04:13:16 UTC 2012
Responsible-Changed-Why: 
I'll take it.

http://www.freebsd.org/cgi/query-pr.cgi?pr=167467

From owner-freebsd-fs@FreeBSD.ORG  Thu May 10 04:13:36 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id B704A106566B;
	Thu, 10 May 2012 04:13:36 +0000 (UTC) (envelope-from mm@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 8A6658FC15;
	Thu, 10 May 2012 04:13:36 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q4A4DahJ018300;
	Thu, 10 May 2012 04:13:36 GMT (envelope-from mm@freefall.freebsd.org)
Received: (from mm@localhost)
	by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q4A4DaeO018295;
	Thu, 10 May 2012 04:13:36 GMT (envelope-from mm)
Date: Thu, 10 May 2012 04:13:36 GMT
Message-Id: <201205100413.q4A4DaeO018295@freefall.freebsd.org>
To: mm@FreeBSD.org, freebsd-fs@FreeBSD.org, mm@FreeBSD.org
From: mm@FreeBSD.org
Cc: 
Subject: Re: kern/167370: [zfs][patch] Unnecessary break point on zfs_main.c.
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 10 May 2012 04:13:36 -0000

Synopsis: [zfs][patch] Unnecessary break point on zfs_main.c.

Responsible-Changed-From-To: freebsd-fs->mm
Responsible-Changed-By: mm
Responsible-Changed-When: Thu May 10 04:13:36 UTC 2012
Responsible-Changed-Why: 
I'll take it.

http://www.freebsd.org/cgi/query-pr.cgi?pr=167370

From owner-freebsd-fs@FreeBSD.ORG  Thu May 10 04:13:45 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id C64CC1065672;
	Thu, 10 May 2012 04:13:45 +0000 (UTC) (envelope-from mm@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 9A7D28FC0C;
	Thu, 10 May 2012 04:13:45 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q4A4Djkj018574;
	Thu, 10 May 2012 04:13:45 GMT (envelope-from mm@freefall.freebsd.org)
Received: (from mm@localhost)
	by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q4A4Djll018570;
	Thu, 10 May 2012 04:13:45 GMT (envelope-from mm)
Date: Thu, 10 May 2012 04:13:45 GMT
Message-Id: <201205100413.q4A4Djll018570@freefall.freebsd.org>
To: mm@FreeBSD.org, freebsd-fs@FreeBSD.org, mm@FreeBSD.org
From: mm@FreeBSD.org
Cc: 
Subject: Re: kern/167447: [zfs] [patch] patch to zfs rename -f to perform
	force unmount.
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 10 May 2012 04:13:45 -0000

Synopsis: [zfs] [patch] patch to zfs rename -f to perform force unmount.

Responsible-Changed-From-To: freebsd-fs->mm
Responsible-Changed-By: mm
Responsible-Changed-When: Thu May 10 04:13:45 UTC 2012
Responsible-Changed-Why: 
I'll take it.

http://www.freebsd.org/cgi/query-pr.cgi?pr=167447

From owner-freebsd-fs@FreeBSD.ORG  Thu May 10 13:07:34 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 40A67106564A
	for <freebsd-fs@freebsd.org>; Thu, 10 May 2012 13:07:34 +0000 (UTC)
	(envelope-from simon@comsys.ntu-kpi.kiev.ua)
Received: from comsys.kpi.ua (comsys.kpi.ua [77.47.192.42])
	by mx1.freebsd.org (Postfix) with ESMTP id AC16D8FC15
	for <freebsd-fs@freebsd.org>; Thu, 10 May 2012 13:07:33 +0000 (UTC)
Received: from pm513-1.comsys.kpi.ua ([10.18.52.101]
	helo=pm513-1.comsys.ntu-kpi.kiev.ua)
	by comsys.kpi.ua with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.63)
	(envelope-from <simon@comsys.ntu-kpi.kiev.ua>)
	id 1SST5f-0002Va-6j; Thu, 10 May 2012 16:07:31 +0300
Received: by pm513-1.comsys.ntu-kpi.kiev.ua (Postfix, from userid 1001)
	id C01741CC21; Thu, 10 May 2012 16:07:31 +0300 (EEST)
Date: Thu, 10 May 2012 16:07:31 +0300
From: Andrey Simonenko <simon@comsys.ntu-kpi.kiev.ua>
To: Rick Macklem <rmacklem@uoguelph.ca>
Message-ID: <20120510130731.GA72837@pm513-1.comsys.ntu-kpi.kiev.ua>
References: <20120507174813.GA5927@pm513-1.comsys.ntu-kpi.kiev.ua>
	<1357768784.50127.1336434018113.JavaMail.root@erie.cs.uoguelph.ca>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1357768784.50127.1336434018113.JavaMail.root@erie.cs.uoguelph.ca>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Authenticated-User: simon@comsys.ntu-kpi.kiev.ua
X-Authenticator: plain
X-Sender-Verify: SUCCEEDED (sender exists & accepts mail)
X-Exim-Version: 4.63 (build at 28-Apr-2011 07:11:12)
X-Date: 2012-05-10 16:07:31
X-Connected-IP: 10.18.52.101:44001
X-Message-Linecount: 55
X-Body-Linecount: 39
X-Message-Size: 2410
X-Body-Size: 1685
Cc: freebsd-fs@freebsd.org
Subject: Re: NFSv4 Questions
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 10 May 2012 13:07:34 -0000

On Mon, May 07, 2012 at 07:40:18PM -0400, Rick Macklem wrote:
> Andrey Simonenko wrote:
> > On Sun, Apr 29, 2012 at 04:36:03PM -0400, Rick Macklem wrote:
> > >
> > > Also, be sure to check "man nfsv4" and maybe reference it (it is
> > > currently
> > > in the See Also list, but that might not be strong enough).
> > 
> > There is another question not explained in documentation (I could not
> > find the answer at least). Currently NFSv3 client uses reserved port
> > for NFS mounts and uses non reserved port if "noresvport" is
> > specified.
> > NFSv4 client always uses non reserved port, ignoring the "resvport"
> > option in the mount_nfs command.
> > 
> > Such behaviour of NFS client was introduced in 1.18 version of
> > fs/nfsclient/nfs_clvfsops.c [1], where the "resvport" flag is cleared
> > for NFSv4 mounts.
> > 
> > Why does "reserved port logic" differ in NFSv3 and NFSv4 clients?
> > 
> It is my understanding that NFSv4 servers are not supposed to require
> a "reserved" port#. However, at a quick glance, I can't find that stated
> in RFC 3530. (It may be implied by the fact that NFSv4 uses a "user" based
> security model and not a "host" based one.)
> 
> As such, the client should never need to "waste" a reserved port# on a NFSv4
> connection.

Since AUTH_SYS can be used in NFSv4 as well and according to RFC 3530
AUTH_SYS in NFSv4 has the same logic as in NFSv2/3, then

1. Does "user" based security model mean RPCSEC_GSS?

2. Does "host" based security model mean AUTH_SYS?

I did not find any mention about port numbers in RFC 1813 and 3530,
looks like that ports numbers range used by NFS clients and checked by
NFS server is the implementation decision.

From owner-freebsd-fs@FreeBSD.ORG  Thu May 10 15:40:13 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id AD4ED106564A
	for <freebsd-fs@hub.freebsd.org>; Thu, 10 May 2012 15:40:13 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 7EB1A8FC08
	for <freebsd-fs@hub.freebsd.org>; Thu, 10 May 2012 15:40:13 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q4AFeD0L067835
	for <freebsd-fs@freefall.freebsd.org>; Thu, 10 May 2012 15:40:13 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q4AFeDdX067834;
	Thu, 10 May 2012 15:40:13 GMT (envelope-from gnats)
Date: Thu, 10 May 2012 15:40:13 GMT
Message-Id: <201205101540.q4AFeDdX067834@freefall.freebsd.org>
To: freebsd-fs@FreeBSD.org
From: "Jukka A. Ukkonen" <jau@iki.fi>
Cc: 
Subject: kern/167612: [portalfs] The portal file system gets stuck inside
 portal_open(). ("1 extra fds")
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: "Jukka A. Ukkonen" <jau@iki.fi>
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 10 May 2012 15:40:13 -0000

The following reply was made to PR kern/167612; it has been noted by GNATS.

From: "Jukka A. Ukkonen" <jau@iki.fi>
To: bug-followup@FreeBSD.org, jau@iki.fi
Cc:  
Subject: kern/167612: [portalfs] The portal file system gets stuck inside
 portal_open(). ("1 extra fds")
Date: Thu, 10 May 2012 18:33:49 +0300

 This is a multi-part message in MIME format.
 --------------060204070501010607040700
 Content-Type: text/plain; charset=ISO-8859-1; format=flowed
 Content-Transfer-Encoding: 7bit
 
 
 This really was an alignment issue.
 
 The old code was not in sync with the alignment done in
 
 the CMSG_* macros.
 
 
 Find a patch attached.
 
 --jau
 
 
 --------------060204070501010607040700
 Content-Type: text/plain; charset=UTF-8;
  name="portal_vnops.c.diff"
 Content-Transfer-Encoding: 7bit
 Content-Disposition: attachment;
  filename="portal_vnops.c.diff"
 
 --- portal_vnops.c.orig	2012-05-08 18:43:17.000000000 +0300
 +++ portal_vnops.c	2012-05-10 17:07:55.000000000 +0300
 @@ -397,19 +397,47 @@
  	 * than a single mbuf in it.  What to do?
  	 */
  	cmsg = mtod(cm, struct cmsghdr *);
 -	newfds = (cmsg->cmsg_len - sizeof(*cmsg)) / sizeof (int);
 +
 +	/*
 +	 * Just in case the sender no longer does what we expect
 +	 * and sends something else before or in the worst case
 +	 * instead of the file descriptor we expect...
 +	 */
 +
 +	if ((cmsg->cmsg_level != SOL_SOCKET)
 +	    || (cmsg->cmsg_type != SCM_RIGHTS)) {
 +		error = ECONNREFUSED;
 +		goto bad;
 +	}
 +
 +	/*
 +	 * Use the flippin' CMSG_DATA() macro to make sure we use
 +	 * the same alignment as the sender.
 +	 * Otherwise things go pear shape very easily.
 +	 * The bad news is that even faulty code may work on some
 +	 * CPU architectures.
 +	 */
 +
 +	ip = (int *) CMSG_DATA (cmsg);
 +
 +	newfds = (cmsg->cmsg_len - 
 +		  ((unsigned char *) ip -
 +		   (unsigned char *) cmsg)) / sizeof (int);
 +
  	if (newfds == 0) {
  		error = ECONNREFUSED;
  		goto bad;
  	}
 +
  	/*
  	 * At this point the rights message consists of a control message
  	 * header, followed by a data region containing a vector of
  	 * integer file descriptors.  The fds were allocated by the action
  	 * of receiving the control message.
  	 */
 -	ip = (int *) (cmsg + 1);
 +
  	fd = *ip++;
 +
  	if (newfds > 1) {
  		/*
  		 * Close extra fds.
 
 --------------060204070501010607040700--

From owner-freebsd-fs@FreeBSD.ORG  Thu May 10 20:34:17 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 6F0AC106566C
	for <freebsd-fs@freebsd.org>; Thu, 10 May 2012 20:34:17 +0000 (UTC)
	(envelope-from rmacklem@uoguelph.ca)
Received: from esa-annu.mail.uoguelph.ca (esa-annu.mail.uoguelph.ca
	[131.104.91.36])
	by mx1.freebsd.org (Postfix) with ESMTP id 28E108FC0C
	for <freebsd-fs@freebsd.org>; Thu, 10 May 2012 20:34:17 +0000 (UTC)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: Ap4EAGMlrE+DaFvO/2dsb2JhbABEhXavMIIVAQEEASNWBRYOCgICDRkCWQYTiAkFqFiTAYEviWOFBYEYBJV9kECDBQ
X-IronPort-AV: E=Sophos;i="4.75,566,1330923600"; d="scan'208";a="168766046"
Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca)
	([131.104.91.206])
	by esa-annu-pri.mail.uoguelph.ca with ESMTP; 10 May 2012 16:34:16 -0400
Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1])
	by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id DE70F7941E;
	Thu, 10 May 2012 16:34:15 -0400 (EDT)
Date: Thu, 10 May 2012 16:34:15 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Andrey Simonenko <simon@comsys.ntu-kpi.kiev.ua>
Message-ID: <901330725.234130.1336682055896.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <20120510130731.GA72837@pm513-1.comsys.ntu-kpi.kiev.ua>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Originating-IP: [172.17.91.203]
X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692)
Cc: freebsd-fs@freebsd.org
Subject: Re: NFSv4 Questions
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 10 May 2012 20:34:17 -0000

Andrey Simonenko wrote:
> On Mon, May 07, 2012 at 07:40:18PM -0400, Rick Macklem wrote:
> > Andrey Simonenko wrote:
> > > On Sun, Apr 29, 2012 at 04:36:03PM -0400, Rick Macklem wrote:
> > > >
> > > > Also, be sure to check "man nfsv4" and maybe reference it (it is
> > > > currently
> > > > in the See Also list, but that might not be strong enough).
> > >
> > > There is another question not explained in documentation (I could
> > > not
> > > find the answer at least). Currently NFSv3 client uses reserved
> > > port
> > > for NFS mounts and uses non reserved port if "noresvport" is
> > > specified.
> > > NFSv4 client always uses non reserved port, ignoring the
> > > "resvport"
> > > option in the mount_nfs command.
> > >
> > > Such behaviour of NFS client was introduced in 1.18 version of
> > > fs/nfsclient/nfs_clvfsops.c [1], where the "resvport" flag is
> > > cleared
> > > for NFSv4 mounts.
> > >
> > > Why does "reserved port logic" differ in NFSv3 and NFSv4 clients?
> > >
> > It is my understanding that NFSv4 servers are not supposed to
> > require
> > a "reserved" port#. However, at a quick glance, I can't find that
> > stated
> > in RFC 3530. (It may be implied by the fact that NFSv4 uses a "user"
> > based
> > security model and not a "host" based one.)
> >
> > As such, the client should never need to "waste" a reserved port# on
> > a NFSv4
> > connection.
> 
> Since AUTH_SYS can be used in NFSv4 as well and according to RFC 3530
> AUTH_SYS in NFSv4 has the same logic as in NFSv2/3, then
> 
> 1. Does "user" based security model mean RPCSEC_GSS?
> 
> 2. Does "host" based security model mean AUTH_SYS?
> 
My guess is that AUTH_SYS is not considered a security model at all,
but the "authenticators" refer to users.

I believe the "host" based security model referred to in the RFCs refers
to the restrictions implemented by /etc/exports, based on client host IP
addresses.

I do remember that the IETF working group discussed "reserved port #s" and
agreed that requiring one did not enhance security and that NFSv4 servers
should not require that a client's port# be within a certain range.
(If you were to search the archive for nfsv4@ietf.org, it should be somewhere
 in there.)

However, I agree that this does not seem to be stated in the RFCs, because
I couldn't find it when the question came up. (It may be that IETF does not
have a definition of a "reserved port#".)

Personally, I agree with the working group and have always thought requiring
a client to use a "reserved port#" was meaningless. However, I already noted
that I don't mind enabling it, with a comment that it should not be required
for NFSv4.

> I did not find any mention about port numbers in RFC 1813 and 3530,
> looks like that ports numbers range used by NFS clients and checked by
> NFS server is the implementation decision.

During interoperability testing (I'll be at another NFSv4 Bakeathon in June)
I have never had a server that would not allow a connection to happen from
a non-reserved port# for NFSv4, so I believe that the implementation practice is to
not require it for NFSv4. (Consistent with the discussion on nfsv4@ietf.org.)

rick


From owner-freebsd-fs@FreeBSD.ORG  Thu May 10 21:13:40 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 09A7F1065670;
	Thu, 10 May 2012 21:13:40 +0000 (UTC)
	(envelope-from rmacklem@uoguelph.ca)
Received: from esa-annu.mail.uoguelph.ca (esa-annu.mail.uoguelph.ca
	[131.104.91.36])
	by mx1.freebsd.org (Postfix) with ESMTP id 8C9998FC16;
	Thu, 10 May 2012 21:13:39 +0000 (UTC)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: Ap8EANAurE+DaFvO/2dsb2JhbABEhXavMIIVAQEBAwEBAQEgKyALBRYOCgICDRkCKQEJJgYIBwQBHASHaAULqEWSfoEviWMZBIRogRgEk0+CLoERjy+DBYE6AQgR
X-IronPort-AV: E=Sophos;i="4.75,567,1330923600"; d="scan'208";a="168771777"
Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca)
	([131.104.91.206])
	by esa-annu-pri.mail.uoguelph.ca with ESMTP; 10 May 2012 17:13:38 -0400
Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1])
	by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 95B95B3F89;
	Thu, 10 May 2012 17:13:38 -0400 (EDT)
Date: Thu, 10 May 2012 17:13:38 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Andrew Leonard <lists@hurricane-ridge.com>
Message-ID: <1446179418.236280.1336684418582.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <CADUQDp9ytTTUqRvqzySBfugkqL56okEgZOOs_vvbKmOYi=mL0Q@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Originating-IP: [172.17.91.201]
X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692)
Cc: freebsd-fs@freebsd.org
Subject: Re: Unable to set ACLs on ZFS file system over NFSv4?
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 10 May 2012 21:13:40 -0000

Andrew Leonard wrote:
> I have a ZFS file system on which I can successfully manipulate ACLs
> locally, but am unable to do so when it is mounted remotely using
> NFSv4 on both FreeBSD and Linux (CentOS 5) clients.
> 
> The system in question is running 8-STABLE:
> 
> FreeBSD zfs07.example.com 8.2-STABLE FreeBSD 8.2-STABLE #0: Thu Nov 17
> 17:46:00 PST 2011
> root@zfs07.example.com:/usr/obj/usr/src/sys/GENERIC amd64
> 
> ACLs can be successfully manipulated locally; e.g. the following
> returns no error and works as expected:
> 
> > setfacl -m g:group2:rwxpDaRWcs:fd:allow /tank01/ngs/test.dir
> 
> The file system is exported as follows in /etc/exports:
> 
> /tank01/ngs -sec=sys
> V4: /tank01 -sec=sys
> 
> On the FreeBSD client, it is mounted using NFSv4, and behaves as
> follows under the same user (sanitized to "user1", who is in
> "group1"):
> 
> > whoami
> user1
> > groups
> group1 [...]
> > mount | grep /mnt
> zfs07b:/ngs on /mnt (newnfs, nfsv4acls)
> >  getfacl /mnt/test2.dir
> # file: /mnt/test2.dir
> # owner: user1
> # group: group1
> group:group1:rwxpDdaARWcCo-:fd----:allow
> owner@:rwxp--aARWcCo-:------:allow
> group@:r-x---a-R-c---:------:allow
> everyone@:r-x---a-R-c---:------:allow
> > setfacl -m g:group2:rwxpDaRWcs:fd:allow /mnt/test2.dir
> setfacl: /mnt/test2.dir: acl_set_file() failed: Input/output error
> 
> In all other respects, ACLs appear to be honored over NFSv4 - the user
> can access, create, modify and delete files as expected, and ACLs are
> appropriately inherited - the ACLs just cannot be manipulated.
> 
> Linux client behavior is functionally identical:
> 
> > mount | grep /mnt
> zfs07b:/ngs on /mnt type nfs4 (rw,addr=192.168.x.y)
> > nfs4_setfacl -a A:gfd:group2:rwxaDdtnNcy test2.dir
> Failed setxattr operation: Input/output error
> 
> Is this a misconfiguration on my part, a known limitation, or a bug?
> 
As far as I know, it should work. I only use UFS, but my understanding
is that ZFS always supports NFSv4 ACLs.

If you capture a packet trace from before you do the NFSv4 mount, I can
take a look and see what the server is saying. (Basically, at mount time
a reply to a Getattr should including the supported attributes and that
should include the ACL bit. Then the setfacl becomes a Setattr of the ACL
attribute.)
# tcpdump -s 0 -w acl.pcap host <server>
- run on the client should do it

If you want to look at it, use wireshark. If you want me to look, just
email acl.pcap as an attachment.

rick
ps: Although I suspect it is the server that isn't behaving, please use
    the FreeBSD client for the above.
pss: I've cc'd trasz@ in case he can spot some reason why it wouldn't work.

> More details:
> 
> > zfs get version tank01/ngs
> NAME PROPERTY VALUE SOURCE
> tank01/ngs version 5 -
> > zpool get version tank01
> NAME PROPERTY VALUE SOURCE
> tank01 version 28 default
> > zfs get all tank01/ngs
> NAME PROPERTY VALUE SOURCE
> tank01/ngs type filesystem -
> tank01/ngs creation Tue May 1 16:15 2012 -
> tank01/ngs used 61.6G -
> tank01/ngs available 4.47T -
> tank01/ngs referenced 33.8G -
> tank01/ngs compressratio 4.23x -
> tank01/ngs mounted yes -
> tank01/ngs quota none default
> tank01/ngs reservation none default
> tank01/ngs recordsize 128K default
> tank01/ngs mountpoint /tank01/ngs default
> tank01/ngs sharenfs off default
> tank01/ngs checksum on default
> tank01/ngs compression gzip local
> tank01/ngs atime on default
> tank01/ngs devices on default
> tank01/ngs exec on default
> tank01/ngs setuid off inherited from tank01
> tank01/ngs readonly off default
> tank01/ngs jailed off default
> tank01/ngs snapdir hidden default
> tank01/ngs aclmode passthrough local
> tank01/ngs aclinherit passthrough-x local
> tank01/ngs canmount on default
> tank01/ngs xattr off temporary
> tank01/ngs copies 1 default
> tank01/ngs version 5 -
> tank01/ngs utf8only off -
> tank01/ngs normalization none -
> tank01/ngs casesensitivity sensitive -
> tank01/ngs vscan off default
> tank01/ngs nbmand off default
> tank01/ngs sharesmb off default
> tank01/ngs refquota none default
> tank01/ngs refreservation none default
> tank01/ngs primarycache all default
> tank01/ngs secondarycache all default
> tank01/ngs usedbysnapshots 27.8G -
> tank01/ngs usedbydataset 33.8G -
> tank01/ngs usedbychildren 0 -
> tank01/ngs usedbyrefreservation 0 -
> tank01/ngs logbias latency default
> tank01/ngs dedup off default
> tank01/ngs mlslabel -
> tank01/ngs sync standard default
> tank01/ngs refcompressratio 4.14x -
> > egrep 'nfs|zfs' /etc/rc.conf.local
> nfscbd_enable="YES"
> nfs_client_enable="YES"
> nfsuserd_enable="YES"
> nfsv4_server_enable="YES"
> nfs_server_enable="YES"
> zfs_enable="YES"
> 
> Thanks,
> Andy
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

From owner-freebsd-fs@FreeBSD.ORG  Thu May 10 21:23:13 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id CFD881065678
	for <freebsd-fs@freebsd.org>; Thu, 10 May 2012 21:23:13 +0000 (UTC)
	(envelope-from rmacklem@uoguelph.ca)
Received: from esa-annu.mail.uoguelph.ca (esa-annu.mail.uoguelph.ca
	[131.104.91.36])
	by mx1.freebsd.org (Postfix) with ESMTP id 74A2D8FC12
	for <freebsd-fs@freebsd.org>; Thu, 10 May 2012 21:23:13 +0000 (UTC)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: Ap8EACIxrE+DaFvO/2dsb2JhbABEhXavMIIVAQEBAwEBAQEgKyALGw4KAgINGQIpAQkmBggHBAEcBIdoBQuoSpJ9gS+JYxQFBIRogRgEk0+CLoERjy+DBYE6AQgR
X-IronPort-AV: E=Sophos;i="4.75,567,1330923600"; d="scan'208";a="168772960"
Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca)
	([131.104.91.206])
	by esa-annu-pri.mail.uoguelph.ca with ESMTP; 10 May 2012 17:23:12 -0400
Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1])
	by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 92330B40D6;
	Thu, 10 May 2012 17:23:12 -0400 (EDT)
Date: Thu, 10 May 2012 17:23:12 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Andrew Leonard <lists@hurricane-ridge.com>
Message-ID: <353146957.236642.1336684992583.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <1446179418.236280.1336684418582.JavaMail.root@erie.cs.uoguelph.ca>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Originating-IP: [172.17.91.202]
X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692)
Cc: freebsd-fs@freebsd.org
Subject: Re: Unable to set ACLs on ZFS file system over NFSv4?
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 10 May 2012 21:23:13 -0000

I wrote:
> Andrew Leonard wrote:
> > I have a ZFS file system on which I can successfully manipulate ACLs
> > locally, but am unable to do so when it is mounted remotely using
> > NFSv4 on both FreeBSD and Linux (CentOS 5) clients.
> >
> > The system in question is running 8-STABLE:
> >
> > FreeBSD zfs07.example.com 8.2-STABLE FreeBSD 8.2-STABLE #0: Thu Nov
> > 17
> > 17:46:00 PST 2011
> > root@zfs07.example.com:/usr/obj/usr/src/sys/GENERIC amd64
> >
> > ACLs can be successfully manipulated locally; e.g. the following
> > returns no error and works as expected:
> >
> > > setfacl -m g:group2:rwxpDaRWcs:fd:allow /tank01/ngs/test.dir
> >
> > The file system is exported as follows in /etc/exports:
> >
> > /tank01/ngs -sec=sys
> > V4: /tank01 -sec=sys
> >
> > On the FreeBSD client, it is mounted using NFSv4, and behaves as
> > follows under the same user (sanitized to "user1", who is in
> > "group1"):
> >
> > > whoami
> > user1
> > > groups
> > group1 [...]
> > > mount | grep /mnt
> > zfs07b:/ngs on /mnt (newnfs, nfsv4acls)
> > >  getfacl /mnt/test2.dir
> > # file: /mnt/test2.dir
> > # owner: user1
> > # group: group1
> > group:group1:rwxpDdaARWcCo-:fd----:allow
> > owner@:rwxp--aARWcCo-:------:allow
> > group@:r-x---a-R-c---:------:allow
> > everyone@:r-x---a-R-c---:------:allow
> > > setfacl -m g:group2:rwxpDaRWcs:fd:allow /mnt/test2.dir
> > setfacl: /mnt/test2.dir: acl_set_file() failed: Input/output error
> >
> > In all other respects, ACLs appear to be honored over NFSv4 - the
> > user
> > can access, create, modify and delete files as expected, and ACLs
> > are
> > appropriately inherited - the ACLs just cannot be manipulated.
> >
> > Linux client behavior is functionally identical:
> >
> > > mount | grep /mnt
> > zfs07b:/ngs on /mnt type nfs4 (rw,addr=192.168.x.y)
> > > nfs4_setfacl -a A:gfd:group2:rwxaDdtnNcy test2.dir
> > Failed setxattr operation: Input/output error
> >
> > Is this a misconfiguration on my part, a known limitation, or a bug?
> >
> As far as I know, it should work. I only use UFS, but my understanding
> is that ZFS always supports NFSv4 ACLs.
> 
> If you capture a packet trace from before you do the NFSv4 mount, I
> can
> take a look and see what the server is saying. (Basically, at mount
> time
> a reply to a Getattr should including the supported attributes and
> that
> should include the ACL bit. Then the setfacl becomes a Setattr of the
> ACL
> attribute.)
> # tcpdump -s 0 -w acl.pcap host <server>
> - run on the client should do it
> 
> If you want to look at it, use wireshark. If you want me to look, just
> email acl.pcap as an attachment.
> 
> rick
> ps: Although I suspect it is the server that isn't behaving, please
> use
> the FreeBSD client for the above.
> pss: I've cc'd trasz@ in case he can spot some reason why it wouldn't
> work.
> 
Oh, and make sure "user1" isn't in more than 16 groups, because that is the
limit for AUTH_SYS. (I'm not sure what the effect of user1 being in more
than 16 groups would be, but might as well eliminate it as a cause.)

> > More details:
> >
> > > zfs get version tank01/ngs
> > NAME PROPERTY VALUE SOURCE
> > tank01/ngs version 5 -
> > > zpool get version tank01
> > NAME PROPERTY VALUE SOURCE
> > tank01 version 28 default
> > > zfs get all tank01/ngs
> > NAME PROPERTY VALUE SOURCE
> > tank01/ngs type filesystem -
> > tank01/ngs creation Tue May 1 16:15 2012 -
> > tank01/ngs used 61.6G -
> > tank01/ngs available 4.47T -
> > tank01/ngs referenced 33.8G -
> > tank01/ngs compressratio 4.23x -
> > tank01/ngs mounted yes -
> > tank01/ngs quota none default
> > tank01/ngs reservation none default
> > tank01/ngs recordsize 128K default
> > tank01/ngs mountpoint /tank01/ngs default
> > tank01/ngs sharenfs off default
> > tank01/ngs checksum on default
> > tank01/ngs compression gzip local
> > tank01/ngs atime on default
> > tank01/ngs devices on default
> > tank01/ngs exec on default
> > tank01/ngs setuid off inherited from tank01
> > tank01/ngs readonly off default
> > tank01/ngs jailed off default
> > tank01/ngs snapdir hidden default
> > tank01/ngs aclmode passthrough local
> > tank01/ngs aclinherit passthrough-x local
> > tank01/ngs canmount on default
> > tank01/ngs xattr off temporary
> > tank01/ngs copies 1 default
> > tank01/ngs version 5 -
> > tank01/ngs utf8only off -
> > tank01/ngs normalization none -
> > tank01/ngs casesensitivity sensitive -
> > tank01/ngs vscan off default
> > tank01/ngs nbmand off default
> > tank01/ngs sharesmb off default
> > tank01/ngs refquota none default
> > tank01/ngs refreservation none default
> > tank01/ngs primarycache all default
> > tank01/ngs secondarycache all default
> > tank01/ngs usedbysnapshots 27.8G -
> > tank01/ngs usedbydataset 33.8G -
> > tank01/ngs usedbychildren 0 -
> > tank01/ngs usedbyrefreservation 0 -
> > tank01/ngs logbias latency default
> > tank01/ngs dedup off default
> > tank01/ngs mlslabel -
> > tank01/ngs sync standard default
> > tank01/ngs refcompressratio 4.14x -
> > > egrep 'nfs|zfs' /etc/rc.conf.local
> > nfscbd_enable="YES"
> > nfs_client_enable="YES"
> > nfsuserd_enable="YES"
> > nfsv4_server_enable="YES"
> > nfs_server_enable="YES"
> > zfs_enable="YES"
> >
> > Thanks,
> > Andy
> > _______________________________________________
> > freebsd-fs@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> > To unsubscribe, send any mail to
> > "freebsd-fs-unsubscribe@freebsd.org"
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

From owner-freebsd-fs@FreeBSD.ORG  Fri May 11 08:25:56 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id F07D11065673
	for <freebsd-fs@freebsd.org>; Fri, 11 May 2012 08:25:56 +0000 (UTC)
	(envelope-from karl.oulmi@ibl.fr)
Received: from marisse.ibl.fr (marisse.ibl.fr [193.49.178.19])
	by mx1.freebsd.org (Postfix) with ESMTP id 939178FC1C
	for <freebsd-fs@freebsd.org>; Fri, 11 May 2012 08:25:56 +0000 (UTC)
X-Virus-Scanned: amavisd-new at ibl.fr
Message-ID: <4FACCAEB.8040401@ibl.fr>
Date: Fri, 11 May 2012 10:16:43 +0200
From: Karl Oulmi <karl.oulmi@ibl.fr>
MIME-Version: 1.0
To: freebsd-fs@freebsd.org
Content-Type: multipart/signed; protocol="application/pkcs7-signature";
	micalg=sha1; boundary="------------ms040105020606070700040406"
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Subject: Best practice for shared volume with iscsi Dell MD3200i ?
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 11 May 2012 08:25:57 -0000

This is a cryptographically signed message in MIME format.

--------------ms040105020606070700040406
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: quoted-printable

Hi all,

I am trying to run two freebsd9 boxes with a 3.7 TO shared iscsi volume=20
on a MD3200i.

The goal is to run a "master" and a "slave" dovecot IMAP server with a=20
shared /home.

I created the shared partition like this :
gpart create -s gpt /dev/da0
gpart add -t freebsd-ufs /dev/da0
newfs /dev/da0p1

Everything is working great on the "master" server, but when I'm trying=20
to mount the volume from the "slave" one, I have the following error :
mount: /dev/da0p1 : Operation not permitted

The only way I have to successfully mount the share on the "slave"=20
server is to run a fsck -t ufs /dev/da0p1 and then do the mount.

Could anyone tell me what's wrong ?

Regards,
Karl


--------------ms040105020606070700040406--

From owner-freebsd-fs@FreeBSD.ORG  Fri May 11 12:20:23 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id ED8B5106564A
	for <freebsd-fs@freebsd.org>; Fri, 11 May 2012 12:20:22 +0000 (UTC)
	(envelope-from simon@comsys.ntu-kpi.kiev.ua)
Received: from comsys.kpi.ua (comsys.kpi.ua [77.47.192.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 64E7C8FC0C
	for <freebsd-fs@freebsd.org>; Fri, 11 May 2012 12:20:22 +0000 (UTC)
Received: from pm513-1.comsys.kpi.ua ([10.18.52.101]
	helo=pm513-1.comsys.ntu-kpi.kiev.ua)
	by comsys.kpi.ua with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.63)
	(envelope-from <simon@comsys.ntu-kpi.kiev.ua>)
	id 1SSopY-0006BP-2p; Fri, 11 May 2012 15:20:20 +0300
Received: by pm513-1.comsys.ntu-kpi.kiev.ua (Postfix, from userid 1001)
	id A3AF91CC34; Fri, 11 May 2012 15:20:20 +0300 (EEST)
Date: Fri, 11 May 2012 15:20:20 +0300
From: Andrey Simonenko <simon@comsys.ntu-kpi.kiev.ua>
To: Rick Macklem <rmacklem@uoguelph.ca>
Message-ID: <20120511122020.GA13906@pm513-1.comsys.ntu-kpi.kiev.ua>
References: <20120510130731.GA72837@pm513-1.comsys.ntu-kpi.kiev.ua>
	<901330725.234130.1336682055896.JavaMail.root@erie.cs.uoguelph.ca>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <901330725.234130.1336682055896.JavaMail.root@erie.cs.uoguelph.ca>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Authenticated-User: simon@comsys.ntu-kpi.kiev.ua
X-Authenticator: plain
X-Sender-Verify: SUCCEEDED (sender exists & accepts mail)
X-Exim-Version: 4.63 (build at 28-Apr-2011 07:11:12)
X-Date: 2012-05-11 15:20:20
X-Connected-IP: 10.18.52.101:52027
X-Message-Linecount: 78
X-Body-Linecount: 62
X-Message-Size: 3530
X-Body-Size: 2804
Cc: freebsd-fs@freebsd.org
Subject: Re: NFSv4 Questions
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 11 May 2012 12:20:23 -0000

On Thu, May 10, 2012 at 04:34:15PM -0400, Rick Macklem wrote:
> Andrey Simonenko wrote:

> > > in RFC 3530. (It may be implied by the fact that NFSv4 uses a "user"
> > > based
> > > security model and not a "host" based one.)
> > >
> > > As such, the client should never need to "waste" a reserved port# on
> > > a NFSv4
> > > connection.
> > 
> > Since AUTH_SYS can be used in NFSv4 as well and according to RFC 3530
> > AUTH_SYS in NFSv4 has the same logic as in NFSv2/3, then
> > 
> > 1. Does "user" based security model mean RPCSEC_GSS?
> > 
> > 2. Does "host" based security model mean AUTH_SYS?
> > 
> My guess is that AUTH_SYS is not considered a security model at all,
> but the "authenticators" refer to users.

Probably I wrongly asked the question.  I did not mean that some security
flavor (eg. AUTH_SYS) is a security model.  I wanted to say that NFSv4
allows to use AUTH_SYS security flavor and user credentials are given
as is by client's machine, so some form of control by client's IP address
is required by the NFSv4 server if a client uses and is allowed to use
AUTH_SYS security flavor.  Actually this is specified in "16. Security
Considerations" from RFC 3530 and AUTH_SYS in NFSv4 is called << "classic"
model of machine authentication via IP address checking >>.

What do you think about the following idea about configuration?

1. The NFS server for NFSv2/3 clients allows to specify whether their
   MOUNT MNT, UMNT and UMNTALL RPC requests have to or do not have to come
   from reserved ports.

2. The NFS server for NFSv2/3/4 clients allows to specify whether their
   NFS RPC calls:
	a) do not have to come from reserved ports
	b) always have to come from reserved ports
	c) have to come from reserved ports if clients use AUTH_SYS.

3. By default reserved ports are not required for MOUNT RPC and
   NFS RPC calls.  Corresponding options can be used for entire file
   system and/or for single address specification.

First item obviously is checked in a user space and second item is checked
in the NFS server somewhere after VFS_CHECKEXP() when the server decides
which security flavor to use.

NetBSD already has -noresvmnt and -noresvport options in their exports(5).

> Personally, I agree with the working group and have always thought requiring
> a client to use a "reserved port#" was meaningless. However, I already noted
> that I don't mind enabling it, with a comment that it should not be required
> for NFSv4.

If a client machine is trusted, then reserved ports can guaranty that
requests come from privileged processes and not from user space where
client can fill any credentials in AUTH_SYS.  If client machine is not
trusted, then this will not work of course.  BTW mountd requires reserved
port and NFS server does not required reserved port by default.

From owner-freebsd-fs@FreeBSD.ORG  Fri May 11 15:52:28 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 4E55B1065672
	for <freebsd-fs@freebsd.org>; Fri, 11 May 2012 15:52:28 +0000 (UTC)
	(envelope-from gpalmer@freebsd.org)
Received: from noop.in-addr.com (mail.in-addr.com [IPv6:2001:470:8:162::1])
	by mx1.freebsd.org (Postfix) with ESMTP id 1AF4C8FC0C
	for <freebsd-fs@freebsd.org>; Fri, 11 May 2012 15:52:28 +0000 (UTC)
Received: from gjp by noop.in-addr.com with local (Exim 4.77 (FreeBSD))
	(envelope-from <gpalmer@freebsd.org>)
	id 1SSs8I-000GNF-1r; Fri, 11 May 2012 11:51:54 -0400
Date: Fri, 11 May 2012 11:51:53 -0400
From: Gary Palmer <gpalmer@freebsd.org>
To: Karl Oulmi <karl.oulmi@ibl.fr>
Message-ID: <20120511155153.GA31698@in-addr.com>
References: <4FACCAEB.8040401@ibl.fr>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4FACCAEB.8040401@ibl.fr>
X-SA-Exim-Connect-IP: <locally generated>
X-SA-Exim-Mail-From: gpalmer@freebsd.org
X-SA-Exim-Scanned: No (on noop.in-addr.com); SAEximRunCond expanded to false
Cc: freebsd-fs@freebsd.org
Subject: Re: Best practice for shared volume with iscsi Dell MD3200i ?
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 11 May 2012 15:52:28 -0000

On Fri, May 11, 2012 at 10:16:43AM +0200, Karl Oulmi wrote:
> Hi all,
> 
> I am trying to run two freebsd9 boxes with a 3.7 TO shared iscsi volume 
> on a MD3200i.
> 
> The goal is to run a "master" and a "slave" dovecot IMAP server with a 
> shared /home.
> 
> I created the shared partition like this :
> gpart create -s gpt /dev/da0
> gpart add -t freebsd-ufs /dev/da0
> newfs /dev/da0p1
> 
> Everything is working great on the "master" server, but when I'm trying 
> to mount the volume from the "slave" one, I have the following error :
> mount: /dev/da0p1 : Operation not permitted
> 
> The only way I have to successfully mount the share on the "slave" 
> server is to run a fsck -t ufs /dev/da0p1 and then do the mount.
> 
> Could anyone tell me what's wrong ?

UFS is not a cluster-aware filesystem.  You cannot mount it in
multiple places at the same time.  The best you can hope for in
that situation, short of developing a cluster-aware filesystem, is
to only mount the volume on the slave if the master fails.

Regards,

Gary

From owner-freebsd-fs@FreeBSD.ORG  Fri May 11 21:20:45 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id E60731065670
	for <freebsd-fs@freebsd.org>; Fri, 11 May 2012 21:20:45 +0000 (UTC)
	(envelope-from lists@hurricane-ridge.com)
Received: from mail-pz0-f54.google.com (mail-pz0-f54.google.com
	[209.85.210.54])
	by mx1.freebsd.org (Postfix) with ESMTP id B48038FC08
	for <freebsd-fs@freebsd.org>; Fri, 11 May 2012 21:20:45 +0000 (UTC)
Received: by dadv36 with SMTP id v36so4144504dad.13
	for <freebsd-fs@freebsd.org>; Fri, 11 May 2012 14:20:45 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=google.com; s=20120113;
	h=mime-version:x-originating-ip:in-reply-to:references:date
	:message-id:subject:from:to:cc:content-type:x-gm-message-state;
	bh=MCU4wzuib4VSpaCG4fjW420lc6gh9aPEqN5H5WOH5NQ=;
	b=mGE2rAQyJbD4Tur8L6mhcSWbolGv9w5GT8gp2aIqgMoUv+dMr01onVgSkLtyYg7Ujg
	RrAUw2AmpxDrViKLxj+CmKb/qeWOAJdsip+qQC6o0+e5rNGBf0sYKbgjFhaqOKRESRPP
	gmbU89FPn5kMhji7a38GPm8ZZqIcZ2ygu9hpx8h5UhFb8jZ04xdU729MQ0SsPqhv706n
	WH6iZVWla9ojtBTbp5MdKYSsmg6ykzDhpCUPqcFOt11t0auLBe8sPmWfFeGeY5JCCuBs
	qygtWpNMfsC5Q/+pS5dIOubeWRFoSgTIsozz9n3fslcTDgxsEezWikxUNMPoXUy49SDd
	fO2Q==
MIME-Version: 1.0
Received: by 10.68.231.195 with SMTP id ti3mr34901066pbc.96.1336771245287;
	Fri, 11 May 2012 14:20:45 -0700 (PDT)
Received: by 10.68.195.166 with HTTP; Fri, 11 May 2012 14:20:45 -0700 (PDT)
X-Originating-IP: [209.124.184.194]
In-Reply-To: <353146957.236642.1336684992583.JavaMail.root@erie.cs.uoguelph.ca>
References: <1446179418.236280.1336684418582.JavaMail.root@erie.cs.uoguelph.ca>
	<353146957.236642.1336684992583.JavaMail.root@erie.cs.uoguelph.ca>
Date: Fri, 11 May 2012 14:20:45 -0700
Message-ID: <CADUQDp-QHqXtRtTQfm4y7sEZhZeesR0=WBiUWP39XUzr92gUXg@mail.gmail.com>
From: Andrew Leonard <lists@hurricane-ridge.com>
To: Rick Macklem <rmacklem@uoguelph.ca>
Content-Type: text/plain; charset=ISO-8859-1
X-Gm-Message-State: ALoCoQnHvOhbcOnxHx0EAENoHxOlS0H4JyfF1nnvGoAdjtifQ7tb8u3uTRcTm0KpbP+vJOwkaMu5
Cc: freebsd-fs@freebsd.org
Subject: Re: Unable to set ACLs on ZFS file system over NFSv4?
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 11 May 2012 21:20:46 -0000

On Thu, May 10, 2012 at 2:23 PM, Rick Macklem <rmacklem@uoguelph.ca> wrote:

> I wrote:

>> If you capture a packet trace from before you do the NFSv4 mount, I
>> can
>> take a look and see what the server is saying. (Basically, at mount
>> time
>> a reply to a Getattr should including the supported attributes and
>> that
>> should include the ACL bit. Then the setfacl becomes a Setattr of the
>> ACL
>> attribute.)
>> # tcpdump -s 0 -w acl.pcap host <server>
>> - run on the client should do it
>>
>> If you want to look at it, use wireshark. If you want me to look, just
>> email acl.pcap as an attachment.
>>
>> rick
>> ps: Although I suspect it is the server that isn't behaving, please
>> use
>> the FreeBSD client for the above.
>> pss: I've cc'd trasz@ in case he can spot some reason why it wouldn't
>> work.
>>
> Oh, and make sure "user1" isn't in more than 16 groups, because that is the
> limit for AUTH_SYS. (I'm not sure what the effect of user1 being in more
> than 16 groups would be, but might as well eliminate it as a cause.)

Thanks, Rick - I'll send the pcap over private email, as I'm sure
$DAYJOB would consider it somewhat sensitive.

Looking in wireshark, if I'm reading it correctly, I don't see
anything for FATTR4_ACL in any replies.  On the final connection, I do
see NFS4ERR_IO set as the status for the reply to the setattr - but
from Googling, my understanding is that response is supposed to
indicate a hard error, such as a hardware problem.

Also, I have verified that "user1" is not a member of more than 16
groups, so we can rule that out - that user is in only three groups.

-Andy

From owner-freebsd-fs@FreeBSD.ORG  Fri May 11 21:50:17 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 7FCB4106564A
	for <freebsd-fs@hub.freebsd.org>; Fri, 11 May 2012 21:50:17 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 5F7E78FC1D
	for <freebsd-fs@hub.freebsd.org>; Fri, 11 May 2012 21:50:17 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q4BLoHbK097624
	for <freebsd-fs@freefall.freebsd.org>; Fri, 11 May 2012 21:50:17 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q4BLoHUD097623;
	Fri, 11 May 2012 21:50:17 GMT (envelope-from gnats)
Date: Fri, 11 May 2012 21:50:17 GMT
Message-Id: <201205112150.q4BLoHUD097623@freefall.freebsd.org>
To: freebsd-fs@FreeBSD.org
From: Jeff Kletsky <freebsd@wagsky.com>
Cc: 
Subject: Re: kern/167685: [zfs] ZFS on USB drive prevents shutdown / reboot
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: Jeff Kletsky <freebsd@wagsky.com>
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 11 May 2012 21:50:17 -0000

The following reply was made to PR kern/167685; it has been noted by GNATS.

From: Jeff Kletsky <freebsd@wagsky.com>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/167685: [zfs] ZFS on USB drive prevents shutdown / reboot
Date: Fri, 11 May 2012 14:41:03 -0700

 This is a multi-part message in MIME format.
 --------------020209050805030409070009
 Content-Type: text/plain; charset=ISO-8859-1; format=flowed
 Content-Transfer-Encoding: 7bit
 
 Problem can be replicated by booting of a "memstick" (with a "spare" USB 
 stick as /dev/da1) and then executing
 
 # dd if=/dev/zer of=/dev/da1 bs=64k
 # zpool create stick /dev/da1
 # reboot
 
 Problem has been reliably reproduced on the Atom 330 previously 
 mentioned, as well as on an AMD A8-3870 with A75 chipset. It also can be 
 replicated using VirtualBox running under Ubuntu on the AMD A8-3870 
 system. It does not seem specific to one "flavor" of USB controller or 
 driver.
 
 Using /usr/src/release/generate_release.sh and bisection, I have 
 confirmed that
 
 * r227445 does not exhibit the behavior ("Copy stable/9 to releng/9.0 as 
 part of the FreeBSD 9.0-RELEASE release cycle)
 * r229097 does not exhibit the behavior
 * r229281 -- FAIL by not rebooting under the conditions described above.
 
 Based on these results, I am suspicious of
 
 r229100 | hselasky | 2011-12-31 06:33:15 -0800 (Sat, 31 Dec 2011) | 6 lines
 
 MFC r228709, r228711 and r228723:
 - Add missing unlock of USB controller's lock, when
 doing shutdown, suspend and resume.
 - Add code to wait for USB shutdown to be executed at system shutdown.
 - Add sysctl which can be used to skip this waiting.
 
 
 as being what brought the issue to the forefront.
 
 I am presently building r229099 and r229100 to confirm this suspicion.
 
 A potential, though untested workaround would be
 # sysctl hw.usb.no_shutdown_wait=1
 
 
 --------------020209050805030409070009
 Content-Type: text/html; charset=ISO-8859-1
 Content-Transfer-Encoding: 7bit
 
 <html>
   <head>
 
     <meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
   </head>
   <body bgcolor="#FFFFFF" text="#000000">
     Problem can be replicated by booting of a "memstick" (with a "spare"
     USB stick as /dev/da1) and then executing<br>
     <br>
     # dd if=/dev/zer of=/dev/da1 bs=64k<br>
     # zpool create stick /dev/da1<br>
     # reboot<br>
     <br>
     Problem has been reliably reproduced on the Atom 330 previously
     mentioned, as well as on an AMD A8-3870 with A75 chipset. It also
     can be replicated using VirtualBox running under Ubuntu on the AMD
     A8-3870 system. It does not seem specific to one "flavor" of USB
     controller or driver.<br>
     <br>
     Using /usr/src/release/generate_release.sh and bisection, I have
     confirmed that<br>
     <br>
     * r227445 does not exhibit the behavior ("Copy stable/9 to releng/9.0
     as part of the FreeBSD 9.0-RELEASE release cycle)<br>
     * r229097 does not exhibit the behavior<br>
     * r229281 -- FAIL by not rebooting under the conditions described
     above.<br>
     <br>
     Based on these results, I am suspicious of <br>
     <br>
     <meta http-equiv="content-type" content="text/html;
       charset=ISO-8859-1">
     <pre class="alt2" dir="ltr" style="background-image: initial; background-attachment: initial; background-origin: initial; background-clip: initial; background-color: rgb(249, 249, 249); color: rgb(0, 0, 0); font-size: 13px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; margin-top: 0px; margin-right : 0px; margin-bottom: 0px; margin-left: 0px; padding-top: 6px; padding-right: 6px; padding-bottom: 6px; padding-left: 6px; border-top-width: 1px; border-right-width: 1px; border-bottom-width: 1px; border-left-width: 1px; border-top-style: inset; border-right-style: inset; border-bottom-style: inset; border-left-style: inset; border-color: initial; border-image: initial; width: auto; height: 130px; text-align: left; overflow-x: auto; overflow-y: auto; background-position: initial i
 ni
  t
 ial; background-repeat: initial initial; ">r229100 | hselasky | 2011-12-31 06:33:15 -0800 (Sat, 31 Dec 2011) | 6 lines
 
 MFC r228709, r228711 and r228723:
 - Add missing unlock of USB controller's lock, when
 doing shutdown, suspend and resume.
 - Add code to wait for USB shutdown to be executed at system shutdown.
 - Add sysctl which can be used to skip this waiting.</pre>
     <br>
     as being what brought the issue to the forefront.<br>
     <br>
     I am presently building r229099 and r229100 to confirm this
     suspicion.<br>
     <br>
     A potential, though untested workaround would be<br>
     # sysctl hw.usb.no_shutdown_wait=1<br>
     <br>
     <br>
   </body>
 </html>
 
 --------------020209050805030409070009--

From owner-freebsd-fs@FreeBSD.ORG  Fri May 11 22:32:16 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 10AEF106564A;
	Fri, 11 May 2012 22:32:16 +0000 (UTC) (envelope-from avg@FreeBSD.org)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
	by mx1.freebsd.org (Postfix) with ESMTP id 160C48FC0A;
	Fri, 11 May 2012 22:32:14 +0000 (UTC)
Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua
	[212.40.38.100])
	by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id BAA24015;
	Sat, 12 May 2012 01:32:12 +0300 (EEST)
	(envelope-from avg@FreeBSD.org)
Received: from localhost ([127.0.0.1])
	by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD))
	id 1SSyNf-00056b-Lu; Sat, 12 May 2012 01:32:11 +0300
Message-ID: <4FAD9368.5010008@FreeBSD.org>
Date: Sat, 12 May 2012 01:32:08 +0300
From: Andriy Gapon <avg@FreeBSD.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
	rv:12.0) Gecko/20120503 Thunderbird/12.0.1
MIME-Version: 1.0
To: freebsd-hackers@FreeBSD.org, freebsd-fs@FreeBSD.org
References: <4F8999D2.1080902@FreeBSD.org> <4F8E820B.6080400@FreeBSD.org>
In-Reply-To: <4F8E820B.6080400@FreeBSD.org>
X-Enigmail-Version: 1.5pre
Content-Type: text/plain; charset=x-viet-vps
Content-Transfer-Encoding: 7bit
Cc: 
Subject: Re: [review request] zfsboot/zfsloader: support accessing
 filesystems within a pool
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 11 May 2012 22:32:16 -0000


After all the preparatory changes are committed, this is a final[*]
notice/warning that I am going to start committing the following patchset really
soon now[**[:
http://people.freebsd.org/~avg/zfsboot.patches.9.diff

[*] unless circumstances change
[**] maybe next hour, even
-- 
Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG  Sat May 12 01:45:17 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A55B5106566B
	for <freebsd-fs@freebsd.org>; Sat, 12 May 2012 01:45:17 +0000 (UTC)
	(envelope-from rmacklem@uoguelph.ca)
Received: from esa-annu.mail.uoguelph.ca (esa-annu.mail.uoguelph.ca
	[131.104.91.36])
	by mx1.freebsd.org (Postfix) with ESMTP id 5D9B08FC14
	for <freebsd-fs@freebsd.org>; Sat, 12 May 2012 01:45:17 +0000 (UTC)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: Ap4EAI2/rU+DaFvO/2dsb2JhbABEhXmufoIVAQEEASNWBRYOCgICDRkCWQaIHAWoRJJLgS+JaIRwgRgElX2QQIMF
X-IronPort-AV: E=Sophos;i="4.75,574,1330923600"; d="scan'208";a="168933545"
Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca)
	([131.104.91.206])
	by esa-annu-pri.mail.uoguelph.ca with ESMTP; 11 May 2012 21:45:11 -0400
Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1])
	by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 32772B3EFE;
	Fri, 11 May 2012 21:45:11 -0400 (EDT)
Date: Fri, 11 May 2012 21:45:11 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Andrey Simonenko <simon@comsys.ntu-kpi.kiev.ua>
Message-ID: <1493074817.296570.1336787111152.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <20120511122020.GA13906@pm513-1.comsys.ntu-kpi.kiev.ua>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Originating-IP: [172.17.91.201]
X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692)
Cc: freebsd-fs@freebsd.org
Subject: Re: NFSv4 Questions
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 12 May 2012 01:45:17 -0000

Andrey Simonenko wrote:
> On Thu, May 10, 2012 at 04:34:15PM -0400, Rick Macklem wrote:
> > Andrey Simonenko wrote:
> 
> > > > in RFC 3530. (It may be implied by the fact that NFSv4 uses a
> > > > "user"
> > > > based
> > > > security model and not a "host" based one.)
> > > >
> > > > As such, the client should never need to "waste" a reserved
> > > > port# on
> > > > a NFSv4
> > > > connection.
> > >
> > > Since AUTH_SYS can be used in NFSv4 as well and according to RFC
> > > 3530
> > > AUTH_SYS in NFSv4 has the same logic as in NFSv2/3, then
> > >
> > > 1. Does "user" based security model mean RPCSEC_GSS?
> > >
> > > 2. Does "host" based security model mean AUTH_SYS?
> > >
> > My guess is that AUTH_SYS is not considered a security model at all,
> > but the "authenticators" refer to users.
> 
> Probably I wrongly asked the question. I did not mean that some
> security
> flavor (eg. AUTH_SYS) is a security model. I wanted to say that NFSv4
> allows to use AUTH_SYS security flavor and user credentials are given
> as is by client's machine, so some form of control by client's IP
> address
> is required by the NFSv4 server if a client uses and is allowed to use
> AUTH_SYS security flavor. Actually this is specified in "16. Security
> Considerations" from RFC 3530 and AUTH_SYS in NFSv4 is called <<
> "classic"
> model of machine authentication via IP address checking >>.
> 
> What do you think about the following idea about configuration?
> 
> 1. The NFS server for NFSv2/3 clients allows to specify whether their
> MOUNT MNT, UMNT and UMNTALL RPC requests have to or do not have to
> come
> from reserved ports.
> 
> 2. The NFS server for NFSv2/3/4 clients allows to specify whether
> their
> NFS RPC calls:
> a) do not have to come from reserved ports
> b) always have to come from reserved ports
> c) have to come from reserved ports if clients use AUTH_SYS.
> 
> 3. By default reserved ports are not required for MOUNT RPC and
> NFS RPC calls. Corresponding options can be used for entire file
> system and/or for single address specification.
> 
> First item obviously is checked in a user space and second item is
> checked
> in the NFS server somewhere after VFS_CHECKEXP() when the server
> decides
> which security flavor to use.
> 
One problem with this is that some NFSv4 operations do not have any
file handle and, as such, cannot be associated with any exported file
system. (I suppose you could add the export option for resvport to
the V4: line like I did with "-sec" for these operations, but it will
get messy.)

> NetBSD already has -noresvmnt and -noresvport options in their
> exports(5).
> 
I'll let others comment w.r.t. whether they have a need for this. To me,
unless others are saying "we need this", I don't see any reason to change
what is already there, except maybe optionally require a reserved port#
for NFSv4 mounts via a sysctl. I comment on this further down.

> > Personally, I agree with the working group and have always thought
> > requiring
> > a client to use a "reserved port#" was meaningless. However, I
> > already noted
> > that I don't mind enabling it, with a comment that it should not be
> > required
> > for NFSv4.
> 
> If a client machine is trusted, then reserved ports can guaranty that
> requests come from privileged processes and not from user space where
> client can fill any credentials in AUTH_SYS. If client machine is not
> trusted, then this will not work of course. BTW mountd requires
> reserved
> port and NFS server does not required reserved port by default.
Well, I agree that, if you have a client machine where "root" is secure
(no root kit vunerabilities, etc) but non-root users on this machine
would potentially run their own bogus userland NFS client, then requiring
a reserved port# does subvert the use of such a bogus NFS client.
(My concern is that some people will think that requiring a reserved port#
 makes NFS secure for other cases, like users with their own laptops/desktops.)

Personally, I think the above case is rare and that having another sysctl
vfs.nfsd.nfsv4_privport (similar to vfs.nfsd.nfs_privport) is sufficient,
but I'll let others comment on this, since it is not my decision.

rick

From owner-freebsd-fs@FreeBSD.ORG  Sat May 12 02:30:57 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id C192C106566C;
	Sat, 12 May 2012 02:30:57 +0000 (UTC)
	(envelope-from rmacklem@uoguelph.ca)
Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca
	[131.104.91.44])
	by mx1.freebsd.org (Postfix) with ESMTP id 629888FC0A;
	Sat, 12 May 2012 02:30:57 +0000 (UTC)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: Ap4EAPzKrU+DaFvO/2dsb2JhbABEhXmufoIOBwEBBAEjVgUWDgoRGQIEVQYThUkHgjkFqEySSIsXFIRcgRgEjneHBpBAgwWBOwg
X-IronPort-AV: E=Sophos;i="4.75,574,1330923600"; d="scan'208";a="171638749"
Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca)
	([131.104.91.206])
	by esa-jnhn-pri.mail.uoguelph.ca with ESMTP; 11 May 2012 22:30:51 -0400
Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1])
	by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 22663B3F86;
	Fri, 11 May 2012 22:30:51 -0400 (EDT)
Date: Fri, 11 May 2012 22:30:51 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Andrew Leonard <lists@hurricane-ridge.com>
Message-ID: <1831201709.296992.1336789851115.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <CADUQDp-QHqXtRtTQfm4y7sEZhZeesR0=WBiUWP39XUzr92gUXg@mail.gmail.com>
MIME-Version: 1.0
Content-Type: multipart/mixed; 
	boundary="----=_Part_296991_491013469.1336789851113"
X-Originating-IP: [172.17.91.201]
X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692)
Cc: freebsd-fs@freebsd.org
Subject: Re: Unable to set ACLs on ZFS file system over NFSv4?
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 12 May 2012 02:30:57 -0000

------=_Part_296991_491013469.1336789851113
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

Andrew Leonard wrote:
> On Thu, May 10, 2012 at 2:23 PM, Rick Macklem <rmacklem@uoguelph.ca>
> wrote:
> 
> > I wrote:
> 
> >> If you capture a packet trace from before you do the NFSv4 mount, I
> >> can
> >> take a look and see what the server is saying. (Basically, at mount
> >> time
> >> a reply to a Getattr should including the supported attributes and
> >> that
> >> should include the ACL bit. Then the setfacl becomes a Setattr of
> >> the
> >> ACL
> >> attribute.)
> >> # tcpdump -s 0 -w acl.pcap host <server>
> >> - run on the client should do it
> >>
> >> If you want to look at it, use wireshark. If you want me to look,
> >> just
> >> email acl.pcap as an attachment.
> >>
> >> rick
> >> ps: Although I suspect it is the server that isn't behaving, please
> >> use
> >> the FreeBSD client for the above.
> >> pss: I've cc'd trasz@ in case he can spot some reason why it
> >> wouldn't
> >> work.
> >>
> > Oh, and make sure "user1" isn't in more than 16 groups, because that
> > is the
> > limit for AUTH_SYS. (I'm not sure what the effect of user1 being in
> > more
> > than 16 groups would be, but might as well eliminate it as a cause.)
> 
> Thanks, Rick - I'll send the pcap over private email, as I'm sure
> $DAYJOB would consider it somewhat sensitive.
> 
> Looking in wireshark, if I'm reading it correctly, I don't see
> anything for FATTR4_ACL in any replies. On the final connection, I do
> see NFS4ERR_IO set as the status for the reply to the setattr - but
> from Googling, my understanding is that response is supposed to
> indicate a hard error, such as a hardware problem.
> 
Yep, it appears that ZFS returned an error that isn't in the list of
replies for getattr, so it got mapped to EIO (the catch all for error
codes not known to NFS).

I took a quick look at the ZFS code and the problem looks pretty
obvious. ZFS replies EOPNOTSUPP to the VOP_ACLCHECK() and that's
as far as it gets.

Please try the attached patch in the server (untested, but all it does is go ahead
and try the VOP_SETACL() for the case where VOP_ACLCHECK() replies
EOPNOTSUPP) and let me know if it helps.

Thanks for reporting this and sending the packet trace, rick

> Also, I have verified that "user1" is not a member of more than 16
> groups, so we can rule that out - that user is in only three groups.
> 
> -Andy

------=_Part_296991_491013469.1336789851113
Content-Type: text/x-patch; name=zfs-acl.patch
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename=zfs-acl.patch

LS0tIGZzL25mcy9uZnNfY29tbW9uYWNsLmMub3JpZwkyMDEyLTA1LTExIDIyOjE5OjMyLjAwMDAw
MDAwMCAtMDQwMAorKysgZnMvbmZzL25mc19jb21tb25hY2wuYwkyMDEyLTA1LTExIDIyOjIwOjA5
LjAwMDAwMDAwMCAtMDQwMApAQCAtNDY5LDcgKzQ2OSw3IEBAIG5mc3J2X3NldGFjbCh2bm9kZV90
IHZwLCBORlNBQ0xfVCAqYWNscCwKIAkJZ290byBvdXQ7CiAJfQogCWVycm9yID0gVk9QX0FDTENI
RUNLKHZwLCBBQ0xfVFlQRV9ORlM0LCBhY2xwLCBjcmVkLCBwKTsKLQlpZiAoIWVycm9yKQorCWlm
IChlcnJvciA9PSAwIHx8IGVycm9yID09IEVPUE5PVFNVUFApCiAJCWVycm9yID0gVk9QX1NFVEFD
TCh2cCwgQUNMX1RZUEVfTkZTNCwgYWNscCwgY3JlZCwgcCk7CiAKIG91dDoK
------=_Part_296991_491013469.1336789851113--

From owner-freebsd-fs@FreeBSD.ORG  Sat May 12 09:22:35 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id AA1AD106566C
	for <freebsd-fs@FreeBSD.org>; Sat, 12 May 2012 09:22:35 +0000 (UTC)
	(envelope-from avg@FreeBSD.org)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
	by mx1.freebsd.org (Postfix) with ESMTP id EB4EA8FC08
	for <freebsd-fs@FreeBSD.org>; Sat, 12 May 2012 09:22:34 +0000 (UTC)
Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua
	[212.40.38.100])
	by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id MAA29703;
	Sat, 12 May 2012 12:22:26 +0300 (EEST)
	(envelope-from avg@FreeBSD.org)
Received: from localhost ([127.0.0.1])
	by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD))
	id 1ST8Ww-00099F-If; Sat, 12 May 2012 12:22:26 +0300
Message-ID: <4FAE2BD1.9060002@FreeBSD.org>
Date: Sat, 12 May 2012 12:22:25 +0300
From: Andriy Gapon <avg@FreeBSD.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
	rv:12.0) Gecko/20120503 Thunderbird/12.0.1
MIME-Version: 1.0
To: Florian Wagner <florian@wagner-flo.net>
References: <20111015214347.09f68e4e@naclador.mos32.de>
	<4E9ACA9F.5090308@FreeBSD.org>
	<20111019082139.1661868e@auedv3.syscomp.de>
	<4E9EEF45.9020404@FreeBSD.org>
	<20111019182130.27446750@naclador.mos32.de>
	<4EB98E05.4070900@FreeBSD.org>
	<20111119211921.7ffa9953@naclador.mos32.de>
	<4EC8CD14.4040600@FreeBSD.org>
	<20111120121248.5e9773c8@naclador.mos32.de>
	<4EC91B36.7060107@FreeBSD.org>
	<20111120191018.1aa4e882@naclador.mos32.de>
	<4ECA2DBD.5040701@FreeBSD.org>
	<20111121201332.03ecadf1@naclador.mos32.de>
	<4ECAC272.5080500@FreeBSD.org> <4ECEBD44.6090900@FreeBSD.org>
	<20111125224722.6cf3a299@naclador.mos32.de>
	<4ED0CFF9.4030503@FreeBSD.org>
	<20111126134927.60fe5097@naclador.mos32.de>
	<4ED35326.80402@FreeBSD.org>
	<20120109122011.0ae6ad70@naclador.mos32.de>
In-Reply-To: <20120109122011.0ae6ad70@naclador.mos32.de>
X-Enigmail-Version: 1.5pre
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@FreeBSD.org
Subject: Re: Extending zfsboot.c to allow selecting filesystem from
	boot.config
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 12 May 2012 09:22:35 -0000

on 09/01/2012 13:20 Florian Wagner said the following:
> 
> Do you currently have any plans to merge any of that into stable-9 or 
> stable-8?

I have just committed the code to head.
MFC timer is set to 1 month.

-- 
Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG  Sat May 12 10:06:18 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 979C3106566C
	for <freebsd-fs@freebsd.org>; Sat, 12 May 2012 10:06:18 +0000 (UTC)
	(envelope-from ronald-freebsd8@klop.yi.org)
Received: from cpsmtpb-ews09.kpnxchange.com (cpsmtpb-ews09.kpnxchange.com
	[213.75.39.14]) by mx1.freebsd.org (Postfix) with ESMTP id 229218FC08
	for <freebsd-fs@freebsd.org>; Sat, 12 May 2012 10:06:17 +0000 (UTC)
Received: from cpsps-ews08.kpnxchange.com ([10.94.84.175]) by
	cpsmtpb-ews09.kpnxchange.com with Microsoft SMTPSVC(6.0.3790.4675); 
	Sat, 12 May 2012 12:05:10 +0200
Received: from CPSMTPM-TLF103.kpnxchange.com ([195.121.3.6]) by
	cpsps-ews08.kpnxchange.com with Microsoft SMTPSVC(7.5.7601.17514); 
	Sat, 12 May 2012 12:05:10 +0200
Received: from sjakie.klop.ws ([212.182.167.131]) by
	CPSMTPM-TLF103.kpnxchange.com with Microsoft
	SMTPSVC(7.5.7601.17514); Sat, 12 May 2012 12:05:09 +0200
Received: from 212-182-167-131.ip.telfort.nl (localhost [127.0.0.1])
	by sjakie.klop.ws (Postfix) with ESMTP id 9C2191157A
	for <freebsd-fs@freebsd.org>; Sat, 12 May 2012 12:05:09 +0200 (CEST)
Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes
To: freebsd-fs@freebsd.org
References: <4FACCAEB.8040401@ibl.fr> <20120511155153.GA31698@in-addr.com>
Date: Sat, 12 May 2012 12:05:09 +0200
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
From: "Ronald Klop" <ronald-freebsd8@klop.yi.org>
Message-ID: <op.wd6wyv1f8527sy@212-182-167-131.ip.telfort.nl>
In-Reply-To: <20120511155153.GA31698@in-addr.com>
User-Agent: Opera Mail/11.62 (FreeBSD)
X-OriginalArrivalTime: 12 May 2012 10:05:09.0797 (UTC)
	FILETIME=[B6D5D550:01CD3026]
X-RcptDomain: freebsd.org
Subject: Re: Best practice for shared volume with iscsi Dell MD3200i ?
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 12 May 2012 10:06:18 -0000

On Fri, 11 May 2012 17:51:53 +0200, Gary Palmer <gpalmer@freebsd.org>  
wrote:

> On Fri, May 11, 2012 at 10:16:43AM +0200, Karl Oulmi wrote:
>> Hi all,
>>
>> I am trying to run two freebsd9 boxes with a 3.7 TO shared iscsi volume
>> on a MD3200i.
>>
>> The goal is to run a "master" and a "slave" dovecot IMAP server with a
>> shared /home.
>>
>> I created the shared partition like this :
>> gpart create -s gpt /dev/da0
>> gpart add -t freebsd-ufs /dev/da0
>> newfs /dev/da0p1
>>
>> Everything is working great on the "master" server, but when I'm trying
>> to mount the volume from the "slave" one, I have the following error :
>> mount: /dev/da0p1 : Operation not permitted
>>
>> The only way I have to successfully mount the share on the "slave"
>> server is to run a fsck -t ufs /dev/da0p1 and then do the mount.
>>
>> Could anyone tell me what's wrong ?
>
> UFS is not a cluster-aware filesystem.  You cannot mount it in
> multiple places at the same time.  The best you can hope for in
> that situation, short of developing a cluster-aware filesystem, is
> to only mount the volume on the slave if the master fails.
>
> Regards,
>
> Gary

Or use NFS.

From owner-freebsd-fs@FreeBSD.ORG  Sat May 12 10:08:28 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id CE434106564A
	for <freebsd-fs@freebsd.org>; Sat, 12 May 2012 10:08:28 +0000 (UTC)
	(envelope-from ronald-freebsd8@klop.yi.org)
Received: from cpsmtpb-ews02.kpnxchange.com (cpsmtpb-ews02.kpnxchange.com
	[213.75.39.5]) by mx1.freebsd.org (Postfix) with ESMTP id 597AD8FC0C
	for <freebsd-fs@freebsd.org>; Sat, 12 May 2012 10:08:28 +0000 (UTC)
Received: from cpsps-ews12.kpnxchange.com ([10.94.84.179]) by
	cpsmtpb-ews02.kpnxchange.com with Microsoft SMTPSVC(6.0.3790.4675); 
	Sat, 12 May 2012 12:07:21 +0200
Received: from CPSMTPM-TLF103.kpnxchange.com ([195.121.3.6]) by
	cpsps-ews12.kpnxchange.com with Microsoft SMTPSVC(7.5.7601.17514); 
	Sat, 12 May 2012 12:07:21 +0200
Received: from sjakie.klop.ws ([212.182.167.131]) by
	CPSMTPM-TLF103.kpnxchange.com with Microsoft
	SMTPSVC(7.5.7601.17514); Sat, 12 May 2012 12:07:20 +0200
Received: from 212-182-167-131.ip.telfort.nl (localhost [127.0.0.1])
	by sjakie.klop.ws (Postfix) with ESMTP id AB4371157F
	for <freebsd-fs@freebsd.org>; Sat, 12 May 2012 12:07:20 +0200 (CEST)
Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes
To: freebsd-fs@freebsd.org
References: <201205112150.q4BLoHUD097623@freefall.freebsd.org>
Date: Sat, 12 May 2012 12:07:20 +0200
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
From: "Ronald Klop" <ronald-freebsd8@klop.yi.org>
Message-ID: <op.wd6w2ib88527sy@212-182-167-131.ip.telfort.nl>
In-Reply-To: <201205112150.q4BLoHUD097623@freefall.freebsd.org>
User-Agent: Opera Mail/11.62 (FreeBSD)
X-OriginalArrivalTime: 12 May 2012 10:07:20.0751 (UTC)
	FILETIME=[04E3D3F0:01CD3027]
X-RcptDomain: freebsd.org
Subject: Re: kern/167685: [zfs] ZFS on USB drive prevents shutdown / reboot
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 12 May 2012 10:08:28 -0000

On Fri, 11 May 2012 23:50:17 +0200, Jeff Kletsky <freebsd@wagsky.com>  
wrote:

> The following reply was made to PR kern/167685; it has been noted by  
> GNATS.
>
> From: Jeff Kletsky <freebsd@wagsky.com>
> To: bug-followup@FreeBSD.org
> Cc:
> Subject: Re: kern/167685: [zfs] ZFS on USB drive prevents shutdown /  
> reboot
> Date: Fri, 11 May 2012 14:41:03 -0700
>
>  This is a multi-part message in MIME format.
>  --------------020209050805030409070009
>  Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>  Content-Transfer-Encoding: 7bit
> Problem can be replicated by booting of a "memstick" (with a "spare" USB
>  stick as /dev/da1) and then executing
> # dd if=/dev/zer of=/dev/da1 bs=64k
>  # zpool create stick /dev/da1
>  # reboot
> Problem has been reliably reproduced on the Atom 330 previously
>  mentioned, as well as on an AMD A8-3870 with A75 chipset. It also can be
>  replicated using VirtualBox running under Ubuntu on the AMD A8-3870
>  system. It does not seem specific to one "flavor" of USB controller or
>  driver.
> Using /usr/src/release/generate_release.sh and bisection, I have
>  confirmed that
> * r227445 does not exhibit the behavior ("Copy stable/9 to releng/9.0 as
>  part of the FreeBSD 9.0-RELEASE release cycle)
>  * r229097 does not exhibit the behavior
>  * r229281 -- FAIL by not rebooting under the conditions described above.
> Based on these results, I am suspicious of
> r229100 | hselasky | 2011-12-31 06:33:15 -0800 (Sat, 31 Dec 2011) | 6  
> lines
> MFC r228709, r228711 and r228723:
>  - Add missing unlock of USB controller's lock, when
>  doing shutdown, suspend and resume.
>  - Add code to wait for USB shutdown to be executed at system shutdown.
>  - Add sysctl which can be used to skip this waiting.
> as being what brought the issue to the forefront.
> I am presently building r229099 and r229100 to confirm this suspicion.


> A potential, though untested workaround would be
>  # sysctl hw.usb.no_shutdown_wait=1

I had/have the same problem with ZFS on my external USB backup-disk. I use  
that sysctl since and can confirm that it works.

From owner-freebsd-fs@FreeBSD.ORG  Sat May 12 11:18:48 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id CCEC1106566B
	for <freebsd-fs@freebsd.org>; Sat, 12 May 2012 11:18:48 +0000 (UTC)
	(envelope-from araujobsdport@gmail.com)
Received: from mail-ob0-f182.google.com (mail-ob0-f182.google.com
	[209.85.214.182])
	by mx1.freebsd.org (Postfix) with ESMTP id 8E7128FC08
	for <freebsd-fs@freebsd.org>; Sat, 12 May 2012 11:18:48 +0000 (UTC)
Received: by obcni5 with SMTP id ni5so6022772obc.13
	for <freebsd-fs@freebsd.org>; Sat, 12 May 2012 04:18:48 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
	h=mime-version:reply-to:in-reply-to:references:date:message-id
	:subject:from:to:cc:content-type;
	bh=TasOHjdSopeb8mAf5CEQ1u22xi+5MlW/ZSKeaLVmJVQ=;
	b=s0boO1FQbR6SZMcybhXSdIZJwaOTW4Acz4a5IzeacePsY5Rs6dpRdSeNG6ctHIaWna
	C5BqKffpNi1RCHVkyn81dWEClEJ6MOmmzmf8nU574kf4GGW1xWXiIhQZ6R+rqLY+EVWd
	x/laNtzwfYcQeh/PRB7piXEH4drifHmPHucPCB8VhHAXJqow0Mg1n4tn/mKTxLzyb+ha
	RcHltylSHmda5EQW8JKadSzeKPmdyES5IdaA59ifHVL9fD3S7gZm0CT486gisd0vTenH
	1HeV/H1akE1AvjP5Khrvn9WtUFn5Z7ItaD4xYgrVH+VMtvhFEPkFby/Si4tIr2SxRg7t
	h9vw==
MIME-Version: 1.0
Received: by 10.50.191.233 with SMTP id hb9mr674988igc.44.1336821528129; Sat,
	12 May 2012 04:18:48 -0700 (PDT)
Received: by 10.231.31.196 with HTTP; Sat, 12 May 2012 04:18:48 -0700 (PDT)
In-Reply-To: <op.wd6wyv1f8527sy@212-182-167-131.ip.telfort.nl>
References: <4FACCAEB.8040401@ibl.fr> <20120511155153.GA31698@in-addr.com>
	<op.wd6wyv1f8527sy@212-182-167-131.ip.telfort.nl>
Date: Sat, 12 May 2012 19:18:48 +0800
Message-ID: <CAOfEmZjZp5u_V8Cc7yXZX_dy=jq3T5oq5g-w4rDeXyq-8yOZZw@mail.gmail.com>
From: Marcelo Araujo <araujobsdport@gmail.com>
To: Ronald Klop <ronald-freebsd8@klop.yi.org>
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Cc: freebsd-fs@freebsd.org
Subject: Re: Best practice for shared volume with iscsi Dell MD3200i ?
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: araujo@FreeBSD.org
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 12 May 2012 11:18:48 -0000

2012/5/12 Ronald Klop <ronald-freebsd8@klop.yi.org>

> On Fri, 11 May 2012 17:51:53 +0200, Gary Palmer <gpalmer@freebsd.org>
> wrote:
>
>  On Fri, May 11, 2012 at 10:16:43AM +0200, Karl Oulmi wrote:
>>
>>> Hi all,
>>>
>>> I am trying to run two freebsd9 boxes with a 3.7 TO shared iscsi volume
>>> on a MD3200i.
>>>
>>> The goal is to run a "master" and a "slave" dovecot IMAP server with a
>>> shared /home.
>>>
>>> I created the shared partition like this :
>>> gpart create -s gpt /dev/da0
>>> gpart add -t freebsd-ufs /dev/da0
>>> newfs /dev/da0p1
>>>
>>> Everything is working great on the "master" server, but when I'm trying
>>> to mount the volume from the "slave" one, I have the following error :
>>> mount: /dev/da0p1 : Operation not permitted
>>>
>>> The only way I have to successfully mount the share on the "slave"
>>> server is to run a fsck -t ufs /dev/da0p1 and then do the mount.
>>>
>>> Could anyone tell me what's wrong ?
>>>
>>
>> UFS is not a cluster-aware filesystem.  You cannot mount it in
>> multiple places at the same time.  The best you can hope for in
>> that situation, short of developing a cluster-aware filesystem, is
>> to only mount the volume on the slave if the master fails.
>>
>> Regards,
>>
>> Gary
>>
>
>

Just some questions!
Both machines share access to the same DISKS? I mean, both machines can see
all disks?
If yes, you could use DEVD to detect some kind of fail like CARP or
something else and than, do some action like mount the disks on slave and
so on.

Currently I have this solution and works pretty well.

Best Regards,
-- 
Marcelo Araujo
araujo@FreeBSD.org

From owner-freebsd-fs@FreeBSD.ORG  Sat May 12 12:10:12 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 9D21F106564A
	for <freebsd-fs@hub.freebsd.org>; Sat, 12 May 2012 12:10:12 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 875498FC17
	for <freebsd-fs@hub.freebsd.org>; Sat, 12 May 2012 12:10:12 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q4CCACpf043067
	for <freebsd-fs@freefall.freebsd.org>; Sat, 12 May 2012 12:10:12 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q4CCACtL043066;
	Sat, 12 May 2012 12:10:12 GMT (envelope-from gnats)
Date: Sat, 12 May 2012 12:10:12 GMT
Message-Id: <201205121210.q4CCACtL043066@freefall.freebsd.org>
To: freebsd-fs@FreeBSD.org
From: dfilter@FreeBSD.ORG (dfilter service)
Cc: 
Subject: Re: kern/165923: commit references a PR
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: dfilter service <dfilter@FreeBSD.ORG>
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 12 May 2012 12:10:12 -0000

The following reply was made to PR kern/165923; it has been noted by GNATS.

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/165923: commit references a PR
Date: Sat, 12 May 2012 12:03:08 +0000 (UTC)

 Author: rmacklem
 Date: Sat May 12 12:02:51 2012
 New Revision: 235332
 URL: http://svn.freebsd.org/changeset/base/235332
 
 Log:
   PR# 165923 reported intermittent write failures for dirty
   memory mapped pages being written back on an NFS mount.
   Since any thread can call VOP_PUTPAGES() to write back a
   dirty page, the credentials of that thread may not have
   write access to the file on an NFS server. (Often the uid
   is 0, which may be mapped to "nobody" in the NFS server.)
   Although there is no completely correct fix for this
   (NFS servers check access on every write RPC instead of at
   open/mmap time), this patch avoids the common cases by
   holding onto a credential that recently opened the file
   for writing and uses that credential for the write RPCs
   being done by VOP_PUTPAGES() for both NFS clients.
   
   Tested by:	Joel Ray Holveck (joelh at juniper.net)
   PR:		kern/165923
   Reviewed by:	kib
   MFC after:	2 weeks
 
 Modified:
   head/sys/fs/nfsclient/nfs_clbio.c
   head/sys/fs/nfsclient/nfs_clnode.c
   head/sys/fs/nfsclient/nfs_clvnops.c
   head/sys/fs/nfsclient/nfsnode.h
   head/sys/nfsclient/nfs_bio.c
   head/sys/nfsclient/nfs_node.c
   head/sys/nfsclient/nfs_vnops.c
   head/sys/nfsclient/nfsnode.h
 
 Modified: head/sys/fs/nfsclient/nfs_clbio.c
 ==============================================================================
 --- head/sys/fs/nfsclient/nfs_clbio.c	Sat May 12 10:53:49 2012	(r235331)
 +++ head/sys/fs/nfsclient/nfs_clbio.c	Sat May 12 12:02:51 2012	(r235332)
 @@ -281,7 +281,11 @@ ncl_putpages(struct vop_putpages_args *a
  	vp = ap->a_vp;
  	np = VTONFS(vp);
  	td = curthread;				/* XXX */
 -	cred = curthread->td_ucred;		/* XXX */
 +	/* Set the cred to n_writecred for the write rpcs. */
 +	if (np->n_writecred != NULL)
 +		cred = crhold(np->n_writecred);
 +	else
 +		cred = crhold(curthread->td_ucred);	/* XXX */
  	nmp = VFSTONFS(vp->v_mount);
  	pages = ap->a_m;
  	count = ap->a_count;
 @@ -345,6 +349,7 @@ ncl_putpages(struct vop_putpages_args *a
  	    iomode = NFSWRITE_FILESYNC;
  
  	error = ncl_writerpc(vp, &uio, cred, &iomode, &must_commit, 0);
 +	crfree(cred);
  
  	pmap_qremove(kva, npages);
  	relpbuf(bp, &ncl_pbuf_freecnt);
 
 Modified: head/sys/fs/nfsclient/nfs_clnode.c
 ==============================================================================
 --- head/sys/fs/nfsclient/nfs_clnode.c	Sat May 12 10:53:49 2012	(r235331)
 +++ head/sys/fs/nfsclient/nfs_clnode.c	Sat May 12 12:02:51 2012	(r235332)
 @@ -300,6 +300,8 @@ ncl_reclaim(struct vop_reclaim_args *ap)
  			FREE((caddr_t)dp2, M_NFSDIROFF);
  		}
  	}
 +	if (np->n_writecred != NULL)
 +		crfree(np->n_writecred);
  	FREE((caddr_t)np->n_fhp, M_NFSFH);
  	if (np->n_v4 != NULL)
  		FREE((caddr_t)np->n_v4, M_NFSV4NODE);
 
 Modified: head/sys/fs/nfsclient/nfs_clvnops.c
 ==============================================================================
 --- head/sys/fs/nfsclient/nfs_clvnops.c	Sat May 12 10:53:49 2012	(r235331)
 +++ head/sys/fs/nfsclient/nfs_clvnops.c	Sat May 12 12:02:51 2012	(r235332)
 @@ -513,6 +513,7 @@ nfs_open(struct vop_open_args *ap)
  	struct vattr vattr;
  	int error;
  	int fmode = ap->a_mode;
 +	struct ucred *cred;
  
  	if (vp->v_type != VREG && vp->v_type != VDIR && vp->v_type != VLNK)
  		return (EOPNOTSUPP);
 @@ -604,7 +605,22 @@ nfs_open(struct vop_open_args *ap)
  		}
  		np->n_directio_opens++;
  	}
 +
 +	/*
 +	 * If this is an open for writing, capture a reference to the
 +	 * credentials, so they can be used by ncl_putpages(). Using
 +	 * these write credentials is preferable to the credentials of
 +	 * whatever thread happens to be doing the VOP_PUTPAGES() since
 +	 * the write RPCs are less likely to fail with EACCES.
 +	 */
 +	if ((fmode & FWRITE) != 0) {
 +		cred = np->n_writecred;
 +		np->n_writecred = crhold(ap->a_cred);
 +	} else
 +		cred = NULL;
  	mtx_unlock(&np->n_mtx);
 +	if (cred != NULL)
 +		crfree(cred);
  	vnode_create_vobject(vp, vattr.va_size, ap->a_td);
  	return (0);
  }
 
 Modified: head/sys/fs/nfsclient/nfsnode.h
 ==============================================================================
 --- head/sys/fs/nfsclient/nfsnode.h	Sat May 12 10:53:49 2012	(r235331)
 +++ head/sys/fs/nfsclient/nfsnode.h	Sat May 12 12:02:51 2012	(r235332)
 @@ -123,6 +123,7 @@ struct nfsnode {
  	int                     n_directio_asyncwr;
  	u_int64_t		 n_change;	/* old Change attribute */
  	struct nfsv4node	*n_v4;		/* extra V4 stuff */
 +	struct ucred		*n_writecred;	/* Cred. for putpages */
  };
  
  #define	n_atim		n_un1.nf_atim
 
 Modified: head/sys/nfsclient/nfs_bio.c
 ==============================================================================
 --- head/sys/nfsclient/nfs_bio.c	Sat May 12 10:53:49 2012	(r235331)
 +++ head/sys/nfsclient/nfs_bio.c	Sat May 12 12:02:51 2012	(r235332)
 @@ -275,7 +275,11 @@ nfs_putpages(struct vop_putpages_args *a
  	vp = ap->a_vp;
  	np = VTONFS(vp);
  	td = curthread;				/* XXX */
 -	cred = curthread->td_ucred;		/* XXX */
 +	/* Set the cred to n_writecred for the write rpcs. */
 +	if (np->n_writecred != NULL)
 +		cred = crhold(np->n_writecred);
 +	else
 +		cred = crhold(curthread->td_ucred);	/* XXX */
  	nmp = VFSTONFS(vp->v_mount);
  	pages = ap->a_m;
  	count = ap->a_count;
 @@ -339,6 +343,7 @@ nfs_putpages(struct vop_putpages_args *a
  	    iomode = NFSV3WRITE_FILESYNC;
  
  	error = (nmp->nm_rpcops->nr_writerpc)(vp, &uio, cred, &iomode, &must_commit);
 +	crfree(cred);
  
  	pmap_qremove(kva, npages);
  	relpbuf(bp, &nfs_pbuf_freecnt);
 
 Modified: head/sys/nfsclient/nfs_node.c
 ==============================================================================
 --- head/sys/nfsclient/nfs_node.c	Sat May 12 10:53:49 2012	(r235331)
 +++ head/sys/nfsclient/nfs_node.c	Sat May 12 12:02:51 2012	(r235332)
 @@ -270,6 +270,8 @@ nfs_reclaim(struct vop_reclaim_args *ap)
  			free((caddr_t)dp2, M_NFSDIROFF);
  		}
  	}
 +	if (np->n_writecred != NULL)
 +		crfree(np->n_writecred);
  	if (np->n_fhsize > NFS_SMALLFH) {
  		free((caddr_t)np->n_fhp, M_NFSBIGFH);
  	}
 
 Modified: head/sys/nfsclient/nfs_vnops.c
 ==============================================================================
 --- head/sys/nfsclient/nfs_vnops.c	Sat May 12 10:53:49 2012	(r235331)
 +++ head/sys/nfsclient/nfs_vnops.c	Sat May 12 12:02:51 2012	(r235332)
 @@ -507,6 +507,7 @@ nfs_open(struct vop_open_args *ap)
  	struct vattr vattr;
  	int error;
  	int fmode = ap->a_mode;
 +	struct ucred *cred;
  
  	if (vp->v_type != VREG && vp->v_type != VDIR && vp->v_type != VLNK)
  		return (EOPNOTSUPP);
 @@ -563,7 +564,22 @@ nfs_open(struct vop_open_args *ap)
  		}
  		np->n_directio_opens++;
  	}
 +
 +	/*
 +	 * If this is an open for writing, capture a reference to the
 +	 * credentials, so they can be used by nfs_putpages(). Using
 +	 * these write credentials is preferable to the credentials of
 +	 * whatever thread happens to be doing the VOP_PUTPAGES() since
 +	 * the write RPCs are less likely to fail with EACCES.
 +	 */
 +	if ((fmode & FWRITE) != 0) {
 +		cred = np->n_writecred;
 +		np->n_writecred = crhold(ap->a_cred);
 +	} else
 +		cred = NULL;
  	mtx_unlock(&np->n_mtx);
 +	if (cred != NULL)
 +		crfree(cred);
  	vnode_create_vobject(vp, vattr.va_size, ap->a_td);
  	return (0);
  }
 
 Modified: head/sys/nfsclient/nfsnode.h
 ==============================================================================
 --- head/sys/nfsclient/nfsnode.h	Sat May 12 10:53:49 2012	(r235331)
 +++ head/sys/nfsclient/nfsnode.h	Sat May 12 12:02:51 2012	(r235332)
 @@ -128,6 +128,7 @@ struct nfsnode {
  	uint32_t		n_namelen;
  	int			n_directio_opens;
  	int                     n_directio_asyncwr;
 +	struct ucred		*n_writecred;	/* Cred. for putpages */
  };
  
  #define n_atim		n_un1.nf_atim
 _______________________________________________
 svn-src-all@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/svn-src-all
 To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
 

From owner-freebsd-fs@FreeBSD.ORG  Sat May 12 17:40:13 2012
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 0F477106564A
	for <freebsd-fs@hub.freebsd.org>; Sat, 12 May 2012 17:40:13 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id D4AEA8FC0A
	for <freebsd-fs@hub.freebsd.org>; Sat, 12 May 2012 17:40:12 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q4CHeC4b083501
	for <freebsd-fs@freefall.freebsd.org>; Sat, 12 May 2012 17:40:12 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q4CHeCRD083500;
	Sat, 12 May 2012 17:40:12 GMT (envelope-from gnats)
Date: Sat, 12 May 2012 17:40:12 GMT
Message-Id: <201205121740.q4CHeCRD083500@freefall.freebsd.org>
To: freebsd-fs@FreeBSD.org
From: Jeff Kletsky <freebsd@wagsky.com>
Cc: 
Subject: Re: kern/167685: [zfs] ZFS on USB drive prevents shutdown / reboot
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: Jeff Kletsky <freebsd@wagsky.com>
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 12 May 2012 17:40:13 -0000

The following reply was made to PR kern/167685; it has been noted by GNATS.

From: Jeff Kletsky <freebsd@wagsky.com>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/167685: [zfs] ZFS on USB drive prevents shutdown / reboot
Date: Sat, 12 May 2012 10:30:26 -0700

 Not surprisingly:
 
 r229099 does *not* exhibit the symptom
 r229100 *does* exhibit the symptom
 
 # sysctl hw.usb.no_shutdown_wait=1
 
 is confirmed as a workaround