From owner-freebsd-fs@FreeBSD.ORG Mon Nov 7 01:46:28 2005 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3FC6A16A41F for ; Mon, 7 Nov 2005 01:46:28 +0000 (GMT) (envelope-from anderson@centtech.com) Received: from mh2.centtech.com (moat3.centtech.com [207.200.51.50]) by mx1.FreeBSD.org (Postfix) with ESMTP id BC8FF43D45 for ; Mon, 7 Nov 2005 01:46:27 +0000 (GMT) (envelope-from anderson@centtech.com) Received: from [192.168.42.23] (andersonbox3.centtech.com [192.168.42.23]) by mh2.centtech.com (8.13.1/8.13.1) with ESMTP id jA71kN5U011429; Sun, 6 Nov 2005 19:46:24 -0600 (CST) (envelope-from anderson@centtech.com) Message-ID: <436EB1EC.7010801@centtech.com> Date: Sun, 06 Nov 2005 19:46:20 -0600 From: Eric Anderson User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.7.12) Gecko/20051021 X-Accept-Language: en-us, en MIME-Version: 1.0 To: user References: <436BDB99.5060907@samsco.org> In-Reply-To: <436BDB99.5060907@samsco.org> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.82/1165/Sat Nov 5 23:12:58 2005 on mh2.centtech.com X-Virus-Status: Clean Cc: freebsd-fs@freebsd.org Subject: Re: UFS2 snapshots on large filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 07 Nov 2005 01:46:28 -0000 Scott Long wrote: > user wrote: > >> Hello, >> >> Considering a PC server running FreeBSD with 4 400 GB hard drives >> attached >> to a hardware raid controller doing raid-5. >> >> So this will present itself to the OS as a 1.2TB filesystem. >> >> Any comments on taking one or multiple snapshots of a filesystem of this >> size ? 
>> >> Given current disk capacities, I would not exactly consider this 1.2TB >> filesystem a "large" one ... any comments on say ... a 6-8 TB filesystem >> and making one or more snapshots of it ? >> >> Assume they are marginally busy - perhaps a 5-10% data turnover per >> day... >> >> Thanks. >> > > The UFS snapshot code was written at a time when disks were typically > around 4-9GB in size, not 400GB in size =-) Unfortunately, the amount > of time it takes to do the initial snapshot bookkeeping scales linearly > with the size of the drive, and many people have reported that it takes > considerable amount of time (anywhere from several minutes to several > dozen minutes) on large drives/arrays like you describe. So, you should > test and plan accordingly if you are interested in using them. I have several 2TB filesystems, which I do snapshots on. I can report that it indeed takes a long time to run, but works nevertheless. One thing to keep in mind though - fsck'ing a 2TB filesystem can take 2GB of memory (depends on how many files you have, and a few other factors). I have 4GB of memory in this box, and what I've seen so far is between 1.5GB and 2.5GB of memory required for fsck to run smoothly on a single partition. You definitely also want to turn off background fsck, unless you have extreme amounts of time on your hands. An fsck of a 70% full 2TB filesystem with a ton of files on it takes many hours, so I often mount the filesystems unclean (FreeBSD lets me do this) with rw, and continue my work until I can unmount, fsck, and remount the fs. This brings me to UFS Journaling - Scott, how's it coming along? I know you've been busy with the 6.0-RELEASE (great work by the way!!), but I'm itching for this. Is there anything I can do to help? Eric -- ------------------------------------------------------------------------ Eric Anderson Sr. Systems Administrator Centaur Technology Anything that works is better than anything that doesn't. 
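Eric's figures, together with the 5-10% daily turnover mentioned in the original question, allow some rough capacity planning. A back-of-the-envelope sketch (the ~1 GB-of-RAM-per-TB fsck ratio is only an extrapolation from the numbers reported in this thread, not a measured constant, and real usage depends heavily on file count):

```python
def fsck_mem_gb(fs_tb, gb_per_tb=1.0):
    # Extrapolated from Eric's report: ~1.5-2.5 GB of RAM to fsck 2 TB,
    # i.e. very roughly 1 GB per TB (file count matters a lot).
    return fs_tb * gb_per_tb

def snapshot_growth_gb_per_day(fs_tb, daily_turnover):
    # A UFS2 snapshot is copy-on-write: it grows by roughly the amount
    # of data overwritten while the snapshot exists.
    return fs_tb * 1024 * daily_turnover

print(fsck_mem_gb(6))                       # 6 TB fs -> ~6 GB of RAM
print(snapshot_growth_gb_per_day(8, 0.10))  # 8 TB at 10%/day -> ~819 GB/day
```

At those scales, a snapshot held open for backup on an 8 TB filesystem could eat hundreds of GB per day of free space, which is worth budgeting for up front.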
------------------------------------------------------------------------ From owner-freebsd-fs@FreeBSD.ORG Mon Nov 7 05:09:18 2005 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2023016A41F for ; Mon, 7 Nov 2005 05:09:18 +0000 (GMT) (envelope-from user@dhp.com) Received: from shell.dhp.com (shell.dhp.com [199.245.105.1]) by mx1.FreeBSD.org (Postfix) with ESMTP id 5117943D45 for ; Mon, 7 Nov 2005 05:09:17 +0000 (GMT) (envelope-from user@dhp.com) Received: by shell.dhp.com (Postfix, from userid 896) id 789EB3134D; Mon, 7 Nov 2005 00:09:15 -0500 (EST) Date: Mon, 7 Nov 2005 00:09:15 -0500 (EST) From: user To: Eric Anderson In-Reply-To: <436EB1EC.7010801@centtech.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Cc: freebsd-fs@freebsd.org Subject: Re: UFS2 snapshots on large filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 07 Nov 2005 05:09:18 -0000 Eric, Thanks a bunch for your comments. See below: On Sun, 6 Nov 2005, Eric Anderson wrote: > I have several 2TB filesystems, which I do snapshots on. I can report > that it indeed takes a long time to run, but works nevertheless. One > thing to keep in mind though - fsck'ing a 2TB filesystem can take 2GB of > memory (depends on how many files you have, and a few other factors). I > have 4GB of memory in this box, and what I've seen so far is between > 1.5GB and 2.5GB of memory required for fsck to run smoothly on a single > partition. You definitely also want to turn off background fsck, unless > you have extreme amounts of time on your hands. 
An fsck of a 70% full > 2TB filesystem with a ton of files on it takes many hours, so I often > mount the filesystems unclean (FreeBSD lets me do this) with rw, and > continue my work until I can unmount, fsck, and remount the fs. Can you elaborate ? Namely, how long on the 2TB filesystems ? As far as the fsck is concerned, this only happens on an ungraceful reboot, right ? Assuming a snapshot on a 2TB FS, and assuming no crashes, no long-wait processes like fsck will ever occur, right ? Any other comments ? Do you experience instability/crashes often on systems of this nature ? Again, thanks. From owner-freebsd-fs@FreeBSD.ORG Mon Nov 7 05:24:56 2005 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 961C216A41F for ; Mon, 7 Nov 2005 05:24:56 +0000 (GMT) (envelope-from user@dhp.com) Received: from shell.dhp.com (shell.dhp.com [199.245.105.1]) by mx1.FreeBSD.org (Postfix) with ESMTP id BBDC543D45 for ; Mon, 7 Nov 2005 05:24:55 +0000 (GMT) (envelope-from user@dhp.com) Received: by shell.dhp.com (Postfix, from userid 896) id E83823134B; Mon, 7 Nov 2005 00:24:54 -0500 (EST) Date: Mon, 7 Nov 2005 00:24:54 -0500 (EST) From: user To: Scott Long In-Reply-To: <436BDB99.5060907@samsco.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Cc: freebsd-fs@freebsd.org Subject: Re: UFS2 snapshots on large filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 07 Nov 2005 05:24:56 -0000 thank you scott - see below: On Fri, 4 Nov 2005, Scott Long wrote: > The UFS snapshot code was written at a time when disks were typically > around 4-9GB in size, not 400GB in size =-) Unfortunately, the amount > of time it takes to do the initial snapshot bookkeeping scales
linearly > with the size of the drive, and many people have reported that it takes > considerable amount of time (anywhere from several minutes to several > dozen minutes) on large drives/arrays like you describe. So, you should > test and plan accordingly if you are interested in using them. Testing is what I need to do. I have a few follow-up questions: First, are there any sysctl or kernel tunables that change any of what you are discussing above ? Second, let's say I am willing to accept the long snapshot creation period ... are there other drawbacks as well during the course of _running with_ the snapshot once it is created ? Or are all costs paid initially ? Finally, I have read the BSDCon '03 paper that McKusick wrote where he addressed the dual problems of not enough kernel memory (10 megabytes) to cache disk pages, and the system deadlocking that occurs with two snapshots. Is it true that both of the fixes he elucidated in that paper are built into what I see as fbsd 5.4 now ? Thanks. 
From owner-freebsd-fs@FreeBSD.ORG Mon Nov 7 13:01:07 2005 Return-Path: X-Original-To: freebsd-fs@FreeBSD.ORG Delivered-To: freebsd-fs@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id E399216A41F for ; Mon, 7 Nov 2005 13:01:07 +0000 (GMT) (envelope-from olli@lurza.secnetix.de) Received: from lurza.secnetix.de (lurza.secnetix.de [83.120.8.8]) by mx1.FreeBSD.org (Postfix) with ESMTP id 43CC743D45 for ; Mon, 7 Nov 2005 13:01:07 +0000 (GMT) (envelope-from olli@lurza.secnetix.de) Received: from lurza.secnetix.de (qvcpah@localhost [127.0.0.1]) by lurza.secnetix.de (8.13.1/8.13.1) with ESMTP id jA7D14ar038819 for ; Mon, 7 Nov 2005 14:01:05 +0100 (CET) (envelope-from oliver.fromme@secnetix.de) Received: (from olli@localhost) by lurza.secnetix.de (8.13.1/8.13.1/Submit) id jA7D14PT038818; Mon, 7 Nov 2005 14:01:04 +0100 (CET) (envelope-from olli) Date: Mon, 7 Nov 2005 14:01:04 +0100 (CET) Message-Id: <200511071301.jA7D14PT038818@lurza.secnetix.de> From: Oliver Fromme To: freebsd-fs@FreeBSD.ORG In-Reply-To: X-Newsgroups: list.freebsd-fs User-Agent: tin/1.5.4-20000523 ("1959") (UNIX) (FreeBSD/4.11-RELEASE (i386)) Cc: Subject: Re: UFS2 snapshots on large filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: freebsd-fs@FreeBSD.ORG List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 07 Nov 2005 13:01:08 -0000 user wrote: > On Sun, 6 Nov 2005, Eric Anderson wrote: > > [fsck on large file systems taking a long time] > > Can you elaborate ? Namely, how long on the 2GB filesystems ? It depends very much on the file system parameters. In particular, it's well worth lowering the inode density (i.e. increasing the -i argument to newfs) if you can afford it, i.e. if you expect the file system to hold fewer but larger files (such as multimedia files). 
On a 250 Gbyte drive of mine, I used newfs with -i 131072. That still leaves enough inodes for about 2 million files, but reduces fsck time significantly. A nice side effect is that it gives you more free space for files, because every inode occupies 256 bytes in UFS2. In this case I got about 7 Gbyte of additional space. > As far as the fsck is concerned, this only happens on an ungraceful > reboot, right ? Right. On a kernel panic, hardware freeze, power failure or similar things. > Assuming a snapshot on a 2GB FS, and assuming no crashes, > no long-wait processes like fsck will ever occur, right ? Right. But creating the snapshot in the first place takes a bit of time, depending on the size of the file system. > Any other comments ? Do you experience instability/crashes often on > systems of this nature ? No. I recommend you use 6.0-RELEASE or RELENG_6 (6-stable). Best regards Oliver -- Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing Dienstleistungen mit Schwerpunkt FreeBSD: http://www.secnetix.de/bsd Any opinions expressed in this message may be personal to the author and may not necessarily reflect the opinions of secnetix in any way. "C is quirky, flawed, and an enormous success." -- Dennis M. Ritchie. 
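Oliver's figures check out arithmetically. A quick sketch (the 8192 bytes-per-inode baseline is an assumption about a typical UFS2 newfs default; the actual default depends on the block and fragment sizes chosen):

```python
INODE_SIZE = 256  # bytes per on-disk inode in UFS2

def inode_overhead_gib(fs_bytes, bytes_per_inode):
    # newfs allocates one inode per `bytes_per_inode` (-i) of data space
    inodes = fs_bytes // bytes_per_inode
    return inodes * INODE_SIZE / 2**30

fs_bytes = 250 * 10**9                 # the 250 Gbyte drive above
inodes_left = fs_bytes // 131072       # ~1.9 million inodes with -i 131072
saved = inode_overhead_gib(fs_bytes, 8192) - inode_overhead_gib(fs_bytes, 131072)
print(inodes_left)                     # roughly "about 2 million files"
print(round(saved, 1))                 # ~6.8 GiB reclaimed, i.e. "about 7 Gbyte"
```

So both the "about 2 million files" and the "about 7 Gbyte additional space" figures fall straight out of the inode-table bookkeeping.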
From owner-freebsd-fs@FreeBSD.ORG Wed Nov 9 00:49:24 2005 Return-Path: X-Original-To: fs@FreeBSD.org Delivered-To: freebsd-fs@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6122D16A41F; Wed, 9 Nov 2005 00:49:24 +0000 (GMT) (envelope-from dodell@knight.ixsystems.net) Received: from knight.ixsystems.net (knight.ixsystems.net [206.40.55.67]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0421A43D46; Wed, 9 Nov 2005 00:49:23 +0000 (GMT) (envelope-from dodell@knight.ixsystems.net) Received: from knight.ixsystems.net (localhost [127.0.0.1]) by knight.ixsystems.net (8.12.10/8.11.6) with ESMTP id jA90lkwF042580; Tue, 8 Nov 2005 16:47:46 -0800 (PST) (envelope-from dodell@knight.ixsystems.net) Received: (from dodell@localhost) by knight.ixsystems.net (8.12.10/8.12.9/Submit) id jA90lkZt042579; Tue, 8 Nov 2005 16:47:46 -0800 (PST) (envelope-from dodell) Date: Tue, 8 Nov 2005 16:47:46 -0800 From: "Devon H. O'Dell" To: fs@FreeBSD.org Message-ID: <20051108164746.A42559@knight.ixsystems.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i Cc: freebsd-emulation@FreeBSD.org Subject: Questions on VFS and linux emulation X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Nov 2005 00:49:24 -0000 Hey all, Sorry for the cross-post; I've got questions that fall both into the categories of our Linux emulation stuff and VFS stuff. I've been working on amd64/88249 and it has brought up a few questions on VFS and linux emulation for me. There is a good bit of background on the issue that I posted in the PR, but to briefly summarize: The linux emulation point for the getdent / readdir syscalls passes cookies up the VFS stack. The devfs filesystem doesn't initialize the cookie pointer nor the ncookies data. 
vfs_readdir requires the caller to have set up both the cookie pointer and ncookies data if the pointer to ncookies is non-NULL. ``boom'' However, looking deeper into the cause of this issue, I've come up with a couple of questions on how this stuff works: 1) What is the application of cookies in this context? From what I can tell, our native syscalls make absolutely no use of them. Is there a reason then that they're used in the linux wrappers and not in ours? 2) How are cookies supposed to work? The only place I saw them actually being used with a purpose, NFS, seemed to imply that it relied on the behavior that the value of the cookie only ever increases. 3) Is UFS the only filesystem that does anything with cookies if the caller asks for them? With regard to both emulation and VFS: 4) I looked into the reason that the majority of the Linux file operation procedures didn't wrap our own syscalls like many other emulation procedures do. It seems that our own syscalls are also wrappers for other procedures. There is no wrapper for getdents, for example: the copyout() happens inside the getdirentries syscall itself, so it cannot be wrapped for Linux emulation. I always get confused about which data is copied in and which is copied out, so maybe this isn't a necessity, but would it be beneficial to modify e.g. getdirentries to wrap vfs_getdirentries so that other emulation layers could wrap that as well? 5) Is the Linux emulation code in linux_file.c:getdents_common() doing something really silly? For a large part, it looks like our own vfs_syscalls.c:getdirentries() code, except I think it is doing something incorrect with regard to cookie handling. Since I'm still sort of confused as to how the cookies are supposed to work and I am not sure how to set up a test in which I can call readdir() on a filesystem that actually makes use of the cookies to see if it's doing something kooky, I really need some outside input on this. 
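For what it's worth, the contract in question can be modeled outside the kernel: a readdir cookie is an opaque per-entry resume token, and the NFS usage mentioned above relies on the tokens increasing monotonically so that a client can replay the last one it saw to continue a scan. A toy userland model (plain Python, not actual VFS code; `toy_readdir` and its cookie encoding are illustrative inventions):

```python
def toy_readdir(entries, cookie=0, count=2):
    # Pair every directory entry with a monotonically increasing cookie;
    # a caller resumes a partial scan by passing back the last cookie it
    # saw, the way an NFS client replays a readdir offset.
    remaining = [(name, i + 1) for i, name in enumerate(entries) if i + 1 > cookie]
    return remaining[:count]

names = ["a", "b", "c", "d"]
first = toy_readdir(names)                      # [('a', 1), ('b', 2)]
rest = toy_readdir(names, cookie=first[-1][1])  # [('c', 3), ('d', 4)]
```

The monotonicity is what makes the scheme stateless on the server side: any filesystem that hands out cookies which can repeat or decrease would break a resuming client.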
I ask this specifically because I do have a patch that optimizes and further abstracts our vfs_read_dirent procedure so it can be used in other filesystems, but it doesn't seem to work in Linux emulation. Architecturally, I have a couple questions as well. Discussing the issue with scottl@ and phk@, it seems it would be preferable for us to use vfs_read_dirent as the backend for all filesystem_readdir operations. As previously mentioned, I have a patch that incorporates this into devfs, and this works fine (outside of Linux emulation, that is). Will this sort of work tread on the toes of anybody working in this area? (I'm not a committer, so I'm sure I will have to work with somebody on this anyway, but if there are already things going on in this area, I'd like to keep deltas to a minimum). This code only touches filesystem_readdir() functions (filesystem_vnops.c files) and vfs_subr.c. Hope somebody's got an idea as to what I'm talking about :). Some of those sentences got pretty long. It's an involved discussion, so I hope that someone with more familiarity with this can give me some tips and perhaps review what I do. Please keep me cc'ed on this discussion, as I'm not actually subscribed to these two lists. Kind regards, Devon H. 
O'Dell vfs_subr.c From owner-freebsd-fs@FreeBSD.ORG Wed Nov 9 07:53:10 2005 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C55AF16A41F for ; Wed, 9 Nov 2005 07:53:10 +0000 (GMT) (envelope-from ezk@fsl.cs.sunysb.edu) Received: from filer.fsl.cs.sunysb.edu (filer.fsl.cs.sunysb.edu [130.245.126.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id 475A043D46 for ; Wed, 9 Nov 2005 07:53:10 +0000 (GMT) (envelope-from ezk@fsl.cs.sunysb.edu) Received: from agora.fsl.cs.sunysb.edu (agora.fsl.cs.sunysb.edu [130.245.126.12]) by filer.fsl.cs.sunysb.edu (8.12.8/8.12.8) with ESMTP id jA97r0u2028614; Wed, 9 Nov 2005 02:53:00 -0500 Received: from agora.fsl.cs.sunysb.edu (localhost.localdomain [127.0.0.1]) by agora.fsl.cs.sunysb.edu (8.13.1/8.13.1) with ESMTP id jA97qxP1032461; Wed, 9 Nov 2005 02:52:59 -0500 Received: (from ezk@localhost) by agora.fsl.cs.sunysb.edu (8.13.1/8.12.8/Submit) id jA97qw9C032458; Wed, 9 Nov 2005 02:52:58 -0500 Date: Wed, 9 Nov 2005 02:52:58 -0500 Message-Id: <200511090752.jA97qw9C032458@agora.fsl.cs.sunysb.edu> From: Erez Zadok To: freebsd-fs@freebsd.org X-MailKey: Erez_Zadok Cc: am-utils@fsl.cs.sunysb.edu Subject: kernel bug tickled by Amd X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 09 Nov 2005 07:53:11 -0000 I've received this bug report on my am-utils Bugzilla. Being a freebsd kernel panic, I believe this is a kernel bug that should be fixed in the kernel. Of course, if anyone knows of a way I can workaround this in Amd, let me know. Thanks, Erez. 
------- Forwarded Message Date: Tue, 08 Nov 2005 23:57:03 -0500 From: bugzilla@fsl.cs.sunysb.edu To: am-utils-developers@fsl.cs.sunysb.edu Cc: ezk@cs.sunysb.edu Subject: [Bug 450] New: [panic] "unmount: dangling vnode" on amd activity http://bugzilla.fsl.cs.sunysb.edu/show_bug.cgi?id=450 Summary: [panic] "unmount: dangling vnode" on amd activity Product: am-utils Version: 6.1 Platform: i386 URL: http://www.freebsd.org/cgi/query-pr.cgi?pr=79665 OS/Version: FreeBSD 5.x Status: NEW Severity: major Priority: P2 Component: Any AssignedTo: am-utils-developers@am-utils.org ReportedBy: nate@netapp.com Description A very reproducible amd panic through non-privileged user-level filesystem access. How-To-Repeat As a non-privileged user, rapidly unmount an amd-managed mount with "amq -u" while rapidly remounting that same mount (a simple file access is sufficient). I've reproduced this numerous times by running these two bourne shell scripts simultaneously on the same machine: runrun: #!/bin/sh while echo hi do wc -l /usr/local/build/share/oomph2 done diedie: #!/bin/sh while amq -u /x/eng/btools do amq /x/eng/btools done The machine will crash within minutes, especially if the machine is under other stress. - --- backtrace begins here --- nate@pixie.lab.netapp.com:~ >sudo kgdb /usr/src/sys/i386/compile/SMP/kernel.debug /var/crash/vmcore.5 [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"] GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-marcel-freebsd". #0 doadump () at pcpu.h:159 159 pcpu.h: No such file or directory. 
in pcpu.h (kgdb) backtrace #0 doadump () at pcpu.h:159 #1 0xc0615167 in boot (howto=260) at ../../../kern/kern_shutdown.c:410 #2 0xc061548d in panic (fmt=0xc0830da5 "unmount: dangling vnode") at ../../../kern/kern_shutdown.c:566 #3 0xc0664535 in vfs_mount_destroy (mp=0xc2cf9000, td=0xc26a6780) at ../../../kern/vfs_mount.c:522 #4 0xc0665924 in dounmount (mp=0xc2cf9000, flags=0, td=0xc26a6780) at ../../../kern/vfs_mount.c:1111 #5 0xc0665560 in unmount (td=0xc26a6780, uap=0xe730cd14) at ../../../kern/vfs_mount.c:1019 #6 0xc07c842f in syscall (frame= {tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = -1077941440, tf_esi = 136512416, tf_ebp = -1077941608, tf_isp = -416232076, tf_ebx = 136516288, tf_edx = 134637440, tf_ecx = 19, tf_eax = 22, tf_trapno = 12, tf_err = 2, tf_eip = 672003099, tf_cs = 31, tf_eflags = 662, tf_esp = -1077941636, tf_ss = 47}) at ../../../i386/i386/trap.c:1001 #7 0xc07b5a8f in Xint0x80_syscall () at ../../../i386/i386/exception.s:201 #8 0x0000002f in ?? () #9 0x0000002f in ?? () #10 0x0000002f in ?? () #11 0xbfbfeb40 in ?? () #12 0x082303a0 in ?? () #13 0xbfbfea98 in ?? () - ---Type to continue, or q to quit--- #14 0xe730cd74 in ?? () #15 0x082312c0 in ?? () #16 0x08066780 in ?? () #17 0x00000013 in ?? () #18 0x00000016 in ?? () #19 0x0000000c in ?? () #20 0x00000002 in ?? () #21 0x280df41b in ?? () #22 0x0000001f in ?? () #23 0x00000296 in ?? () #24 0xbfbfea7c in ?? () #25 0x0000002f in ?? () #26 0x00000000 in ?? () #27 0x00000000 in ?? () #28 0x00000000 in ?? () #29 0x00000000 in ?? () #30 0x30c18000 in ?? () #31 0xc29e2000 in ?? () #32 0xc26a6780 in ?? () #33 0xe730ca6c in ?? () #34 0xe730ca54 in ?? () #35 0xc22a9a80 in ?? () #36 0xc062573b in sched_switch (td=0x82303a0, newtd=0x82312c0, flags=Cannot access memory at address 0xbfbfeaa8 ) - ---Type to continue, or q to quit--- at ../../../kern/sched_4bsd.c:881 Previous frame inner to this frame (corrupt stack?) 
(kgdb) - --- backtrace ends here --- This is also reproduceable under FreeBSD 5.3-RELEASE-p23 with updated am-utils from the ports collection: # uname -a FreeBSD xxxxx.eng.netapp.com 5.3-RELEASE-p23 FreeBSD 5.3-RELEASE-p23 #7: Wed Nov 2 18:04:49 PST 2005 root@xxxxx.eng.netapp.com:/usr/obj/usr/src/sys/SMP i386 # more info.3 Good dump found on device /dev/da0s1b Architecture: i386 Architecture version: 1 Dump length: 1610350592B (1535 MB) Blocksize: 512 Dumptime: Tue Nov 8 19:37:29 2005 Hostname: xxxxx.eng.netapp.com Versionstring: FreeBSD 5.3-RELEASE-p23 #7: Wed Nov 2 18:04:49 PST 2005 root@xxxxx.eng.netapp.com:/usr/obj/usr/src/sys/SMP Panicstring: unmount: dangling vnode Bounds: 3 similar backtrace: #0 doadump () at pcpu.h:159 #1 0xc060d54f in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:397 #2 0xc060d875 in panic (fmt=0xc0821eb7 "unmount: dangling vnode") at /usr/src/sys/kern/kern_shutdown.c:553 #3 0xc065bd11 in vfs_mount_destroy (mp=0xc2f52c00, td=0xc3420c80) at /usr/src/sys/kern/vfs_mount.c:523 #4 0xc065d2d8 in dounmount (mp=0xc2f52c00, flags=0, td=0xc3420c80) at /usr/src/sys/kern/vfs_mount.c:1160 #5 0xc065cf14 in unmount (td=0xc3420c80, uap=0xe816ad14) at /usr/src/sys/kern/vfs_mount.c:1068 #6 0xc07bb33f in syscall (frame= {tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 0, tf_esi = 136871872, tf_eb p = -1077941512, tf_isp = -401166988, tf_ebx = - -2012650824, tf_edx = 135024768, tf_ecx = 134711040, tf_eax = 22, tf_trapno = 22, tf_err = 2, tf_eip = -201226967 3, tf_cs = 31, tf_eflags = 662, tf_esp = -1077941556, tf_ss = 47}) at /usr/src/sys/i386/i386/trap.c:1001 #7 0xc07a8acf in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:201 #8 0x0000002f in ?? () #9 0x0000002f in ?? () #10 0x0000002f in ?? () #11 0x00000000 in ?? () #12 0x08287fc0 in ?? () #13 0xbfbfeaf8 in ?? () - ---Type to continue, or q to quit--- #14 0xe816ad74 in ?? () #15 0x880962b8 in ?? () #16 0x080c5080 in ?? () #17 0x08078700 in ?? () #18 0x00000016 in ?? () #19 0x00000016 in ?? 
() #20 0x00000002 in ?? () #21 0x880f3397 in ?? () #22 0x0000001f in ?? () #23 0x00000296 in ?? () #24 0xbfbfeacc in ?? () #25 0x0000002f in ?? () #26 0x8806a3d8 in ?? () #27 0x00000000 in ?? () #28 0x88053868 in ?? () #29 0x8805169b in ?? () #30 0x307b6000 in ?? () #31 0xc35ab388 in ?? () #32 0xc3420c80 in ?? () #33 0xe816aa6c in ?? () #34 0xe816aa54 in ?? () #35 0xc2bc2190 in ?? () #36 0xc061db57 in sched_switch (td=0x8287fc0, newtd=0x880962b8, flags=Cannot access memory at address 0xbfbfeb08 ) - ---Type to continue, or q to quit--- at /usr/src/sys/kern/sched_4bsd.c:865 Previous frame inner to this frame (corrupt stack?) # /usr/sbin/amd -v Copyright (c) 1997-2005 Erez Zadok Copyright (c) 1990 Jan-Simon Pendry Copyright (c) 1990 Imperial College of Science, Technology & Medicine Copyright (c) 1990 The Regents of the University of California. am-utils version 6.1.2.1 (build 1). Report bugs to https://bugzilla.am-utils.org/ or am-utils@am-utils.org. Configured by root@xxxxx.eng.netapp.com on date Tue Nov 8 17:35:46 PST 2005. Built by root@xxxxx.eng.netapp.com on date Tue Nov 8 17:39:11 PST 2005. cpu=i386 (little-endian), arch=i386, karch=i386. full_os=freebsd5.3, os=freebsd5, osver=5.3, vendor=portbld, distro=none. domain=eng.netapp.com, host=xxxxx, hostd=xxxxx.eng.netapp.com. Map support for: root, passwd, hesiod, union, nis, file, exec, error. AMFS: nfs, link, nfsx, nfsl, host, linkx, program, union, ufs, cdfs, pcfs, auto, direct, toplvl, error, inherit. FS: cd9660, nfs, nfs3, msdosfs, ufs, unionfs. Network 1: wire="10.56.112.0" (netnumber=10.56.112). Network 2: wire="10.56.8.0" (netnumber=10.56.8). 
------- End of Forwarded Message From owner-freebsd-fs@FreeBSD.ORG Thu Nov 10 03:53:09 2005 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 77E8B16A42A for ; Thu, 10 Nov 2005 03:53:09 +0000 (GMT) (envelope-from sudakov@sibptus.tomsk.ru) Received: from relay2.tomsk.ru (relay2.tomsk.ru [212.73.124.8]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4224A43D46 for ; Thu, 10 Nov 2005 03:53:07 +0000 (GMT) (envelope-from sudakov@sibptus.tomsk.ru) X-Virus-Scanned: by Dr.Web (R) daemon for FreeBSD, version 4.32.1 (2004-08-30) at relay2.tomsk.ru Received: from [172.16.138.125] (account sudakovva@sibptus.tomsk.ru HELO admin.sibptus.tomsk.ru) by relay2.tomsk.ru (CommuniGate Pro SMTP 4.3.8) with ESMTPSA id 1575269 for freebsd-fs@freebsd.org; Thu, 10 Nov 2005 09:53:06 +0600 Received: (from sudakov@localhost) by admin.sibptus.tomsk.ru (8.12.9p2/8.12.9/Submit) id jAA3r5lq053669 for freebsd-fs@freebsd.org; Thu, 10 Nov 2005 09:53:05 +0600 (OMST) (envelope-from sudakov@sibptus.tomsk.ru) X-Authentication-Warning: admin.sibptus.tomsk.ru: sudakov set sender to sudakov@sibptus.tomsk.ru using -f Date: Thu, 10 Nov 2005 09:53:05 +0600 From: Victor Sudakov To: freebsd-fs@freebsd.org Message-ID: <20051110035305.GA53569@admin.sibptus.tomsk.ru> References: <434F9DAE.6070607@ant.uni-bremen.de> <20051014134820.GA43849@admin.sibptus.tomsk.ru> <20051014203021.L66014@fledge.watson.org> <435351F7.10101@ant.uni-bremen.de> <20051017141609.GA83692@admin.sibptus.tomsk.ru> <4354D850.8060908@ant.uni-bremen.de> <20051018112135.GA94670@admin.sibptus.tomsk.ru> <4354E644.7090608@ant.uni-bremen.de> <20051018154627.GB95892@admin.sibptus.tomsk.ru> <4355FD57.3060102@ant.uni-bremen.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: 
<4355FD57.3060102@ant.uni-bremen.de> User-Agent: Mutt/1.4.2.1i Organization: AO "Svyaztransneft", SibPTUS X-PGP-Key: http://vas.tomsk.ru/vas.asc Subject: Re: Problem with default ACLs and mask X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Nov 2005 03:53:09 -0000 Heinrich Rebehn wrote: > >>>>Very sad :-( It really seems to be impossible to implement something like > >>>>a "Group Manager" enabling me to delegate privileges for a group of > >>>>users to some non-root person. > >>> > >>> > >>>What OS allows you to do it? > >>> > >> > >>I have done such things with OpenVMS. Dunno how much control > >>Windows/NTFS allows. > > > > > > Doesn't OpenVMS also have the concept of default ACLs on directories? > > How is the matter handled there? > > > Yes, it has. But it does not have the concept of a "mask", which limits > the resulting access rights. > > In OpenVMS, group members can also "lock out" the group manager by > removing the ACLs. But they must do so on purpose, and the group manager > can talk to them if that happens. > > With Posix1e however, users can inadvertently create directories with > the group write bit removed (by extracting a tar ball), which the group > manager is then unable to delete. Moreover, I recently came across another issue. Consider the following scenario. You set a default ACL on the directory "test". Your user creates a file somewhere else and then moves it into "test". Provided "test" and the other directory are on the same filesystem, the file will not inherit the default ACLs from "test". It will be inside "test", but with a different set of ACLs. M$ Windows works exactly the same way if both the directories are on the same volume. How does OpenVMS handle such a scenario? 
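The "mask" being discussed is the POSIX.1e upper bound applied to every named-user, named-group, and owning-group ACL entry: the effective right is the bitwise AND of the entry's permissions and the mask, which is exactly how a tar extract that clears the group write bit can lock a group manager out. In rwx-bit terms (a schematic model of the semantics, not a real ACL API):

```python
R, W, X = 4, 2, 1  # rwx permission bits

def effective(entry_perms, mask):
    # POSIX.1e: named entries and the owning group are filtered
    # through the mask ACL entry (entry AND mask)
    return entry_perms & mask

manager_entry = R | W | X    # group-manager ACL entry grants rwx
mask_after_tar = R | X       # extracting a tar ball cleared group write
print(effective(manager_entry, mask_after_tar))  # 5, i.e. r-x: write is gone
```

Since the mask tracks the group bits of the file mode, any tool that sets a restrictive mode silently caps every named entry, with no intent required from the user.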
-- Victor Sudakov, VAS4-RIPE, VAS47-RIPN sip:sudakov@sibptus.tomsk.ru From owner-freebsd-fs@FreeBSD.ORG Fri Nov 11 16:53:49 2005 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id F20AE16A41F for ; Fri, 11 Nov 2005 16:53:49 +0000 (GMT) (envelope-from user@dhp.com) Received: from shell.dhp.com (shell.dhp.com [199.245.105.1]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6A59843D45 for ; Fri, 11 Nov 2005 16:53:49 +0000 (GMT) (envelope-from user@dhp.com) Received: by shell.dhp.com (Postfix, from userid 896) id A1D9431372; Fri, 11 Nov 2005 11:53:46 -0500 (EST) Date: Fri, 11 Nov 2005 11:53:46 -0500 (EST) From: user To: freebsd-fs@freebsd.org Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Cc: Subject: three follow-up questions RE: UFS2 snapshots on large filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Nov 2005 16:53:50 -0000 thank you scott - see below: On Fri, 4 Nov 2005, Scott Long wrote: > The UFS snapshot code was written at a time when disks were typically > around 4-9GB in size, not 400GB in size =-) Unfortunately, the amount > of time it takes to do the initial snapshot bookkeeping scales linearly > with the size of the drive, and many people have reported that it takes > considerable amount of time (anywhere from several minutes to several > dozen minutes) on large drives/arrays like you describe. So, you should > test and plan accordingly if you are interested in using them. Testing is what I need to do. I have a few follow up questions: First, are there any sysctl or kernel tunables that change any of what you are discussing above ? Second, let's say I am willing to accept the long snapshot creation period ... 
are there other drawbacks as well during the course of _running with_
the snapshot once it is created? Or are all costs paid initially?

Finally, I have read the BSDCon paper that McKusick wrote, where he
addressed the dual problems of not enough kernel memory (10 megabytes)
to cache disk pages, and the system deadlocking that occurs with two
snapshots. Is it true that both of the fixes he elucidated in that paper
are built into what I see as FreeBSD 5.4 now?

Thanks.

From owner-freebsd-fs@FreeBSD.ORG Fri Nov 11 17:29:07 2005
Message-ID: <4374D4DB.7050109@centtech.com>
Date: Fri, 11 Nov 2005 11:28:59 -0600
From: Eric Anderson
To: user
Cc: freebsd-fs@freebsd.org
Subject: Re: three follow-up questions RE: UFS2 snapshots on large filesystems
user wrote:
> thank you scott - see below:
>
> On Fri, 4 Nov 2005, Scott Long wrote:
>
>> The UFS snapshot code was written at a time when disks were typically
>> around 4-9GB in size, not 400GB in size =-)  Unfortunately, the amount
>> of time it takes to do the initial snapshot bookkeeping scales linearly
>> with the size of the drive, and many people have reported that it takes
>> a considerable amount of time (anywhere from several minutes to several
>> dozen minutes) on large drives/arrays like you describe.  So, you should
>> test and plan accordingly if you are interested in using them.
>
> Testing is what I need to do. I have a few follow-up questions:
>
> First, are there any sysctl or kernel tunables that change any of what
> you are discussing above?
>
> Second, let's say I am willing to accept the long snapshot creation
> period ... are there other drawbacks as well during the course of
> _running with_ the snapshot once it is created? Or are all costs paid
> initially?

There will be a slight performance penalty with snapshots on the
filesystem, but it shouldn't be noticeable.

> Finally, I have read the BSDCon paper that McKusick wrote, where he
> addressed the dual problems of not enough kernel memory (10 megabytes)
> to cache disk pages, and the system deadlocking that occurs with two
> snapshots. Is it true that both of the fixes he elucidated in that
> paper are built into what I see as FreeBSD 5.4 now?

I've seen deadlocking on 5.4; however, some recent work on the
filesystem and snapshots went into 6.0 and -CURRENT, and since then I've
seen no issues. I highly recommend 6.0 for this.

Eric

--
------------------------------------------------------------------------
Eric Anderson        Sr. Systems Administrator        Centaur Technology
Anything that works is better than anything that doesn't.
------------------------------------------------------------------------

From owner-freebsd-fs@FreeBSD.ORG Fri Nov 11 18:10:46 2005
Message-ID: <4374DEA9.9020706@samsco.org>
Date: Fri, 11 Nov 2005 11:10:49 -0700
From: Scott Long
To: user
Cc: freebsd-fs@freebsd.org
Subject: Re: three follow-up questions RE: UFS2 snapshots on large filesystems

user wrote:
> thank you scott - see below:
>
> On Fri, 4 Nov 2005, Scott Long wrote:
>
>> The UFS snapshot code was written at a time when disks were typically
>> around 4-9GB in size, not 400GB in size =-)  Unfortunately, the amount
>> of time it takes to do the initial snapshot bookkeeping scales linearly
>> with the size of the drive, and many people have reported that it takes
>> a considerable amount of time (anywhere from several minutes to several
>> dozen minutes) on large drives/arrays like you describe.  So, you should
>> test and plan accordingly if you are interested in using them.
>
> Testing is what I need to do. I have a few follow-up questions:
>
> First, are there any sysctl or kernel tunables that change any of what
> you are discussing above?

There don't appear to be any tunables in the snapshot code other than
for debugging.

> Second, let's say I am willing to accept the long snapshot creation
> period ... are there other drawbacks as well during the course of
> _running with_ the snapshot once it is created? Or are all costs paid
> initially?

There is a slight performance penalty from tracking block changes and
copying them to the snapshot file. It's fairly small, though; not
enough to impact normal use.

> Finally, I have read the BSDCon paper that McKusick wrote, where he
> addressed the dual problems of not enough kernel memory (10 megabytes)
> to cache disk pages, and the system deadlocking that occurs with two
> snapshots. Is it true that both of the fixes he elucidated in that
> paper are built into what I see as FreeBSD 5.4 now?

That I do not know. There have been a number of deadlock fixes over the
past few years, and some a few months ago in particular, but I haven't
tracked them closely enough to know.

Scott
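For anyone setting up the kind of test discussed in this thread, the
usual procedure of the era was to take the snapshot with mksnap_ffs(8)
(or "mount -u -o snapshot") and then attach the snapshot file read-only
through a vnode-backed memory disk. A rough sketch, with made-up mount
points (/big, /mnt/snap):

```shell
# Take a snapshot of /big -- this is the slow step whose duration
# grows with filesystem size, as described above.
mkdir -p /big/.snap
mksnap_ffs /big /big/.snap/today

# Attach the snapshot file as a memory disk and mount it read-only;
# the mount shows the filesystem frozen at snapshot time.
md=$(mdconfig -a -t vnode -f /big/.snap/today)
mount -r /dev/"$md" /mnt/snap

# ... dump, back up, or inspect /mnt/snap here ...

umount /mnt/snap
mdconfig -d -u "$md"
rm /big/.snap/today    # removing the snapshot file releases it
```

Note that the snapshot keeps consuming space (and adds the per-write
copy-on-write overhead Scott mentions) for as long as the file exists,
so remove it once the backup or fsck is done.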