From owner-freebsd-fs@FreeBSD.ORG  Sun Nov 13 17:17:37 2005
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
X-Original-To: freebsd-fs@freebsd.org
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 6380116A420
	for <freebsd-fs@freebsd.org>; Sun, 13 Nov 2005 17:17:37 +0000 (GMT)
	(envelope-from scottl@samsco.org)
Received: from pooker.samsco.org (pooker.samsco.org [168.103.85.57])
	by mx1.FreeBSD.org (Postfix) with ESMTP id A64FE43D72
	for <freebsd-fs@freebsd.org>; Sun, 13 Nov 2005 17:17:26 +0000 (GMT)
	(envelope-from scottl@samsco.org)
Received: from [192.168.254.11] (junior.samsco.home [192.168.254.11])
	(authenticated bits=0)
	by pooker.samsco.org (8.13.4/8.13.4) with ESMTP id jADHH8SP053474;
	Sun, 13 Nov 2005 10:17:08 -0700 (MST)
	(envelope-from scottl@samsco.org)
Message-ID: <43777523.8020709@samsco.org>
Date: Sun, 13 Nov 2005 10:17:23 -0700
From: Scott Long <scottl@samsco.org>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.7.8) Gecko/20050615
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: delphij@delphij.net
References: <Pine.LNX.4.21.0511041531210.8180-100000@shell.dhp.com>	
	<436BDB99.5060907@samsco.org>
	<a78074950511130907g24c079c4gccd6c1d750d244da@mail.gmail.com>
In-Reply-To: <a78074950511130907g24c079c4gccd6c1d750d244da@mail.gmail.com>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, score=-1.4 required=3.8 tests=ALL_TRUSTED autolearn=failed 
	version=3.1.0
X-Spam-Checker-Version: SpamAssassin 3.1.0 (2005-09-13) on pooker.samsco.org
Cc: freebsd-fs@freebsd.org, user <user@dhp.com>
Subject: Re: UFS2 snapshots on large filesystems
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 13 Nov 2005 17:17:37 -0000

Xin LI wrote:
> On 11/5/05, Scott Long <scottl@samsco.org> wrote:
> 
>>The UFS snapshot code was written at a time when disks were typically
>>around 4-9GB in size, not 400GB in size =-)  Unfortunately, the amount
> 
> 
> s/size/cylinder groups/g :-)
> 
> 
>>of time it takes to do the initial snapshot bookkeeping scales linearly
>>with the size of the drive, and many people have reported that it takes
>>considerable amount of time (anywhere from several minutes to several
>>dozen minutes) on large drives/arrays like you describe.  So, you should
>>test and plan accordingly if you are interested in using them.
> 
> 
> I have some ideas about lazy snapshotting.  But unfortunately I don't
> have much time to implement a prototype ATM, and I think we really
> need a file system that is capable for:
>  - Handling large number of files in one directory (say, some sort of
> indexing mechanism, etc.  And yes, I know that this is somewhat
> insane, but the [ab]use is present in many large e-mail systems that
> uses mailbox)
>  - Effective recovery.  Personally I do not buy journalling much, and
> I think the problem could be resolved by something like WAFL did.
> 
> I think that JUFS would provide some help for (2), do you have some
> plan about (1)?
> 

I guess that UFS_DIRHASH doesn't give enough benefit for your situation?
The idea of doing alternate directory layouts (such as b-trees) has been
proposed a number of times.  Apparently there was an idea at one point
for UFS to generate a b-tree layout for a directory and save it on
disk as a cache.  The primary method of directory storage would remain
the traditional linear way so that compatibility is preserved, but OSes
that were aware of the cache could use it too.  There are still some
reserved flags and fields in UFS2 for doing this, in case you're
interested.  Since it requires double bookkeeping for link creation and
removal, I'm not sure how speedy it is for anything other than
VOP_LOOKUP operations.  An alternate idea I've had is to break with
compatibility and use b-trees or something similar as the native
format for UFS3 (along with native journalling and other things).
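
To make the cached-index idea concrete, here is a rough sketch in Python
rather than kernel C.  Everything in it is an assumption for
illustration: the Directory class and its method names are hypothetical,
and a sorted list searched with bisect stands in for a real on-disk
b-tree.  It only shows the shape of the trade-off: the linear list stays
authoritative, lookups can use the index, and every link or unlink has
to update both structures.

```python
import bisect


class Directory:
    """Toy directory: the linear list is authoritative, the index is a cache."""

    def __init__(self):
        # (name, inode) pairs in creation order; models the traditional layout.
        self.entries = []
        # Sorted (name, inode) pairs; a stand-in for the b-tree cache.
        self.index = []

    def lookup_linear(self, name):
        # What an index-unaware OS would do: scan every entry, O(n).
        for entry_name, inode in self.entries:
            if entry_name == name:
                return inode
        return None

    def lookup_indexed(self, name):
        # Binary search in the sorted cache, O(log n).
        # (name,) sorts just before any (name, inode) pair.
        i = bisect.bisect_left(self.index, (name,))
        if i < len(self.index) and self.index[i][0] == name:
            return self.index[i][1]
        return None

    def link(self, name, inode):
        if self.lookup_indexed(name) is not None:
            raise FileExistsError(name)
        self.entries.append((name, inode))        # bookkeeping 1: linear layout
        bisect.insort(self.index, (name, inode))  # bookkeeping 2: index cache

    def unlink(self, name):
        inode = self.lookup_indexed(name)
        if inode is None:
            raise FileNotFoundError(name)
        self.entries.remove((name, inode))        # bookkeeping 1: linear layout
        pos = bisect.bisect_left(self.index, (name, inode))
        del self.index[pos]                       # bookkeeping 2: index cache

    def rebuild_index(self):
        # The cache can always be regenerated from the linear entries,
        # e.g. after an index-unaware OS has modified the directory.
        self.index = sorted(self.entries)
```

Because the index can be rebuilt from the linear entries at any time, it
can be thrown away without losing data, which is what makes it a cache
rather than a format change.  The cost shows up in link() and unlink(),
which each touch both structures.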

Scott