From owner-freebsd-fs@FreeBSD.ORG Wed Jul 20 02:16:23 2005
From: Eric Anderson <anderson@centtech.com>
To: freebsd-fs@freebsd.org
Date: Tue, 19 Jul 2005 21:16:18 -0500
Subject: Re: Cluster Filesystem for FreeBSD - any interest?

Bakul Shah wrote:
[..snip..]
>> :) I understand.  Any nudging in the right direction here would be
>> appreciated.
>
> I'd probably start with modelling a single filesystem and how
> it maps to a sequence of disk blocks (*without* using any
> code or worrying about details of formats but capturing the
> essential elements).  I'd describe various operations in
> terms of preconditions and postconditions.  Then, I'd extend
> the model to deal with redundancy and so on.  Then I'd model
> various failure modes, etc.  If you are interested _enough_
> we can take this offline and try to work something out.  You
> may even be able to use perl to create an `executable'
> specification :-)

I've done some research, and read some books/articles/white papers since
I started this thread.

First, porting GFS might be a more universal effort, and might be
'easier'.  However, that doesn't get us a clustered filesystem with a BSD
license (something that sounds good to me).

Clustering UFS2 would be cool.  Here's what I'm looking for:

A clustered filesystem (or layer?) that allows all machines in the
cluster to see the same filesystem as if it were local, with read/write
access.  The cluster will need cache coherency across all nodes, and
there will need to be some sort of lock manager on each node to
communicate with all the other nodes to coordinate file locking.  The
filesystem will have to support journaling.

I'm wondering if one could make a pseudo filesystem, something like
nullfs, that sits on top of a UFS2 partition and essentially monitors
all VFS operations to the filesystem, communicating them over TCP/IP to
the other nodes in the cluster.  That way, each node would know which
inodes and blocks are changing, so it can flush those buffers, and it
would know which blocks (or partial blocks) to view as locked when
another node locks them.
This could be done via multicast, so all nodes in the cluster would have
to be running a distributed lock manager daemon (dlmd) to coordinate it.
I also think the UFS2 filesystem would need a bit set upon mount that
marks the mount as a 'clustered' filesystem mount.  The reason for that
is so that we could modify mount to only mount 'clustered' filesystems
(mount -o clustered) if the dlmd was running, since that would be a
dependency for stable, coherent file control on a mount point.

Does anyone have any insight as to whether a layer would work?  Or maybe
I'm way off here and I need to do more reading :)

Eric

--
Eric Anderson   Sr. Systems Administrator   Centaur Technology
A lost ounce of gold may be found, a lost moment of time never.
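[To make the multicast piece of the idea above concrete, here is a small
C sketch of the announcement such a dlmd might send when its node
dirties an inode.  The group address, port, and wire format (struct
dlm_msg, dlm_announce) are invented for illustration, and byte-order
handling of the 64-bit fields is omitted.]

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdint.h>
    #include <string.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <unistd.h>

    struct dlm_msg {                   /* hypothetical wire format */
            uint32_t ino;              /* inode that changed */
            uint64_t blkstart, blkend; /* dirtied block range */
            uint8_t  op;               /* 0=invalidate, 1=lock, 2=unlock */
    };

    /* Announce a change to the cluster so peers can flush stale
     * buffers or honor the lock; every dlmd joins the same group. */
    int
    dlm_announce(uint32_t ino, uint64_t start, uint64_t end, uint8_t op)
    {
            struct sockaddr_in grp;
            struct dlm_msg msg = { htonl(ino), start, end, op };
            unsigned char ttl = 1;     /* stay on the local segment */
            int s = socket(AF_INET, SOCK_DGRAM, 0);
            ssize_t n;

            if (s < 0)
                    return (-1);
            memset(&grp, 0, sizeof(grp));
            grp.sin_family = AF_INET;
            grp.sin_addr.s_addr = inet_addr("239.0.0.42"); /* example group */
            grp.sin_port = htons(4242);                    /* example port */
            setsockopt(s, IPPROTO_IP, IP_MULTICAST_TTL, &ttl, sizeof(ttl));

            n = sendto(s, &msg, sizeof(msg), 0,
                (struct sockaddr *)&grp, sizeof(grp));
            close(s);
            return (n == (ssize_t)sizeof(msg) ? 0 : -1);
    }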
From owner-freebsd-fs@FreeBSD.ORG Wed Jul 20 02:37:26 2005
From: yf-263 <yfyoufeng@263.net>
To: Eric Anderson <anderson@centtech.com>
Cc: freebsd-fs@freebsd.org
Date: Wed, 20 Jul 2005 10:35:46 +0800
Subject: Re: Cluster Filesystem for FreeBSD - any interest?

On Tue, 2005-07-19 at 21:16 -0500, Eric Anderson wrote:
> [..snip..]
> First, porting GFS might be a more universal effort, and might be
> 'easier'.  However, that doesn't get us a clustered filesystem with a
> BSD license (something that sounds good to me).

It has been said it would be a seven man-month effort for a filesystem
expert.

> Clustering UFS2 would be cool.  Here's what I'm looking for:

That is exactly how "Lustre" does its work, though it builds on Ext3;
Lustre's targets are described at http://www.lustre.org/docs/SGSRFP.pdf .

> [..snip: Eric's description of a clustered layer, cache coherency,
> and a distributed lock manager daemon..]

--
yf-263
Unix-driver.org
Zander" To: freebsd-current@freebsd.org Message-ID: <20050720094830.GR782@marvin.riggiland.au> References: <42DD64AB.3000605@centtech.com> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-15 Content-Disposition: inline In-Reply-To: <42DD64AB.3000605@centtech.com> Organization: Chaotic X-PGP-KeyID: 0xC85996CD X-PGP-URI: http://blackhole.pca.dfn.de:11371/pks/lookup?op=get&search=0xC85996CD X-PGP-Fingerprint: 4F59 75B4 4CE3 3B00 BC61 5400 8DD4 8929 C859 96CD X-Mailer: Marvin Mail (Build 1121849089) X-Operating-System: FreeBSD 5.4-STABLE Cc: freebsd-fs@freebsd.org Subject: Re: mksnap_ffs takes 4-5 minutes? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Jul 2005 09:48:46 -0000 Hi, On Tue, 19. Jul 2005, at 15:38 -0500, Eric Anderson wrote according to [mksnap_ffs takes 4-5 minutes?]: > This time, when I ran mksnap_ffs, the command took nearly 5 minutes [...] > Filesystem 1K-blocks Used Avail Capacity iused ifree > /dev/da1s1d 406234604 91799154 281936682 25% 1300303 51197103 I have a fs here with similar (but smaller size) parameters concerning inode density and usage: Filesystem 1K-blocks Used Avail Capacity iused ifree /dev/ad1s1d 113390248 92926924 18195520 84% 248434 14424460 time mksnap_ffs'ing gives the following result: 0.007u 1.902s 1:51.94 1.6% 5+217k 4493+8646io 0pf+0w It takes almost 2 minutes which seem to perform similarly to your 5 minutes. (There was not a single file opened when snapping.) I'd expect snapping to speed up by reducing the inode number when doing newfs, but I haven't verified this right now. Riggs (f'up to freebsd-fs) -- - "[...] I talked to the computer at great length and -- explained my view of the Universe to it" said Marvin. --- And what happened?" pressed Ford. ---- "It committed suicide." said Marvin. From owner-freebsd-fs@FreeBSD.ORG Wed Jul 20 11:57:26 2005 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8353616A41F; Wed, 20 Jul 2005 11:57:26 +0000 (GMT) (envelope-from anderson@centtech.com) Received: from mh1.centtech.com (moat3.centtech.com [207.200.51.50]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1E4C543D45; Wed, 20 Jul 2005 11:57:25 +0000 (GMT) (envelope-from anderson@centtech.com) Received: from [10.177.171.220] (neutrino.centtech.com [10.177.171.220]) by mh1.centtech.com (8.13.1/8.13.1) with ESMTP id j6KBvL43054377; Wed, 20 Jul 2005 06:57:25 -0500 (CDT) (envelope-from anderson@centtech.com) Message-ID: <42DE3C1F.9070704@centtech.com> Date: Wed, 20 Jul 2005 06:57:19 -0500 From: Eric Anderson User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.7.8) Gecko/20050603 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "Thomas E. Zander" References: <42DD64AB.3000605@centtech.com> <20050720094830.GR782@marvin.riggiland.au> In-Reply-To: <20050720094830.GR782@marvin.riggiland.au> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.82/984/Tue Jul 19 04:16:09 2005 on mh1.centtech.com X-Virus-Status: Clean Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org Subject: Re: mksnap_ffs takes 4-5 minutes? 
From owner-freebsd-fs@FreeBSD.ORG Wed Jul 20 11:57:26 2005
From: Eric Anderson <anderson@centtech.com>
To: "Thomas E. Zander" <riggs@rrr.de>
Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org
Date: Wed, 20 Jul 2005 06:57:19 -0500
Subject: Re: mksnap_ffs takes 4-5 minutes?

Thomas E. Zander wrote:
> [...]
> It takes almost 2 minutes, which seems in line with your 5 minutes.
> (There was not a single file open when snapping.)
>
> I'd expect snapping to speed up by reducing the number of inodes when
> doing newfs, but I haven't verified this.

A 2TB filesystem with the standard newfs options takes about 30 minutes
to mksnap.  That's really unusable, because the filesystem is suspended
for so long.  Even empty 2TB filesystems take forever, so it's related
to the number of inodes.

How can we make this snappier?

Eric

--
Eric Anderson   Sr. Systems Administrator   Centaur Technology
From owner-freebsd-fs@FreeBSD.ORG Wed Jul 20 12:34:31 2005
From: Eric Anderson <anderson@centtech.com>
To: yfyoufeng@263.net
Cc: freebsd-fs@freebsd.org
Date: Wed, 20 Jul 2005 07:34:18 -0500
Subject: Re: Cluster Filesystem for FreeBSD - any interest?

yf-263 wrote:
> On Tue, 2005-07-19 at 21:16 -0500, Eric Anderson wrote:
>> First, porting GFS might be a more universal effort, and might be
>> 'easier'.  However, that doesn't get us a clustered filesystem with
>> a BSD license (something that sounds good to me).
>
> It has been said it would be a seven man-month effort for a
> filesystem expert.

Then we need to get a small group together and get started..

>> Clustering UFS2 would be cool.  Here's what I'm looking for:
>
> That is exactly how "Lustre" does its work, though it builds on Ext3;
> Lustre's targets are described at
> http://www.lustre.org/docs/SGSRFP.pdf .

Yes, I've read about Lustre.  I like the GFS model much better.

Eric

--
Eric Anderson   Sr. Systems Administrator   Centaur Technology
Zander" To: Eric Anderson Message-ID: <20050720130523.GT782@marvin.riggiland.au> References: <42DD64AB.3000605@centtech.com> <20050720094830.GR782@marvin.riggiland.au> <42DE3C1F.9070704@centtech.com> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-15 Content-Disposition: inline In-Reply-To: <42DE3C1F.9070704@centtech.com> Organization: Chaotic X-PGP-KeyID: 0xC85996CD X-PGP-URI: http://blackhole.pca.dfn.de:11371/pks/lookup?op=get&search=0xC85996CD X-PGP-Fingerprint: 4F59 75B4 4CE3 3B00 BC61 5400 8DD4 8929 C859 96CD X-Mailer: Marvin Mail (Build 1121863362) X-Operating-System: FreeBSD 5.4-STABLE Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org Subject: Re: mksnap_ffs takes 4-5 minutes? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Jul 2005 13:05:50 -0000 Hi, On Wed, 20. Jul 2005, at 6:57 -0500, Eric Anderson wrote according to [Re: mksnap_ffs takes 4-5 minutes?]: > A 2tb filesystem with the standard newfs options takes about 30 minutes > to mksnap.. That's unusable really, because the filesystem is suspended > for so long. Even empty 2tb filesystems take forever, so it's related > to the amount of inodes. > > How can we make this snappier? For the moment we can workaround by setting inode density appropriately when creating the fs. However this is only feasible if you know what your users are going to do with the fs; it also doesn't help when you *need* a large fs containing many small files. In the long run, dynamic inode (de)allocation would be nice to have. Also...what about the 'preparation' time for snapping? IIRC McKusick said that the lion's share of snapping time is used to delay pending transactions before actually doing the snap. There are quite some scenarios in which you can be certain that there is no file opened for writing, so a snap could be taken immediately. Would it be feasible to implement this feature? Or am I completely wrong? Riggs -- - "[...] I talked to the computer at great length and -- explained my view of the Universe to it" said Marvin. --- And what happened?" pressed Ford. ---- "It committed suicide." said Marvin. From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 13:46:04 2005 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 617B016A420 for ; Thu, 21 Jul 2005 13:46:04 +0000 (GMT) (envelope-from nb@ravenbrook.com) Received: from raven.ravenbrook.com (raven.ravenbrook.com [193.82.131.18]) by mx1.FreeBSD.org (Postfix) with ESMTP id 10AD843D81 for ; Thu, 21 Jul 2005 13:45:55 +0000 (GMT) (envelope-from nb@ravenbrook.com) Received: from thrush.ravenbrook.com (thrush.ravenbrook.com [193.112.141.145]) by raven.ravenbrook.com (8.12.6p3/8.12.6) with ESMTP id j6LDjqXi030013 for ; Thu, 21 Jul 2005 14:45:53 +0100 (BST) (envelope-from nb@ravenbrook.com) Received: from thrush.ravenbrook.com (localhost [127.0.0.1]) by thrush.ravenbrook.com (8.12.9p2/8.12.9) with ESMTP id j6LDjqFM073798 for ; Thu, 21 Jul 2005 14:45:52 +0100 (BST) (envelope-from nb@thrush.ravenbrook.com) From: Nick Barnes To: freebsd-fs@freebsd.org Date: Thu, 21 Jul 2005 14:45:52 +0100 Message-ID: <73797.1121953552@thrush.ravenbrook.com> Sender: nb@ravenbrook.com Subject: CGSIZE inaccuracy? 
From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 13:46:04 2005
From: Nick Barnes <nb@ravenbrook.com>
To: freebsd-fs@freebsd.org
Date: Thu, 21 Jul 2005 14:45:52 +0100
Subject: CGSIZE inaccuracy?

I'm running 4.9-RELEASE, and writing some tools to navigate around my
UFS filesystems to figure out what I have lost when I get bad blocks.
There's not much detailed online documentation of UFS beyond fs(5) and
fs.h, so I'm feeling my way through the sources.

Looking at fs.h, I see this:

/*
 * The size of a cylinder group is calculated by CGSIZE. The maximum size
 * is limited by the fact that cylinder groups are at most one block.
 * Its size is derived from the size of the maps maintained in the
 * cylinder group and the (struct cg) size.
 */
#define CGSIZE(fs) \
    /* base cg */       (sizeof(struct cg) + sizeof(int32_t) + \
    /* blktot size */   (fs)->fs_cpg * sizeof(int32_t) + \
    /* blks size */     (fs)->fs_cpg * (fs)->fs_nrpos * sizeof(int16_t) + \
    /* inode map */     howmany((fs)->fs_ipg, NBBY) + \
    /* block map */     howmany((fs)->fs_cpg * (fs)->fs_spc / NSPF(fs), NBBY) + \
    /* if present */    ((fs)->fs_contigsumsize <= 0 ? 0 : \
    /* cluster sum */   (fs)->fs_contigsumsize * sizeof(int32_t) + \
    /* cluster map */   howmany((fs)->fs_cpg * (fs)->fs_spc / NSPB(fs), NBBY)))

In a typical filesystem (fs_fsize = 2048, fs_bsize = 16384,
fs_ipg = 22528, fs_cpg = 89, fs_spc = 4096, fs_nrpos = 1,
fs_contigsumsize = 7), the parts of this sum add up like this:

    172  struct cg
      4  int32_t
    356  blktot: free blocks per cylinder
    178  blks: free blocks per rpos per cylinder
   2816  inode map, one bit per inode
  11392  block map, one bit per fragment
     28  cluster summary, one int32_t per contigsumsize
   1424  cluster map, one bit per block
  -----
  16370  CGSIZE

However, using the cg_* macros from fs.h (e.g. cg_clustersum), I get
offsets like this:

   base   limit    size
      0-    168     168  struct cg less cg_space
    168-    524     356  cg_blktot (free blocks per cylinder)
    524-    702     178  cg_blks (free blocks per rpos per cylinder)
    702-   3518    2816  cg_inosused (inode bitmap)
   3518-  14910   11392  cg_blksfree (fragment bitmap)
  14910-  14912       2  padding
  14912-  14940      28  cg_clustersum (block cluster summaries)
  14940-  16364    1424  cg_clusteroff (block bitmap)
  16364                  nextfreeoff

There are three discrepancies here:

  +4: sizeof(struct cg) is used instead of offsetof(cg_space);
  +4: sizeof(int32_t) is added, mysteriously;
  -2: the padding for cg_clustersum is disregarded.

I don't *think* that this matters, because CGSIZE is only apparently
used (by newfs and fsck) as a conservative approximation of the size of
the cylinder group header.  But it seems odd, given that a correct
calculation is fairly easy.

Nick Barnes
Ravenbrook Limited
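[For comparison, the exact header size those offsets imply can be
computed directly.  This is a sketch assuming the 4.x field names
(cg_space as the start of the variable-length area) and the usual
sys/param.h helpers; it folds in all three fixes: offsetof instead of
sizeof, no stray int32_t, and the int32_t alignment padding before
cg_clustersum.]

    #include <sys/param.h>   /* howmany, roundup, NBBY */
    #include <stddef.h>      /* offsetof */
    #include <ufs/ffs/fs.h>  /* struct fs, struct cg, NSPF, NSPB */

    /* Exact size of the cylinder-group header, laid out the way the
     * cg_* accessor macros actually place the maps. */
    static size_t
    cg_exact_size(const struct fs *fs)
    {
            size_t sz = offsetof(struct cg, cg_space);  /* 168, not 172 */

            sz += fs->fs_cpg * sizeof(int32_t);                /* cg_blktot */
            sz += fs->fs_cpg * fs->fs_nrpos * sizeof(int16_t); /* cg_blks */
            sz += howmany(fs->fs_ipg, NBBY);                   /* cg_inosused */
            sz += howmany(fs->fs_cpg * fs->fs_spc / NSPF(fs), NBBY); /* cg_blksfree */
            if (fs->fs_contigsumsize > 0) {
                    sz = roundup(sz, sizeof(int32_t)); /* the 2-byte padding */
                    sz += fs->fs_contigsumsize * sizeof(int32_t); /* cg_clustersum */
                    sz += howmany(fs->fs_cpg * fs->fs_spc / NSPB(fs), NBBY); /* cg_clusteroff */
            }
            return (sz);  /* 16364 for the example fs, vs. CGSIZE's 16370 */
    }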
From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 19:10:37 2005
From: Igor Shmukler <igor.shmukler@gmail.com>
To: fs@freebsd.org, hackers@freebsd.org, dillon@apollo.backplane.com
Date: Thu, 21 Jul 2005 15:10:30 -0400
Subject: per file lock list

Hi,

We have a question: how do we get all POSIX locks for a given file?

As far as I know, the existing API does not allow retrieving all of a
file's locks.  Therefore, we need to use kernel-internal structures to
get the applied locks.  Unfortunately, the head of the list of file
locks is attached to the inode rather than the vnode.  As a result, it
is much harder to get at the lock list head, because we need to know
the exact inode type hidden behind the vnode.

Of course, the problem could be resolved in a hackish way: we could
take the address of the VOP_ADVLOCK() method and compare it with all
known FS methods that handle this VOP operation (ufs_advlock, etc.),
then apply the proper cast to vnode->v_data to get a valid inode.
However, that would be a last resort.

So the question: is there an elegant way to get the lock list for a
given file?

Thank you in advance.
From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 19:26:05 2005
From: Matthew Dillon <dillon@apollo.backplane.com>
To: Igor Shmukler <igor.shmukler@gmail.com>
Cc: hackers@freebsd.org, fs@freebsd.org
Date: Thu, 21 Jul 2005 12:26:05 -0700 (PDT)
Subject: Re: per file lock list

:Hi,
:
:We have a question: how do we get all POSIX locks for a given file?
:...
:So the question: is there an elegant way to get the lock list for a
:given file?
:
:Thank you in advance.

    You can use F_GETLK to iterate through all posix locks held on a
    file.  From man fcntl:

    F_GETLK    Get the first lock that blocks the lock description
               pointed to by the third argument, arg, taken as a
               pointer to a struct flock (see above).  The information
               retrieved overwrites the information passed to fcntl()
               in the flock structure.  If no lock is found that would
               prevent this lock from being created, the structure is
               left unchanged by this function call except for the lock
               type which is set to F_UNLCK.

    So what you do is specify a lock description that covers the whole
    file and call F_GETLK.  You then use the result to shrink the lock
    description to a range that starts just past the returned lock for
    the next call.  You continue iterating until F_GETLK tells you that
    there are no more locks.

                    -Matt
                    Matthew Dillon
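[The iteration Matt describes, written out as a minimal C sketch with
only basic error handling.  One caveat: F_GETLK never reports locks
held by the calling process itself, since those cannot conflict with
the probe.]

    #include <sys/types.h>
    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Print every POSIX lock other processes hold on fd, by repeatedly
     * asking "what blocks a write lock from pos to EOF?" and advancing
     * past each answer. */
    static void
    list_locks(int fd)
    {
            struct flock fl;
            off_t pos = 0;

            for (;;) {
                    fl.l_type = F_WRLCK;  /* conflicts with any lock */
                    fl.l_whence = SEEK_SET;
                    fl.l_start = pos;
                    fl.l_len = 0;         /* zero length = to EOF */

                    if (fcntl(fd, F_GETLK, &fl) == -1) {
                            perror("fcntl(F_GETLK)");
                            return;
                    }
                    if (fl.l_type == F_UNLCK)
                            break;        /* no more locks past pos */

                    printf("pid %d holds a %s lock at %jd, len %jd\n",
                        (int)fl.l_pid,
                        fl.l_type == F_WRLCK ? "write" : "read",
                        (intmax_t)fl.l_start, (intmax_t)fl.l_len);

                    if (fl.l_len == 0)
                            break;        /* that lock runs to EOF */
                    pos = fl.l_start + fl.l_len;
            }
    }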
From owner-freebsd-fs@FreeBSD.ORG Fri Jul 22 12:16:57 2005
From: Eric Anderson <anderson@centtech.com>
To: "Thomas E. Zander" <riggs@rrr.de>
Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org
Date: Fri, 22 Jul 2005 07:16:30 -0500
Subject: Re: mksnap_ffs takes 4-5 minutes?

Thomas E. Zander wrote:
> For the moment we can work around it by setting the inode density
> appropriately when creating the fs.  However, this is only feasible
> if you know what your users are going to do with the fs; it also
> doesn't help when you *need* a large fs containing many small files.
> In the long run, dynamic inode (de)allocation would be nice to have.

It doesn't seem to make a difference how much of the filesystem is
actually used.  It seems to depend on how many inodes there are, or
maybe more accurately, how many cylinder groups.

> Also... what about the 'preparation' time for snapping?  IIRC
> McKusick said that the lion's share of snapping time is spent letting
> pending transactions drain before actually taking the snap.  There
> are quite a few scenarios in which you can be certain that no file is
> open for writing, so a snap could be taken immediately.  Would it be
> feasible to implement this?  Or am I completely wrong?

The snap seemed to suspend the filesystem nearly immediately, and kept
it suspended for quite some time - I would say probably more than half
the total time.  For snapshots to be very useful, they must work on
large filesystems (100GB+) in a reasonable amount of time (a few
seconds would be ok).  I know for certain that one test filesystem
(2TB) had nothing on it - no processes using the filesystem at all -
and it took well over an hour to run mksnap on it.

Maybe mksnap is broken somehow?
Eric

--
Eric Anderson   Sr. Systems Administrator   Centaur Technology

From owner-freebsd-fs@FreeBSD.ORG Fri Jul 22 13:53:34 2005
From: Eric Masson <e-masson@kisoft-services.com>
To: Eric Anderson <anderson@centtech.com>
Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org
Date: Fri, 22 Jul 2005 15:53:39 +0200
Subject: Re: mksnap_ffs takes 4-5 minutes?

Eric Anderson writes:

Hi,

> I know for certain that one test filesystem (2TB) had nothing on it,
> no processes using the filesystem at all, and it took well over an
> hour to run mksnap on it.

I ran some tests on a Dell PowerVault 725N with 5.2 about a year and a
half ago.  Snapshots on a 700GB filesystem were taking roughly half an
hour.

> Maybe mksnap is broken somehow?

I don't think broken is the right term, but it surely lacks
optimization on huge filesystems when compared to snapshots on a NetApp
filer (5 seconds on a terabyte volume, for example).  For a medium-size
fs like the 80GB one I'm using here on my desktop box, approximately 30
seconds seems reasonable to me.  A shorter time would be welcome, sure ;)

Éric Masson

--
Misinterpretation.  The content of the signature must respect the
newsgroup charter on *all* subjects - advertising as well as
Netiquette.  The four lines of a signature are not a lawless zone.
        -+- Lapin in: Oui-Oui casque bleu Neuneuland -+-
From owner-freebsd-fs@FreeBSD.ORG Fri Jul 22 16:52:26 2005
From: Chris Dillon <cdillon@wolves.k12.mo.us>
To: Eric Anderson <anderson@centtech.com>
Cc: freebsd-fs@freebsd.org
Date: Fri, 22 Jul 2005 11:25:40 -0500 (CDT)
Subject: Re: mksnap_ffs takes 4-5 minutes?

On Fri, 22 Jul 2005, Eric Anderson wrote:

> The snap seemed to suspend the filesystem nearly immediately, and
> kept it suspended for quite some time - I would say probably more
> than half the total time.  For snapshots to be very useful, they must
> work on large filesystems (100GB+) in a reasonable amount of time (a
> few seconds would be ok).  I know for certain that one test
> filesystem (2TB) had nothing on it - no processes using the
> filesystem at all - and it took well over an hour to run mksnap on
> it.

Just another datapoint -- I take daily snapshots of a 270GB filesystem
and it takes 3 to 4 minutes (not sure down to the second).  I used to
take multiple snapshots during the day, but suspending the filesystem
for several minutes at peak times wasn't working out (and sometimes
seemed to cause complete system hangs), so I went to once per day
during off-hours.  Making the snapshots seems to be mostly I/O bound,
and this is on a system with a fairly fast RAID5 array of 10K RPM SCSI
drives.  The suspension of the filesystem also seems immediate to me,
so most of the time is apparently spent actually building the snapshot.
Filesystem   1K-blocks      Used     Avail Capacity   iused   ifree %iused  Mounted on
/dev/da1s1a  272092768  16863968 233461380     7%    155146 8435700    2%   /userspace

Jul 20 00:14:11 rshome root: snapshot: daily.0 snapshot on filesystem /userspace made (duration: 4 min)
Jul 21 00:14:01 rshome root: snapshot: daily.0 snapshot on filesystem /userspace made (duration: 3 min)
Jul 22 00:14:04 rshome root: snapshot: daily.0 snapshot on filesystem /userspace made (duration: 4 min)

I'm using Ralf S. Engelschall's snapshot management scripts, though I
know that has no effect on the time it takes to create a snapshot.

--
Chris Dillon - cdillon(at)wolves.k12.mo.us
FreeBSD: The fastest, most open, and most stable OS on the planet
- Available for IA32, IA64, AMD64, PC98, Alpha, and UltraSPARC architectures
- PowerPC, ARM, MIPS, and S/390 under development
- http://www.freebsd.org

A: Because it reverses the logical flow of conversation.
Q: Why is putting a reply at the top of the message frowned upon?