From owner-freebsd-fs@FreeBSD.ORG  Fri Jan  1 23:21:49 2010
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 925D9106566B
	for <freebsd-fs@freebsd.org>; Fri,  1 Jan 2010 23:21:49 +0000 (UTC)
	(envelope-from jroberson@jroberson.net)
Received: from mail-gx0-f218.google.com (mail-gx0-f218.google.com
	[209.85.217.218])
	by mx1.freebsd.org (Postfix) with ESMTP id 5394C8FC19
	for <freebsd-fs@freebsd.org>; Fri,  1 Jan 2010 23:21:49 +0000 (UTC)
Received: by gxk10 with SMTP id 10so13024146gxk.3
	for <freebsd-fs@freebsd.org>; Fri, 01 Jan 2010 15:21:37 -0800 (PST)
Received: by 10.91.164.22 with SMTP id r22mr4101321ago.64.1262388090012;
	Fri, 01 Jan 2010 15:21:30 -0800 (PST)
Received: from ?10.0.1.198? (udp022762uds.hawaiiantel.net [72.234.79.107])
	by mx.google.com with ESMTPS id 14sm7839679gxk.6.2010.01.01.15.21.27
	(version=SSLv3 cipher=RC4-MD5); Fri, 01 Jan 2010 15:21:29 -0800 (PST)
Date: Fri, 1 Jan 2010 13:23:35 -1000 (HST)
From: Jeff Roberson <jroberson@jroberson.net>
X-X-Sender: jroberson@desktop
To: Ronald Klop <ronald-freebsd8@klop.yi.org>
In-Reply-To: <op.u5o4f4px8527sy@82-170-177-25.ip.telfort.nl>
Message-ID: <alpine.BSF.2.00.1001011320070.1027@desktop>
References: <32CA2B73-3412-49DD-9401-4773CC73BED0@patpro.net>
	<alpine.GSO.2.01.0912231031450.1586@freddy.simplesystems.org>
	<4B3283F2.7060804@barryp.org>
	<3ea87f5f62bb8ba30d798d4605a64c83@localhost>
	<op.u5o4f4px8527sy@82-170-177-25.ip.telfort.nl>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-fs@freebsd.org, patpro <patpro@patpro.net>
Subject: Re: snapshot implementation
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 01 Jan 2010 23:21:49 -0000

On Tue, 29 Dec 2009, Ronald Klop wrote:

> On Fri, 25 Dec 2009 15:29:53 +0100, patpro <patpro@patpro.net> wrote:
>
>> 
>> On Wed, 23 Dec 2009 14:56:18 -0600, Barry Pederson <bp@barryp.org> wrote:
>>> "...there's virtually no overhead at all due to the copy-on-write
>>> architecture. In fact, sometimes it is faster to take a snapshot rather
>>> than free the blocks containing the old data!"
>>> 
>>> That's certainly not the case with UFS snapshots, which can take a long
>>> time to complete (we're talking freezing your machine's disk activity
>>> for many minutes), and are limited to 20 total.
>> 
>> 
>> UFS uses copy-on-write. But you say many minutes to complete? Aren't you
>> talking about dump(1), which uses a snapshot as the basis for dumping a
>> live file system?
>> I agree, UFS snapshot creation is not lightning-fast, but many minutes
>> seems like a lot to me, and I have never experienced such a long creation
>> time.
>
> As far as I know UFS snapshots need to create a list of currently in use 
> blocks. This is O(n) on the size of the FS and pauses the FS during the 
> snapshot. On large FS's this can take a long time.
> ZFS always maintains this list so it only needs to mark this list as readonly 
> to create a snapshot. This is O(1).
>

This is not quite right.  It's the copying of cg (cylinder group) blocks that
takes so long.  Each cg block is limited in size to one filesystem block,
which caps how much disk a single cylinder group can describe, so very
large drives have quite a lot of them.
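
A back-of-the-envelope sketch of why the count grows so quickly.  It assumes
a 16 KiB block and 2 KiB fragment size, and it pretends the whole cg block is
a one-bit-per-fragment free map.  Real cg blocks also hold a header and inode
maps, so real groups are smaller and the true count is higher.  The function
name and the sizes are illustrative only, not taken from the FFS code.

```python
def min_cylinder_groups(fs_bytes, block_size=16384, frag_size=2048):
    """Lower bound on the number of cylinder groups for a filesystem.

    Assumption: the entire cg block is a fragment bitmap (one bit per
    fragment), which overestimates how much disk one group can cover.
    """
    max_frags_per_cg = block_size * 8             # bits available in one cg block
    max_cg_bytes = max_frags_per_cg * frag_size   # disk covered by one group
    return -(-fs_bytes // max_cg_bytes)           # ceiling division


TIB = 1 << 40
for size_tib in (1, 2, 8):
    n = min_cylinder_groups(size_tib * TIB)
    copy_mib = n * 16384 / (1 << 20)              # data moved by one full copy pass
    print(f"{size_tib} TiB: at least {n} cg blocks, "
          f"{copy_mib:.0f} MiB per full copy pass")
```

Even under these generous assumptions the number of cg blocks scales
linearly with the size of the drive, and so does the cost of copying all of
them.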

When we create a snapshot we first make a copy of all cg blocks, then we
suspend the filesystem and sync it, and then we copy all dirtied cg
blocks and unsuspend.  We seem to be copying an excessive number of cg
blocks once suspended, so there may be a bug there.  A relatively
straightforward improvement would be to COW the cg blocks as well, rather
than copying them in a separate step.
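
The difference between the two approaches can be sketched with a toy model.
This is only an illustration under simplifying assumptions, not the FFS
implementation: cg blocks are dictionary entries, every write passed to
two_pass_snapshot() is assumed to land after that cg was already copied, and
the names two_pass_snapshot and CowSnapshot are made up for this example.

```python
def two_pass_snapshot(cgs, writes_during_copy):
    """Copy every cg while live, then recopy the dirtied ones while suspended.

    Returns (snapshot, copies_made_live, copies_made_suspended).
    """
    snapshot = dict(cgs)                 # pass 1: copy all cg blocks, fs still live
    live_copies = len(cgs)

    dirtied = set()
    for idx, value in writes_during_copy:
        cgs[idx] = value                 # fs keeps changing during pass 1
        dirtied.add(idx)

    # The filesystem would be suspended and synced here.
    for idx in dirtied:                  # pass 2: recopy only the dirtied cg blocks
        snapshot[idx] = cgs[idx]
    return snapshot, live_copies, len(dirtied)


class CowSnapshot:
    """Taking the snapshot copies nothing; a cg is saved on its first write."""

    def __init__(self, cgs):
        self.cgs = cgs
        self.saved = {}                  # old contents, filled in lazily

    def write(self, idx, value):
        if idx not in self.saved:        # first write since the snapshot
            self.saved[idx] = self.cgs[idx]
        self.cgs[idx] = value

    def read_snapshot(self, idx):
        return self.saved.get(idx, self.cgs[idx])


cgs = {i: f"cg{i}-v0" for i in range(8)}
_, live, suspended = two_pass_snapshot(dict(cgs), [(2, "a"), (5, "b"), (2, "c")])
print(f"two-pass: {live} copies live, {suspended} while suspended")

cow = CowSnapshot(dict(cgs))
cow.write(3, "new")
print(f"COW: {len(cow.saved)} copy so far, snapshot still sees {cow.read_snapshot(3)}")
```

In the two-pass model the up-front cost is proportional to the total number
of cg blocks, while in the COW model it is proportional to the number of cg
blocks actually modified after the snapshot.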

In theory, there's no reason the COW snapshot mechanism can't be fast.
It's just a matter of the practical implementation.

Thanks,
Jeff

> Ronald.