From owner-freebsd-questions@FreeBSD.ORG Thu Oct 20 21:23:20 2005
Return-Path:
X-Original-To: freebsd-questions@freebsd.org
Delivered-To: freebsd-questions@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id D35CD16A41F
	for ; Thu, 20 Oct 2005 21:23:20 +0000 (GMT)
	(envelope-from gayn.winters@bristolsystems.com)
Received: from bristolsystems.com (h-68-167-239-98.lsanca54.covad.net [68.167.239.98])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 7CDBA43D68
	for ; Thu, 20 Oct 2005 21:23:20 +0000 (GMT)
	(envelope-from gayn.winters@bristolsystems.com)
Received: from workdog ([192.168.1.201])
	by bristolsystems.com (8.11.6/8.11.6) with ESMTP id j9KLNEn10535;
	Thu, 20 Oct 2005 14:23:14 -0700
From: "Gayn Winters"
To: "'user'"
Date: Thu, 20 Oct 2005 14:22:26 -0700
Message-ID: <004701c5d5bc$79cfb2f0$c901a8c0@workdog>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook, Build 10.0.4024
In-Reply-To:
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2180
Importance: Normal
Cc: "'Andrew P.'" , freebsd-questions@freebsd.org
Subject: RE: FreeBSD UFS2 snapshots, and math ... - resolved, but two more Qs
X-BeenThere: freebsd-questions@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: gayn.winters@bristolsystems.com
List-Id: User questions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 20 Oct 2005 21:23:21 -0000

> -----Original Message-----
> From: user [mailto:user@dhp.com]
> Sent: Thursday, October 20, 2005 1:51 PM
> To: Gayn Winters
> Cc: 'Andrew P.'; freebsd-questions@freebsd.org
> Subject: RE: FreeBSD UFS2 snapshots, and math ... - resolved, but two more Qs
>
> Folks,
>
> On Thu, 20 Oct 2005, Gayn Winters wrote:
>
> > > Imagine that each data block is marked with labels
> > > on change.
> > > It doesn't matter how many labels there
> > > are, there will be only one data block saved.
> >
> > In trying to follow this thread, I started looking around for a precise
> > definition of snapshot. man mksnap_ffs wasn't too helpful, and googling
> > for "snapshot" etc. wasn't fruitful. I'm guessing that the original
> > author of the thread (user at dhp.com) may also need such a definition.
> > Can someone provide a pointer to a specification or at least an
> > RFC-like paper?
>
> I found one:
>
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/ufs/ffs/README.snapshot?rev=1.4
>
> and further, I did some tests and discovered that what I was being told
> (by you folks) was indeed correct.
>
> No matter how many snapshots you have, the changes in blocks since the
> time before the first snapshot are only recorded in one of them. That is
> to say, if I do the following:
>
> - create 4 1gig /dev/zero filled files
> - create a snapshot
> - overwrite one of those 1gig files with /dev/random
>
> My free space will have decreased by 1gig. So far so good.
>
> If I then:
>
> - create a second snapshot
> - overwrite a different 1gig file with /dev/random
>
> My free space merely decreases by another 1gig. It makes sense to me now
> because it has occurred to me that since the second file had not changed
> between the creation of the first and second snapshot, there is no reason
> for _both_ snapshots to _both_ say "this 1gig random file used to be
> filled with zeros" - it would be redundant.
>
> So that's great ... but I am curious, how do they know? I think my
> previous assumption (that the first _and_ the second snapshot file would
> _both_ have to record the change of file #2 from zero to random) was based
> on the notion that these snapshot files were totally autonomous and
> independent, and had no general organization behind them.
> If that was the case, then I am still fairly certain both snapshots would
> need to record the change of the second file.
>
> So what is the behind-the-scenes organization that makes it possible for
> the snapshot files to not duplicate data like that?
>
> ALSO,
>
> I have noticed that if you:
>
> - dd 1gig /dev/zero file
> - create snapshot
> - overwrite that 1gig file with /dev/random
>
> (free space decreases by 1gig, as expected)
>
> - rewrite that 1gig file with /dev/zero again
>
> You _don't_ get that 1gig of free space back ... which surprises me, since
> it was all zeros before, and it's all zeros now ... how does the snapshot
> know those are "different zeros"? And what ramifications does this have
> for restoring, etc., if identical files do not get counted as identical in
> the snapshot?
>
> thanks.
>

I just finished skimming an old paper by McKusick on Soft Updates:

http://www.usenix.org/publications/library/proceedings/usenix99/full_papers/mckusick/mckusick.pdf

This paper is dated 1999. Does anyone know if it accurately reflects how
soft updates and snapshots in FreeBSD 5.4 are implemented? If so, it
would answer the above questions.

-gayn
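
For anyone who wants to play with the arithmetic in the two experiments
quoted above, here is a minimal Python sketch of the "labels" picture
from earlier in the thread: one saved copy per overwritten block, tagged
with every snapshot that still needs it. This is a toy model and an
assumption on my part, not a description of the FFS code; ToyVolume,
two_snapshot_demo and rewrite_demo are made-up names, and one "block"
stands in for one 1gig file.

```python
class ToyVolume:
    """Toy copy-on-write volume. Sizes are in whole blocks (1 block ~ one 1gig file).

    Hypothetical model for illustration only; it is not derived from FFS source.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.live = {}        # block number -> current contents
        self.last_write = {}  # block number -> how many snapshots existed at its last write
        self.n_snapshots = 0
        self.saved = []       # (block number, old contents, set of snapshot ids needing it)

    def snapshot(self):
        """Take a snapshot; costs no space by itself in this model."""
        self.n_snapshots += 1
        return self.n_snapshots - 1

    def write(self, block, data):
        """Overwrite a block, saving the old contents once if any snapshot needs them."""
        if block in self.live:
            # Every snapshot taken since the block was last written still sees
            # the old contents, so one saved copy gets all of their labels.
            labels = set(range(self.last_write[block], self.n_snapshots))
            if labels:
                self.saved.append((block, self.live[block], labels))
        # Note: the decision is made per write, never by comparing contents.
        self.live[block] = data
        self.last_write[block] = self.n_snapshots

    def free(self):
        return self.capacity - len(self.live) - len(self.saved)


def two_snapshot_demo():
    """Four zero files, snapshot, overwrite one, snapshot, overwrite another."""
    vol = ToyVolume(capacity=10)
    for block in range(4):
        vol.write(block, "zeros")
    readings = [vol.free()]
    vol.snapshot()
    vol.write(0, "random")
    readings.append(vol.free())
    vol.snapshot()
    vol.write(1, "random")
    readings.append(vol.free())
    return readings, vol.saved[-1][2]


def rewrite_demo():
    """One zero file, snapshot, overwrite with random, then with zeros again."""
    vol = ToyVolume(capacity=10)
    vol.write(0, "zeros")
    readings = [vol.free()]
    vol.snapshot()
    vol.write(0, "random")
    readings.append(vol.free())
    vol.write(0, "zeros")
    readings.append(vol.free())
    return readings


if __name__ == "__main__":
    print(two_snapshot_demo())
    print(rewrite_demo())
```

In this model the second overwrite costs one block rather than two,
because the single saved copy carries the labels of both snapshots. The
rewrite with zeros gives nothing back, because the copy saved at the
first overwrite stays around for as long as the snapshot does and nothing
ever compares old and new contents. Whether FFS gets the same numbers by
the same bookkeeping is exactly the open question above.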