Date:      Thu, 16 Sep 2004 15:45:26 -0400
From:      David Schultz <das@FreeBSD.ORG>
To:        Frank Knobbe <frank@knobbe.us>
Cc:        freebsd-hackers@FreeBSD.ORG
Subject:   Re: ZFS
Message-ID:  <20040916194526.GA3364@VARK.homeunix.com>
In-Reply-To: <1095355201.530.14.camel@localhost>
References:  <41483C97.2030303@fer.hr> <Pine.LNX.4.60.0409151047230.21034@athena> <Pine.GSO.4.61.0409161010020.29724@mail.ilrt.bris.ac.uk> <Pine.GSO.4.61.0409161528520.29724@mail.ilrt.bris.ac.uk> <Pine.LNX.4.60.0409161040480.28550@athena> <20040916151216.GB29643@SDF.LONESTAR.ORG> <20040916162030.GK1047@empiric.icir.org> <1095355201.530.14.camel@localhost>

On Thu, Sep 16, 2004, Frank Knobbe wrote:
> On Thu, 2004-09-16 at 11:20, Bruce M Simpson wrote:
> > On Thu, Sep 16, 2004 at 11:12:16AM -0400, Kevin A. Pieckiel wrote:
> > > Where on earth would you find a disk system that can store 2^64 bytes of
> > > data or larger, anyway? 
> > 
> > You can bet that somebody, somewhere, needs this right now. And someone
> > will definitely need it in the next 5-10 years.
> 
> Naahh... there is No Such Application for it.  ;)

Actually, there are a number of parties---banks, governments,
geneticists, and Internet search engines, for instance---who
never seem to have enough storage.

I've seen lots of FUD and bad math on this thread, so let's do a
quick back-of-the-envelope calculation.  Hitachi and other storage
vendors already ship systems with on the order of 1 petabyte
(2^50B) of capacity.  That's 14 doublings away from 2^64
(2^64 / 2^50 = 2^14).  Storage capacity has increased at about
60% per year since 1991, so if history is any indicator[1],
capacity will continue to double roughly every 18 months
(1.6^1.5 is very nearly 2).  Ergo, 64-bit byte addresses won't
be enough in 14 * 1.5 = 21 more years.  (Other estimates are
even shorter.)  Filesystems live a long time; UFS is about two
decades old and still in use, so designing ZFS to outlast a
64-bit limit, as Sun did, is at least plausible on technical
grounds[2].  Moreover, the percentage of disk bandwidth that is
typically dedicated to updating filesystem metadata is small, so
the cost of the larger pointers is nominal.  Note that I'm not
arguing that 128-bit block numbers are the best choice; I'm merely
trying to convince you that they are a sensible idea.
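The back-of-the-envelope arithmetic above can be checked in a few
lines (a sketch; the 2004-era 1 PB starting point and the 60%/year
growth figure are the assumptions stated in the paragraph, not
measurements):

```python
# Sanity-check the doubling arithmetic from the paragraph above.
import math

current = 2**50   # ~1 PB, roughly the largest shipping systems (2004 assumption)
limit = 2**64     # what 64-bit byte addresses can span

doublings = math.log2(limit / current)                   # 14.0 exactly
months_per_doubling = math.log(2) / math.log(1.6) * 12   # ~17.7, i.e. roughly 18
years = doublings * months_per_doubling / 12             # ~20.6, i.e. roughly 21

print(doublings, months_per_doubling, years)
```

Note that the "18 months" and "21 years" in the text are rounded;
the raw numbers come out at about 17.7 months and 20.6 years.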

Anyway, out of all the features of ZFS, support for 128-bit block
numbers is among the least interesting, both from an engineering
perspective and from the user's perspective.  I don't know why
everyone is so eager to discuss them.  Much more interesting, for
instance, is the pooled storage model for volume management.
(Basically, you tell the system, ``I have a bunch of disks with
similar QoS characteristics, and I want N filesystems on top of
them.''  ZFS then dynamically shares the pool of storage among the
filesystems.  It's amazing how much trouble this saves.)
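The pooled-storage idea can be illustrated with a toy model (the
class and method names here are invented for illustration, not
ZFS's actual interfaces): every filesystem draws blocks from one
shared pool on demand, so no filesystem hits a wall while free
space remains anywhere in the pool.

```python
# Toy sketch of pooled storage: N filesystems share one pool of blocks,
# instead of each getting a fixed partition carved out in advance.

class Pool:
    def __init__(self, total_blocks):
        self.free = total_blocks

    def allocate(self, n):
        # Fail only when the *whole pool* is exhausted.
        if n > self.free:
            raise OSError("pool out of space")
        self.free -= n
        return n

class Filesystem:
    def __init__(self, pool):
        self.pool = pool   # shared with sibling filesystems, not a fixed slice
        self.used = 0

    def write(self, blocks):
        self.used += self.pool.allocate(blocks)

pool = Pool(total_blocks=1000)
fs_a, fs_b = Filesystem(pool), Filesystem(pool)
fs_a.write(900)   # fs_a may consume most of the pool...
fs_b.write(50)    # ...and fs_b still succeeds; no repartitioning needed
print(pool.free)  # 50
```

With fixed partitions, fs_a would have run out of space at its
partition boundary even though fs_b's partition sat mostly empty;
that is the trouble the pooled model saves.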


[1] The rate of increase is not very predictable in the short
    term.  It was pretty slow in the early '90s, then picked up
    with the introduction of GMR, and is now starting to slow
    down again.

[2] Of course, you can buy yourself another decade or so by using
    128-bit byte and sector addresses and 64-bit block addresses.
    UFS1 employs that strategy to squeeze as much as possible out
    of 32 bits, but the result isn't pretty.  And for various
    reasons, that trick isn't as helpful for ZFS.


