Date:      Mon, 5 Feb 2001 12:51:26 -0500 (EST)
From:      Mitch Collinsworth <mitch@ccmr.cornell.edu>
To:        "Michael C . Wu" <keichii@peorth.iteration.net>
Cc:        hackers@FreeBSD.ORG, fs@FreeBSD.ORG
Subject:   Re: Extremely large (70TB) File system/server planning
Message-ID:  <Pine.LNX.4.10.10102051238190.22516-100000@ruby.ccmr.cornell.edu>
In-Reply-To: <20010205112420.A98288@peorth.iteration.net>


On Mon, 5 Feb 2001, Michael C . Wu wrote:

> On Mon, Feb 05, 2001 at 11:47:58AM -0500, Mitch Collinsworth scribbled:
> | On Mon, 5 Feb 2001, Michael C . Wu wrote:
> | > On Mon, Feb 05, 2001 at 10:39:02AM -0500, Mitch Collinsworth scribbled:
> | > | You didn't say what applications this thing is going to support.
> | > | That does matter.  A lot.  One thing worth looking at is AFS,
> | > | or maybe MR-AFS.  And now OpenAFS.
> | > 
> | > He has database(s) of graphics simulation results. i.e. large files that
> | > are largely unrelated to each other.  Compression is not an option.
> | > 
> | > The files are accessed approximately 3 or 4 times a day on average.
> | > Older files are archived for reference purpose and may never
> | > be accessed after a week.
> | 
> | Ok, this is a start.  Now is the 70 TB the size of the active files?
> | Or does that also include the older archived files that may never be
> | accessed again?
> 70TB is the size of the sum of all files, access or no access.
> (They still want to maintain accessibility even though the chances are slim.)

Ok, well the next question to look at is how they define "maintain
accessibility".  In other words, what do they consider acceptable?
Accessible in 5 seconds, accessible in 1 minute, accessible in 10
minutes, accessible in 1 hour, accessible overnight?

70 TB, as you have already noticed, is no simple feat to accomplish.
No matter how you slice it, it's going to cost $$.  Different
accessibility requirements for the archived data can be met with
different technologies, at different costs.

You could rough out a plan for keeping the whole thing online and
spinning for instant access, and then compare the cost of that with
various options that keep the hot data online and archive the rest
in ways that allow for differing speeds of access.  Maybe you can
archive old data on CDs or tapes.  Perhaps keep the more recent
archives "online" in a jukebox where they are fairly quickly
accessible, while older archives sit on a rack where someone has to
retrieve them as needed.
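
To make that comparison concrete, here is a rough sketch of the
arithmetic.  Only the 70 TB total comes from this thread.  The per-TB
costs, the tier sizes, and the function names are made-up placeholders
for illustration, so plug in real vendor quotes and real usage numbers
before drawing any conclusions from it.

```python
# Rough cost comparison: everything online vs. hot data online plus archive tiers.
# Every price and tier size here is a hypothetical placeholder, not a real quote.

TOTAL_TB = 70  # total data size, from the thread

# Assumed per-TB costs in arbitrary units; replace with actual quotes.
DISK_PER_TB = 10.0     # online, spinning disk
JUKEBOX_PER_TB = 4.0   # near-line media in a jukebox
SHELF_PER_TB = 1.0     # offline media on a rack, retrieved by hand


def plan_cost(tiers):
    """Return the total cost of a list of (terabytes, cost_per_tb) tiers."""
    return sum(tb * cost_per_tb for tb, cost_per_tb in tiers)


def all_online(total_tb=TOTAL_TB):
    """Cost of keeping the whole archive on spinning disk."""
    return plan_cost([(total_tb, DISK_PER_TB)])


def tiered(hot_tb, nearline_tb, total_tb=TOTAL_TB):
    """Cost of hot data on disk, recent archives in a jukebox, the rest on a shelf."""
    shelf_tb = total_tb - hot_tb - nearline_tb
    if shelf_tb < 0:
        raise ValueError("hot and near-line tiers exceed the total size")
    return plan_cost([
        (hot_tb, DISK_PER_TB),
        (nearline_tb, JUKEBOX_PER_TB),
        (shelf_tb, SHELF_PER_TB),
    ])


if __name__ == "__main__":
    # The 5 TB hot / 15 TB near-line split is an assumption, not from the thread.
    print("all online:", all_online())
    print("tiered:    ", tiered(hot_tb=5, nearline_tb=15))
```

This only covers media cost.  Staff time, hardware refresh, and the
cost of waiting for a retrieval would each need their own line before
the totals mean much.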

The real question here is: are they really willing to spend what it
would take to keep an archive of this size spinning, including
systems programmers and administrators?  Or are they willing to
spend less and have it take a bit longer to get access to the older
data?

-Mitch






