From owner-freebsd-hackers Mon Feb 5 9:51:52 2001
Delivered-To: freebsd-hackers@freebsd.org
Received: from mercury.ccmr.cornell.edu (mercury.ccmr.cornell.edu [128.84.231.97])
	by hub.freebsd.org (Postfix) with ESMTP id C141B37B684;
	Mon, 5 Feb 2001 09:51:28 -0800 (PST)
Received: from ruby.ccmr.cornell.edu (IDENT:0@ruby.ccmr.cornell.edu [128.84.231.115])
	by mercury.ccmr.cornell.edu (8.9.3/8.9.3) with ESMTP id MAA17009;
	Mon, 5 Feb 2001 12:51:28 -0500
Received: from localhost (mitch@localhost)
	by ruby.ccmr.cornell.edu (8.9.3/8.9.3) with ESMTP id MAA06978;
	Mon, 5 Feb 2001 12:51:26 -0500
X-Authentication-Warning: ruby.ccmr.cornell.edu: mitch owned process doing -bs
Date: Mon, 5 Feb 2001 12:51:26 -0500 (EST)
From: Mitch Collinsworth
To: "Michael C . Wu"
Cc: hackers@FreeBSD.ORG, fs@FreeBSD.ORG
Subject: Re: Extremely large (70TB) File system/server planning
In-Reply-To: <20010205112420.A98288@peorth.iteration.net>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Mon, 5 Feb 2001, Michael C . Wu wrote:

> On Mon, Feb 05, 2001 at 11:47:58AM -0500, Mitch Collinsworth scribbled:
> | On Mon, 5 Feb 2001, Michael C . Wu wrote:
> | > On Mon, Feb 05, 2001 at 10:39:02AM -0500, Mitch Collinsworth scribbled:
> | > | You didn't say what applications this thing is going to support.
> | > | That does matter. A lot. One thing worth looking at is AFS,
> | > | or maybe MR-AFS. And now OpenAFS.
> | >
> | > He has database(s) of graphics simulation results. i.e. large files that
> | > are largely unrelated to each other. Compression is not an option.
> | >
> | > The files are accessed approximately 3 or 4 times a day on average.
> | > Older files are archived for reference purpose and may never
> | > be accessed after a week.
> |
> | Ok, this is a start. Now is the 70 TB the size of the active files?
> | Or does that also include the older archived files that may never be
> | accessed again?
>
> 70TB is the size of the sum of all files, access or no access.
> (They still want to maintain accessibility even though the chances are slim.)

Ok, well the next question to look at is how they define "maintain
accessibility". In other words, what do they consider acceptable?
Accessible in 5 seconds, accessible in 1 minute, accessible in 10
minutes, accessible in 1 hour, accessible overnight?

70 TB, as you have already noticed, is no simple feat to accomplish.
No matter how you slice it, it's going to cost $$. Different levels of
accessibility requirement for the archived data can be accomplished
with differing technologies and at differing costs.

You could rough out a plan for keeping the whole thing online and
spinning for instant access, and then compare the cost of that with
various options that keep the hot data online and archive the rest in
ways that allow for differing speeds of access. Maybe you can archive
old data on CDs or tapes. Perhaps keep more recent archives "online" in
a jukebox where they are fairly quickly accessible, while older
archives are on a rack where someone has to retrieve them as needed.

The real question here is: are they really willing to spend what it
would take to keep an archive of this size spinning, including systems
programmers and administrators? Or are they willing to spend less and
have it take a bit longer to get access to the older data?

-Mitch
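
The comparison suggested above (everything online versus hot data
online with the rest archived) can be roughed out in a few lines. This
is only a sketch: the tier names follow the message, but the tier
splits, the per-TB costs, and the `Tier` and `plan_cost` names are all
hypothetical placeholders, not real prices or anything from the thread.
Substitute actual quotes and staffing estimates before drawing any
conclusion from it.

```python
# Rough cost comparison of storage plans for a fixed total data size.
# ALL per-TB costs and tier fractions below are hypothetical placeholders,
# not real prices; replace them with vendor quotes and admin estimates.
from dataclasses import dataclass

TOTAL_TB = 70  # total size of all files, as stated in the thread


@dataclass
class Tier:
    name: str
    fraction: float      # share of the total data held in this tier
    cost_per_tb: float   # hypothetical cost per TB (hardware plus staff)
    access_time: str     # rough time needed to reach the data


def plan_cost(tiers):
    """Return the total cost of a plan; tier fractions must sum to 1."""
    total_fraction = sum(t.fraction for t in tiers)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("tier fractions must sum to 1")
    return sum(TOTAL_TB * t.fraction * t.cost_per_tb for t in tiers)


# Plan A: keep the whole thing online and spinning.
all_online = [Tier("disk", 1.0, 100.0, "seconds")]

# Plan B: hot data on disk, recent archives in a jukebox,
# older archives on a rack where someone retrieves them by hand.
tiered = [
    Tier("disk", 0.2, 100.0, "seconds"),
    Tier("jukebox", 0.3, 40.0, "minutes"),
    Tier("rack", 0.5, 10.0, "hours"),
]

for label, plan in (("all online", all_online), ("tiered", tiered)):
    slowest = plan[-1].access_time
    print(f"{label}: {plan_cost(plan):.0f} cost units, slowest access in {slowest}")
```

The point of structuring it this way is that the acceptable access time
for archived data becomes an explicit input: once they say what delay
they can live with, tiers slower than that drop out and the remaining
plans can be compared on cost.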