Date: Thu, 22 Oct 1998 16:04:59 -0700
From: Mike Smith <mike@smith.net.au>
To: "Hallam Oaks" <mlnn4@oaks.com.au>
Cc: "freebsd-chat@FreeBSD.ORG" <freebsd-hackers@FreeBSD.ORG>
Subject: Re: Multi-terabyte disk farm
Message-ID: <199810222304.QAA01620@dingo.cdrom.com>
In-Reply-To: Your message of "Thu, 22 Oct 1998 23:19:28 +1000." <199810221320.XAA07482@mail.aussie.org>

> Prior to my arrival two weeks ago, they had been planning to use a
> StorageTek Timberwolf library. This box has two DLT drives, a robot arm,
> and about 8 terabytes worth of tape slots.
>
> The requirement is for 24/7 availability, but speed is not an issue.
> Provided a dub can be retrieved onto a cache disk within an hour or so of
> it being requested from archive they'd be happy.

Latency is less of an issue than throughput; I presume the worst-case
performance of this solution was taken into account? (load/unload for
every dub, seek time, etc.)

> With current hard drive prices, I estimate we can put together a one
> terabyte disk farm for about US$60k (cost of media only), spread across
> several machines using hot swappable drive bays and dual SCSI buses per
> machine. We don't intend to use RAID unless there's strong advantages to
> it (I don't know a whole lot about it). We don't need striping or
> replication.

I would be inclined to go with RAID-4 or -5, in order to deal with disk
failures in a relatively sensible fashion. It sounds like throughput is
actually not very significant, so you could get away with fairly
low-end RAID controllers in a low-cost lower-performance configuration.

eg. one wide SCSI bus with 15 CMD CRD-5440's gives you 45 SCSI busses
and 675 disks. Using 18GB SCSI disks this takes you to 12TB raw
capacity, probably a bit under 10TB usable. You may run into cabling
density problems, and the single core SCSI bus does give you a single
point of failure, but it should give you some feel for the density
that's achievable.

> One advantage of doing it via a distributed farm (I theorise) is that if
> one drive fries or one machine self-destructs, at least the rest of the
> system will still be working. A fried drive can be restored from one of
> the 25gb AIT backup tapes made of all the dubs.

This is actually where using RAID controllers and hot-swap disk arrays
will save you enormously; you get notification of disk failures but all
you have to do is pull the dead one and put a replacement in, like
changing a lightbulb.

> Secondly (and this is a major call I'm making), it won't work out cheaper
> for our estimated need of three terabytes unless the cost of HDD's keep on
> falling. We won't need full capacity until about two years has passed,
> meaning that we can start out with only a few hundred gig and scale it as
> we go, taking advantage of (hopefully) falling disk prices and increasing
> drive sizes.

Because bus space is relatively cheap, you can start with smaller
drives which are closer to the optimal point on the price/performance
curve.

> My desire is to push for the farm because I believe it's better. (Going
> with the TW would actually be more profitable for me since one of the main
> reasons they hired me was to write the software to drive the flipping
> thing:).

The farm approach using RAID is more reliable. It would also scale
much better, and because individual components are easily replaceable
(and cheap) your cost of maintenance is likely to be lower. For the
TW, you have to factor the maintenance contract and (depending on your
downtime profile) having a complete hot spare plus a set of hot spare
tapes.

> Needless to say I'm going to put my preferred solution as FreeBSD-based.
> Some of the criteria that I can't yet answer and would like feedback on
> are these -
>
> o is FreeBSD able to be made to recognise a new SCSI drive that wasn't
>   present on boot ? i.e. a new drive is plugged into the hot bays. can
>   it be recognised, formatted, and mounted by manual intervention ?

Yes; if you were using eg. the CMD controllers you could add new disks
to build a new array, then bring it online without taking the system
down.

> o ditto if a drive fries. can it be taken out without the kernel getting
>   too upset ?

If it held a mounted filesystem, no. This is a major argument for
using a RAID solution either in software (eg. vinum) or hardware.

> o is it feasable to automatically umount and spin down drives that
>   haven't been accessed for a day or so ? typically, the older data
>   (> 6 months) will be rarely, if ever, accessed before its two-year
>   span expires and it's erased.

You can use the automounter to automatically mount/unmount filesystems
as they're accessed. You'd want to talk to the RAID controller vendor
about whether their controller will spin idle disks down (you may have
to explicitly command the controller to spin them down).

> o would the boot time of a system be dramatically prolonged by it having
>   500 or so gigabytes of SCSI drives hanging off its backside ? (I'm
>   referring to a normal boot, with the drives having been properly
>   unmounted. I don't even want to THINK about waiting on an fsck of 500
>   gigs of unclean disk ;). OTOH the size of the files is quite large, so
>   it'd be feasable to use huge nodes.

You would typically mount the filesystems read-only, so they'd never
get dirty, so this isn't really an issue. Using large nodes wouldn't
be much of a space saving really.

>   I've no particular objection to a reboot if need be to add/remove a
>   drive, but if it took more than, say, 10 minutes, it'd be an issue I'd
>   have to tackle with management.
>
> o we're thinking of using Seagate Elite 47gb drives. These are 5400 RPM
>   units (speed isn't an issue to us). Does anyone have any opinions
>   about these (good/bad/indifferent) or of previous members of that
>   drive family ?

Don't. If speed isn't an issue, go for 5400rpm drives; these are also
cheaper. Right now, 5400rpm and around 4GB seems to be the price point.

> o does anyone have an opinion as to whether it's safe to assume that
>   drive prices will continue to fall as they have done over the past
>   two years ?

Has anyone ever predicted anything safely? 8)

Good luck; let us know how you get on!

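PS. For anyone who wants to check the density figures in the CMD
CRD-5440 example, here is a rough sketch of the arithmetic. The 3 busses
per controller and 15 disks per bus are inferred from the 45-bus and
675-disk totals; the RAID-5 group size of 5 disks is purely an
assumption for illustration (it is not stated anywhere above), and
farm_capacity is a made-up name.

```python
def farm_capacity(controllers=15, buses_per_controller=3,
                  disks_per_bus=15, disk_gb=18, raid_group_size=5):
    """Estimate raw and usable capacity of a RAID-5 disk farm.

    raid_group_size is an assumed value: each RAID-5 group gives up
    one disk's worth of space to parity.
    """
    buses = controllers * buses_per_controller
    disks = buses * disks_per_bus
    raw_gb = disks * disk_gb
    # Only whole RAID groups contribute usable space.
    groups = disks // raid_group_size
    usable_gb = groups * (raid_group_size - 1) * disk_gb
    return {"buses": buses, "disks": disks,
            "raw_gb": raw_gb, "usable_gb": usable_gb}


if __name__ == "__main__":
    for name, value in farm_capacity().items():
        print(name, value)
```

With a different group size the usable figure moves accordingly; larger
groups lose less space to parity but take longer to rebuild.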
-- 
\\  Sometimes you're ahead,       \\  Mike Smith
\\  sometimes you're behind.      \\  mike@smith.net.au
\\  The race is long, and in the  \\  msmith@freebsd.org
\\  end it's only with yourself.  \\  msmith@cdrom.com

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message
