From: Karl Denninger
Date: Tue, 3 Mar 1998 18:31:01 -0600
To: shimon@simon-shapiro.org
Cc: Wilko Bulte, sbabkin@dcn.att.com, tlambert@primenet.com, jdn@acp.qiv.com, blkirk@float.eli.net, hackers@FreeBSD.ORG, grog@lemis.com
Subject: Re: SCSI Bus redundancy...
Message-ID: <19980303183101.05201@mcs.net>
References: <199803032155.WAA04054@yedi.iaf.nl>

On Tue, Mar 03, 1998 at 04:23:24PM -0800, Simon Shapiro wrote:
> I think the focus has to change:
>
> * We used to do RAID to protect from hardware failure disrupting service.
>   In the face of O/S and firmware volatility and bugginess, this is absurd;
>   As I said, I am using DPT controllers for ALL my storage, and have yet to
>   lose a byte to disk failure (unless I use WD or certain Micropolis
>   models).
>
> * I think RAID is only important to protect us from the damage WHEN the
>   failure occurs.
>
> I think the focus changed from operational feature to insurance policy.
> Risk management is something not too many of us are any good at (count the
> number of times you/I/we delivered a project on time).
>
> What does it all mean?  I dunno.  I leave it to the scientists to ponder.

My CMD RAID adapters have saved my nuts twice in the last month.

In both cases there was a non-recoverable, hard sector error on a 9G drive.
Without parity I would have lost something. With the RAID5 in place I lost
nothing, other than the time to pull the pack, replace it, and set the new
disk to "warm spare" (the system had already started the rebuild onto the
existing spare).

Lose 36GB all the way back to your last full + incremental dump (at least a
day's worth of revisions) across 10,000 customers and tell me what happens
to your head when they get done with you.

The problem isn't even necessarily the data loss - it's the restore time. A
9G drive takes a shitload of time to reload from even the fastest DLT drive.

We still run tapes nightly for incrementals, and weekly for full dumps - but
they are more for the "aw shit" user-induced stupidity (like the infamous
"rm -rf *") than for hardware coverage. The pain of a restore across disks
of this size is just too darn big.

This is, by the way, one of the reasons I used to favor lots of 1G drives
and filesystems - they can be restored in an hour or so if one fails. With a
9G drive, even the newest and fastest ones, and the best tape devices,
you're looking at a multi-hour outage.

--
Karl Denninger (karl@MCS.Net)| MCSNet - Serving Chicagoland and Wisconsin
http://www.mcs.net/          | T1's from $600 monthly to FULL DS-3 Service
                             | NEW! K56Flex support on ALL modems
Voice: [+1 312 803-MCS1 x219]| EXCLUSIVE NEW FEATURE ON ALL PERSONAL ACCOUNTS
Fax:   [+1 312 803-4929]     | *SPAMBLOCK* Technology now included at no cost