From owner-freebsd-hackers  Tue Mar  3 17:03:57 1998
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Received: (from majordom@localhost)
          by hub.freebsd.org (8.8.8/8.8.8) id RAA09479
          for freebsd-hackers-outgoing; Tue, 3 Mar 1998 17:03:57 -0800 (PST)
          (envelope-from owner-freebsd-hackers@FreeBSD.ORG)
Received: from sendero.simon-shapiro.org (sendero-fxp0.Simon-Shapiro.ORG [206.190.148.34])
          by hub.freebsd.org (8.8.8/8.8.8) with SMTP id RAA09448
          for <hackers@freebsd.org>; Tue, 3 Mar 1998 17:03:16 -0800 (PST)
          (envelope-from shimon@sendero-fxp0.simon-shapiro.org)
Received: (qmail 20888 invoked by uid 1000); 4 Mar 1998 01:09:54 -0000
Message-ID: <XFMail.980303170953.shimon@simon-shapiro.org>
X-Mailer: XFMail 1.3-alpha-021598 [p0] on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <19980303183101.05201@mcs.net>
Date: Tue, 03 Mar 1998 17:09:53 -0800 (PST)
Reply-To: shimon@simon-shapiro.org
Organization: The Simon Shapiro Foundation
From: Simon Shapiro <shimon@simon-shapiro.org>
To: Karl Denninger <karl@mcs.net>
Subject: Re: SCSI Bus redundancy...
Cc: grog@lemis.com, hackers@FreeBSD.ORG, blkirk@float.eli.net, jdn@acp.qiv.com,
        tlambert@primenet.com, sbabkin@dcn.att.com,
        Wilko Bulte <wilko@yedi.iaf.nl>
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG


On 04-Mar-98 Karl Denninger wrote:
 
...
> My CMD RAID adapters have saved my nuts twice in the last month.
> 
> In both cases there was a non-recoverable, hard sector error on a 9G
> drive.  Without parity I would have lost something.  With the RAID5 in
> place I lost nothing, other than the time to pull the pack, replace it,
> and set the new disk to "warm spare" (the system had already started the
> rebuild onto the existing spare).

This is what a RAID controller should do.  Any less, junk it.
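
To make the parity point concrete, here is a minimal sketch of how a
missing block can be rebuilt from the surviving blocks of one stripe.
It is an illustration only, not how the CMD adapters are implemented:
the block contents and the name xor_blocks are made up for the example,
and a real RAID5 controller also rotates the parity block across drives.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe spread over three data drives (hypothetical contents).
data = [b"AAAA", b"BBBB", b"CCCC"]

# The parity block is the XOR of all data blocks in the stripe.
parity = xor_blocks(data)

# Pretend the drive holding data[1] hits a hard sector error.
survivors = [data[0], data[2], parity]

# XOR of the surviving blocks (data plus parity) yields the lost block.
rebuilt = xor_blocks(survivors)
assert rebuilt == data[1]
```

The same XOR that produces the parity also recovers the lost block,
which is why a single failed drive costs only the rebuild time.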

> Lose 36GB all the way back to your last full + incremental dump (at least
> a day's worth of revisions) across 10,000 customers and tell me what
> happens to your head when they get done with you.

This is a small database.  I had to deal with customers with 3,000 drives
per system and was told there are larger ones.

> The problem isn't even necessarily the data loss - it's the restore time.
> A 9G drive takes a shitload of time to reload from even the fastest DLT
> drive.
> 
> We still run tapes nightly for incrementals, and weekly for full dumps -
> but they are more for the "aw shit" user-induced stupidity (like the
> infamous "rm -rf *") rather than hardware coverage.  The pain of a
> restore across disks of this size is just too darn big.

I wrote a white paper at Oracle some years ago, claiming that databases
over a certain size simply cannot be backed up.  I became very UN-popular
very quickly.  In your moderate setup, you already see the proof of
correctness.

This is why most MIS types shiver when they hear about databases on Unix
filesystems.  All you need is a crash and fsck in a bad mood.  If you are
lucky, the entire database is gone.  If you are unlucky, a block will
disappear from somewhere in the middle, and you will find out a week later.
By then the backup is literally useless.

> This is, by the way, one of the reasons I used to favor lots of 1G drives
> and filesystems - they can be restored in an hour or so if one fails.
> With a 9G drive, even the newest and fastest ones, and the best tape
> devices, you're looking at a multi-hour outage.

True.  Your performance also goes up with the smaller drives.  You can
stripe better.  I think I mentioned it before in this forum: most DBMS
benchmarks only use 300MB of the disk.  This is sort of the ``sweet spot''
between system cost and performance.
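
The restore-time argument quoted above can be put into rough numbers.
The sketch below assumes the effective restore rate implied by "a 1G
drive in an hour or so", and assumes that restore time scales linearly
with capacity.  Both are assumptions made for illustration; a faster
tape drive or a different filesystem mix would change the rate.  The
names restore_hours and RATE_GB_PER_HOUR are hypothetical.

```python
def restore_hours(capacity_gb, gb_per_hour):
    """Estimate restore time, assuming a constant effective restore rate."""
    if gb_per_hour <= 0:
        raise ValueError("restore rate must be positive")
    return capacity_gb / gb_per_hour


# Assumed effective rate, taken from "restored in an hour or so" for 1G.
RATE_GB_PER_HOUR = 1.0

# Sizes mentioned in the thread: 1G drives, 9G drives, and 36GB in total.
sizes_gb = {"1G drive": 1, "9G drive": 9, "36GB of data": 36}

estimates = {label: restore_hours(size, RATE_GB_PER_HOUR)
             for label, size in sizes_gb.items()}

for label, hours in estimates.items():
    print(f"{label}: about {hours:.1f} hours at {RATE_GB_PER_HOUR} GB/hour")
```

Whatever rate is plugged in, the outage grows with the size of the unit
that has to be reloaded, which is the case for many small drives.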


----------


Sincerely Yours, 

Simon Shapiro
Shimon@Simon-Shapiro.ORG                      Voice:   503.799.2313
