From owner-freebsd-hackers  Wed Mar  4 12:28:35 1998
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Received: (from majordom@localhost)
          by hub.freebsd.org (8.8.8/8.8.8) id MAA18171
          for freebsd-hackers-outgoing; Wed, 4 Mar 1998 12:28:35 -0800 (PST)
          (envelope-from owner-freebsd-hackers@FreeBSD.ORG)
Received: from sendero.simon-shapiro.org (sendero-fxp0.Simon-Shapiro.ORG [206.190.148.34])
          by hub.freebsd.org (8.8.8/8.8.8) with SMTP id MAA18012
          for <hackers@FreeBSD.ORG>; Wed, 4 Mar 1998 12:28:12 -0800 (PST)
          (envelope-from shimon@sendero-fxp0.simon-shapiro.org)
Received: (qmail 10025 invoked by uid 1000); 4 Mar 1998 20:34:56 -0000
Message-ID: <XFMail.980304123456.shimon@simon-shapiro.org>
X-Mailer: XFMail 1.3-alpha-021598 [p0] on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <199803041829.TAA01281@yedi.iaf.nl>
Date: Wed, 04 Mar 1998 12:34:56 -0800 (PST)
Reply-To: shimon@simon-shapiro.org
Organization: The Simon Shapiro Foundation
From: Simon Shapiro <shimon@simon-shapiro.org>
To: Wilko Bulte <wilko@yedi.iaf.nl>
Subject: Re: SCSI Bus redundancy...
Cc: grog@lemis.com, hackers@FreeBSD.ORG, blkirk@float.eli.net, jdn@acp.qiv.com,
        tlambert@primenet.com, sbabkin@dcn.att.com
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG


On 04-Mar-98 Wilko Bulte wrote:
 
...

>> Where does that leave kernel RAID?  I like controller level RAID
>> because:
>> 
>> a.  Much more flexible in packaging; I can use off-the-shelf disks in
>>     off-the-shelf cases if I choose to.
> 
> Assuming *good* drives, with *good* firmware. This is as you know not as
> obvious as it sounds ;-)

Of course not.  But moving the logic into the kernel will not solve it.  I
have always had good success with dedicated controllers.  The CMD box or a
DPT controller typically works very well.  The only disappointment (at the
time) was Mylex, but that has probably changed since.

>> b.  In the case of a DPT, you get better performance and better
>>     reliability, as I have three busses to spread the I/O across, and
>>     three busses to take fatal failures on.
> 
> Yep. Apart from that customer that had a 3 channel Mylex but used only
> one to attach drives to. Wanted to save on the hot-plug case for the
> drives. Well, never mind... You can guess what has happened. 3 channel is
> the bare minimum IMO.

The numbers are simple and can easily be derived from the SCSI specs
(Justin can correct me where I am off base here).  A SCSI bus is good for
400-600 TPS.  A drive is good for about that many, and for about 4-6MB/Sec;
the bus is not good for much more.  If you play with latencies, you arrive
at 4-6 drives per bus.  PCI-to-memory is good for 120MB/Sec on a sunny day,
and FreeBSD (UP) on a P6-200 is good for about 2,000 interrupts/Sec.  The
eventual throughput is derived from these numbers.
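
As a rough illustration of how those figures combine, here is a minimal
sketch.  It is not a measurement and not how any controller actually
schedules I/O.  The function names, the three-bus configuration, and the
one-interrupt-per-transaction ratio are assumptions made for the example;
only the per-bus, per-drive, PCI and interrupt figures come from the
paragraph above.

```python
# Back-of-envelope ceilings built from the figures quoted above.
# ASSUMPTIONS (not from the original text): each transaction costs one
# interrupt, buses and drives scale linearly, and the slowest stage
# sets the ceiling.

def transaction_ceiling(buses, tps_per_bus, interrupts_per_sec,
                        interrupts_per_txn=1):
    """Transactions/sec: limited by the SCSI buses or by interrupt handling."""
    bus_limit = buses * tps_per_bus
    cpu_limit = interrupts_per_sec // interrupts_per_txn
    return min(bus_limit, cpu_limit)


def bandwidth_ceiling(buses, drives_per_bus, mb_per_drive, pci_mb_per_sec):
    """MB/sec: limited by the aggregate drive rate or by PCI-to-memory."""
    drive_limit = buses * drives_per_bus * mb_per_drive
    return min(drive_limit, pci_mb_per_sec)


if __name__ == "__main__":
    # Low and high ends of the quoted ranges, on a hypothetical 3-bus setup.
    for label, tps, drives, mb in (("low", 400, 4, 4), ("high", 600, 6, 6)):
        print(label,
              transaction_ceiling(3, tps, 2000), "TPS,",
              bandwidth_ceiling(3, drives, mb, 120), "MB/Sec")
```

Under these assumptions, three busses land between 1,200 and 1,800 TPS and
between 48 and 108 MB/Sec, so neither the 2,000 interrupts/Sec nor the
120MB/Sec PCI figure is reached.  A fourth bus at the high end would hit
the interrupt ceiling first.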

 ...

> ? I don't quite follow you I think. We *still* do RAID to avoid service
> disruption.

Yes.  But service will be disrupted by O/S and application crashes many
times more often.  When disk packs were manually loaded, etc., RAID
actually contributed significantly to uptime.  Today we do it to reduce the
damage WHEN there is a failure, not so much to prevent the failure.

This is where my work in HA comes in.  It provides a ``CPU RAID'' at the
service level.  Traditional FT does it at the instruction level.  FreeBSD
is not a good candidate for that.  I also think that instruction level
redundancy is excessive for most applications FreeBSD is fit for.  But
having the service continually available can be a boon to its popularity in
certain circles.

I think we need to look in this direction, as NT is starting to offer some
such functionality, and we compete with NT.  Let Linux compete with
Win9{5,8}.  There is overlap between NT and W95.  There is (even
more) overlap between Linux and FreeBSD, but the ``market'' differentiation
is there nonetheless.


>> I think the focus changed from operational feature to insurance policy. 

> Like going bankrupt or collide in midair in case of an aircraft tracking
> system.

Yes. These two examples are very good.  They are all about recovery time.
Computers fail.  A true FT system will detect and correct the failure at
the instruction level (almost or exactly).  This is crucial for the control
surfaces in a fly-by-wire airplane.  A financial transaction can tolerate a
lapse in service of a second, or several seconds, during error
detection/correction, as long as the logical database is still coherent.
My HA model guarantees the second and does absolutely nothing for the first.

Simon

