From: Mark Gregory Salyzyn
Date: Sun, 31 May 98 10:28:02 -0400
To: freebsd-scsi@FreeBSD.ORG, shimon@simon-shapiro.org, tcobb@staff.circle.net
Subject: Re: DPT Redux

Troy Cobb writes:

> With RAID-5 and a new drive to rebuild on, the DPT hardware begins automatic rebuilds of the
> array. However, in these conditions the DPT driver (or other FreeBSD component) does not
> correctly sense the size information and panics the kernel during bootup. This symptom goes
> away after the rebuild is complete. This symptom does not appear when in DOS under the
> same circumstances. DOS DPTmgr checks show the array of the correct size. BIOS bootup
> screen for DPT shows the array of the correct size.

There is not supposed to be any difference in command access to the DPT controller whether the array is optimal, degraded or rebuilding. However, the following realities exist:

1) The Logical Array Status page will indicate the appropriate status of the array and its components. This goes without saying, and has no operational effect.

2) Accesses to a degraded or rebuilding array are significantly slower.

3) A controller performing a rebuild uses some of the available CCBs to perform its duties, *possibly* exhausting the available CCBs.

I believe that Simon's driver works under condition 2, simply because the driver already functions under the stress of a degraded array. Rebuilding, if the user has selected a particularly aggressive schedule, is more extreme and may cause timeouts. However, a ReadCapacity will not fail under that condition, as it is not delayed, and I believe a failed ReadCapacity SCSI command is the reason for the problem.

Condition 3, I believe, is the more likely cause. It is a direct result of the number of components within the array and a possible driver assumption that 64 CCBs are always available for commands under this condition. Externally, the DPT controller fibs and indicates that there are 64 concurrent CCBs available; internally, the controller has 128 CCBs with which to process commands. This allows a single host command to be spread across several internal SCSI commands to the components of the array, and it also buffers a rebuild, which consumes these (possibly scarce) CCB resources.
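To make condition 3 concrete, here is a rough driver-side sketch of what I mean. This is not the actual dpt driver or firmware interface; the structure, the helper routine and the 64-CCB cap are stand-ins. The point is only that the driver should count the CCBs it has outstanding and requeue when the advertised pool is in use, rather than assuming a free CCB always exists:

#include <errno.h>

#define DPT_ADVERTISED_CCBS  64     /* what the controller claims externally */

struct dpt_ccb;                     /* opaque here; the real driver defines it */

struct dpt_softc {
    int sc_outstanding;             /* CCBs currently held by the board */
};

/* Stand-in for the real EATA command submission routine. */
static void
dpt_send_command(struct dpt_softc *sc, struct dpt_ccb *ccb)
{
    (void)sc; (void)ccb;
}

/* Returns 0 if issued, EBUSY if the caller should requeue and retry later. */
static int
dpt_try_issue(struct dpt_softc *sc, struct dpt_ccb *ccb)
{
    if (sc->sc_outstanding >= DPT_ADVERTISED_CCBS)
        return (EBUSY);             /* pool exhausted: requeue, do not fail */
    sc->sc_outstanding++;
    dpt_send_command(sc, ccb);
    return (0);
}

/* Called from the interrupt handler when the controller completes a CCB. */
static void
dpt_ccb_done(struct dpt_softc *sc, struct dpt_ccb *ccb)
{
    (void)ccb;
    sc->sc_outstanding--;
    /* a good place to reissue anything that got EBUSY above */
}

Whether 64 or something smaller is the right cap during a rebuild I cannot say offhand; the requeue path is the important part.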
Lacking a better theory, I'd say these CCBs are being used up by the combination of an aggressive rebuild, a small amount of controller-board cache and concurrent RAID operations. The driver is affected because it is apparently prevented from issuing a CCB while the board reports `busy' for far too long a period.

I suggest the following:

1) Troy Cobb: please adjust the various rebuild parameters for a considerably less aggressive RAID rebuild. Yes, it will take longer, but then your seven-drive RAID array will not exhaust the CCB resources of the controller, and the system will not lock up waiting for the controller to stop being busy. An aggressive schedule (the defaults may be inappropriate for a seven-drive RAID-5) could even lock out all access to the controller, but that does not appear to be the case here, since Storage Manager's single-command-at-a-time, I-can-wait-forever-to-get-a-command-to-the-controller approach still works. I do not have dptmgr handy to point at the exact parameters you need to adjust, but I believe the drive cache parameters are involved as well.

I question the rapid succession of drive failures you have experienced. Is it possible that your SCSI bus is *now* having trouble and is interfering with the proper operation of the rebuild, which uses the SCSI bus heavily? You may have had incorrect drive termination, and aging of *all* the components may have moved you from working to marginally working. Are there any untoward kinks or nicks newly added to the SCSI cable? Or is the SCSI cable to the seven drives perhaps a bit too long? DPT recommends a SCSI backplane rather than cables when there are more than four drives.

How much cache is on the DPT controller board? This affects background (rebuild and parity write-back) accesses to the SCSI devices and may be limiting the performance of the system enough to make the driver's and the OS's life miserable.

Would you consider dividing the RAID-5 into two arrays (I know, capacity robbing), or splitting the drives across two SCSI buses (three on bus 0 where you have the RAID-0, and four on bus 1 where you have the hot spare), to improve performance by dividing the SCSI activity between the buses?

2) Simon: you may want to consider what happens when the controller indicates busy. Do you time out on the busy bit of the auxiliary status register, and if so, what do you report to the OS (a failed command? spawn a separate command-issuing thread to try again later? spin forever waiting for ready?). The BSDi BSD/OS driver, for example, simply `locks', waiting for the controller to come out of busy, which is the simplest way to handle what should be a transitory situation. You may also wish to limit the number of commands outstanding to the controller (the UnixWare driver uses the lock-on-wait approach plus a 32-CCB limit to reduce the chance of this problem affecting performance). The highest-performance DPT driver, the one for a networking operating system (NetWare), takes the `spawn an issuing task' approach so that network card interrupts can still be processed while waiting for the controller to come free. That may be your best approach, considering you will no doubt be issuing the `next' command to the controller from within the controller's interrupt service routine. My assumption is that you time out and send a failure up to the OS, which may explain the 0MB read-capacity result shown in the log.
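On the busy-bit question, a minimal sketch of the bounded wait I have in mind follows. The register offset, the bit name and the dpt_inb helper are invented for illustration (check the EATA documentation for the real auxiliary status register layout), and DELAY() is the kernel's microsecond spin. The essential point is that the wait is bounded, and that running out the clock ends in a requeue rather than in a fabricated command failure, which is exactly the kind of failure that could surface upstream as a 0MB ReadCapacity:

#include <errno.h>
#include <sys/types.h>

#define HA_AUX_STATUS   0x08        /* hypothetical register offset */
#define HA_AUX_BUSY     0x01        /* hypothetical busy bit in that register */
#define BUSY_SPIN_USEC  10000       /* wait up to ~10ms before giving up */

struct dpt_softc;                   /* as in the earlier sketch */

/* Hypothetical register-read helper; stands in for the real inb() wrapper. */
static u_int8_t dpt_inb(struct dpt_softc *sc, int reg);

/* Returns 0 when the controller is ready, EBUSY if the wait expired. */
static int
dpt_wait_not_busy(struct dpt_softc *sc)
{
    int i;

    for (i = 0; i < BUSY_SPIN_USEC; i++) {
        if ((dpt_inb(sc, HA_AUX_STATUS) & HA_AUX_BUSY) == 0)
            return (0);             /* controller ready; go ahead and issue */
        DELAY(1);                   /* kernel DELAY(): spin one microsecond */
    }
    return (EBUSY);                 /* still busy: requeue the command later,
                                       do NOT complete it as failed */
}

Whether ~10ms is the right bound is a tuning question; the point is that the caller sees EBUSY and tries again rather than reporting a failure it never actually received from the controller.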
I hope this helps.

Sincerely,
Mark Salyzyn