From: Mark Gregory Salyzyn
Date: Sun, 31 May 98 10:28:02 -0400
To: freebsd-scsi@FreeBSD.ORG, shimon@simon-shapiro.org, tcobb@staff.circle.net
Subject: Re: DPT Redux

Troy Cobb writes:

> With RAID-5 and a new drive to rebuild on, the DPT hardware begins automatic rebuilds of the
> array. However, in these conditions the DPT driver (or other FreeBSD component) does not
> correctly sense the size information and panics the kernel during bootup. This symptom goes
> away after the rebuild is complete. This symptom does not appear when in DOS under the
> same circumstances. DOS DPTmgr checks show the array of the correct size. BIOS bootup
> screen for DPT shows the array of the correct size.

There is not supposed to be any difference in command access to the DPT controller whether the array is optimal, degraded or rebuilding. However, the following realities exist:

1) The Logical Array Status page will indicate the appropriate status of the array and its components. This goes without saying, and has no operational effect.

2) Accesses to a degraded or rebuilding array are significantly slower.

3) A controller performing a rebuild uses some of the available CCBs to perform its duties, *possibly* exhausting the available CCBs.

I believe that Simon's driver works under condition 2, simply because the driver already functions under the stress of a degraded array. Rebuilding, if the user has selected a particularly aggressive schedule, is more extreme and may cause timeouts. However, a ReadCapacity will not fail under that condition, as it is not delayed, and I believe a failed ReadCapacity SCSI command is the reason for the problem.

Condition 3, I believe, is the more likely cause. It is a direct result of the number of components within the array and a possible driver assumption that 64 CCBs are always available for commands under this condition. Externally, the DPT controller fibs and indicates that there are 64 concurrent CCBs available; internally, the controller has 128 CCBs with which to process commands. This allows a single host command to be spread across several internal SCSI commands to the components of the array, and it also buffers a rebuild, which consumes these (possibly scarce) CCB resources.
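To make condition 3 concrete, here is a rough driver-side sketch of what I mean. This is not the actual dpt driver or firmware interface; the structure, the helper routine and the 64-CCB cap are stand-ins. The point is only that the driver should count the CCBs it has outstanding and requeue when the advertised pool is in use, rather than assuming a free CCB always exists:

#include <errno.h>

#define DPT_ADVERTISED_CCBS  64     /* what the controller claims externally */

struct dpt_ccb;                     /* opaque here; the real driver defines it */

struct dpt_softc {
    int sc_outstanding;             /* CCBs currently held by the board */
};

/* Stand-in for the real EATA command submission routine. */
static void
dpt_send_command(struct dpt_softc *sc, struct dpt_ccb *ccb)
{
    (void)sc; (void)ccb;
}

/* Returns 0 if issued, EBUSY if the caller should requeue and retry later. */
static int
dpt_try_issue(struct dpt_softc *sc, struct dpt_ccb *ccb)
{
    if (sc->sc_outstanding >= DPT_ADVERTISED_CCBS)
        return (EBUSY);             /* pool exhausted: requeue, do not fail */
    sc->sc_outstanding++;
    dpt_send_command(sc, ccb);
    return (0);
}

/* Called from the interrupt handler when the controller completes a CCB. */
static void
dpt_ccb_done(struct dpt_softc *sc, struct dpt_ccb *ccb)
{
    (void)ccb;
    sc->sc_outstanding--;
    /* a good place to reissue anything that got EBUSY above */
}

Whether 64 or something smaller is the right cap during a rebuild I cannot say offhand; the requeue path is the important part.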
Lacking a better theory, I'd say these CCBs are being used up by the combination of an aggressive rebuild, a small amount of controller-board cache and concurrent RAID operations. The driver is affected because it is apparently prevented from issuing a CCB while the board reports `busy' for far too long a period.

I suggest the following:

1) Troy Cobb: please adjust the various rebuild parameters for a considerably less aggressive RAID rebuild. Yes, it will take longer, but then your seven-drive RAID array will not exhaust the CCB resources of the controller, and the system will not lock up waiting for the controller to stop being busy. An aggressive schedule (the defaults may be inappropriate for a seven-drive RAID-5) could even lock out all access to the controller, but that does not appear to be the case here, since Storage Manager's single-command-at-a-time, I-can-wait-forever-to-get-a-command-to-the-controller approach still works. I do not have dptmgr handy to point at the exact parameters you need to adjust, but I believe the drive cache parameters are involved as well.

I question the rapid succession of drive failures you have experienced. Is it possible that your SCSI bus is *now* having trouble and is interfering with the proper operation of the rebuild, which uses the SCSI bus heavily? You may have had incorrect drive termination, and aging of *all* the components may have moved you from working to marginally working. Are there any untoward kinks or nicks newly added to the SCSI cable? Or is the SCSI cable to the seven drives perhaps a bit too long? DPT recommends a SCSI backplane rather than cables when there are more than four drives.

How much cache is on the DPT controller board? This affects background (rebuild and parity write-back) accesses to the SCSI devices and may be limiting the performance of the system enough to make the driver's and the OS's life miserable.

Would you consider dividing the RAID-5 into two arrays (I know, capacity robbing), or splitting the drives across two SCSI buses (three on bus 0 where you have the RAID-0, and four on bus 1 where you have the hot spare), to improve performance by dividing the SCSI activity between the buses?

2) Simon: you may want to consider what happens when the controller indicates busy. Do you time out on the busy bit of the auxiliary status register, and if so, what do you report to the OS (a failed command? spawn a separate command-issuing thread to try again later? spin forever waiting for ready?). The BSDi BSD/OS driver, for example, simply `locks', waiting for the controller to come out of busy, which is the simplest way to handle what should be a transitory situation. You may also wish to limit the number of commands outstanding to the controller (the UnixWare driver uses the lock-on-wait approach plus a 32-CCB limit to reduce the chance of this problem affecting performance). The highest-performance DPT driver, the one for a networking operating system (NetWare), takes the `spawn an issuing task' approach so that network card interrupts can still be processed while waiting for the controller to come free. That may be your best approach, considering you will no doubt be issuing the `next' command to the controller from within the controller's interrupt service routine. My assumption is that you time out and send a failure up to the OS, which may explain the 0MB read-capacity result shown in the log.
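On the busy-bit question, a minimal sketch of the bounded wait I have in mind follows. The register offset, the bit name and the dpt_inb helper are invented for illustration (check the EATA documentation for the real auxiliary status register layout), and DELAY() is the kernel's microsecond spin. The essential point is that the wait is bounded, and that running out the clock ends in a requeue rather than in a fabricated command failure, which is exactly the kind of failure that could surface upstream as a 0MB ReadCapacity:

#include <errno.h>
#include <sys/types.h>

#define HA_AUX_STATUS   0x08        /* hypothetical register offset */
#define HA_AUX_BUSY     0x01        /* hypothetical busy bit in that register */
#define BUSY_SPIN_USEC  10000       /* wait up to ~10ms before giving up */

struct dpt_softc;                   /* as in the earlier sketch */

/* Hypothetical register-read helper; stands in for the real inb() wrapper. */
static u_int8_t dpt_inb(struct dpt_softc *sc, int reg);

/* Returns 0 when the controller is ready, EBUSY if the wait expired. */
static int
dpt_wait_not_busy(struct dpt_softc *sc)
{
    int i;

    for (i = 0; i < BUSY_SPIN_USEC; i++) {
        if ((dpt_inb(sc, HA_AUX_STATUS) & HA_AUX_BUSY) == 0)
            return (0);             /* controller ready; go ahead and issue */
        DELAY(1);                   /* kernel DELAY(): spin one microsecond */
    }
    return (EBUSY);                 /* still busy: requeue the command later,
                                       do NOT complete it as failed */
}

Whether ~10ms is the right bound is a tuning question; the point is that the caller sees EBUSY and tries again rather than reporting a failure it never actually received from the controller.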
I hope this helps.

Sincerely,
Mark Salyzyn