Message-ID: <509A2986E5C5D111B7DD0060082F32A402FACE@freya.circle.net>
From: tcobb
To: "'Karl Pielorz'"
Cc: current@FreeBSD.ORG
Subject: RE: DPT driver fails and panics with Degraded Array
Date: Fri, 29 May 1998 04:36:04 -0400

> -----Original Message-----
> From: Karl Pielorz [mailto:kpielorz@tdx.co.uk]
> Sent: Friday, May 29, 1998 4:27 AM
> To: tcobb
> Cc: current@freebsd.org
> Subject: Re: DPT driver fails and panics with Degraded Array
>
> tcobb wrote:
>
> > > With an array of that size, on a machine that important - did you not test
> > > to see what would happen with a failed drive?
> >
> > Despite pre-certification testing, something will be different when you
> > have a failure in production. The difference in our case, I'm guessing,
> > was that the array is now 60-75% full, and the OS version is different,
> > and the system was under heavy access load, too. The original driver
> > was an over-hacked version stuffed into 2.2.2, the newest driver IS
> > better integrated, and actually faster, but obviously unable to handle
> > the under-load failure situation in exactly the way we had it happen.
>
> We did our tests under load, but not with a 'full' array (it was at about
> 20-30%)... We've never changed the operating system 'under' it, I know that
> it would be checked again if the driver was ever changed (and to be honest
> it would have to be a pretty dire problem to change the driver while the
> machines 'online' - it would normally be made off-line or taken out the loop
> first).

Undoubtably our testing of the most recent upgrades was less
adequate than I'd realized. We were under some pressure to
resolve the unpredictable (every 2 days or so) "biodone" panics
that had appeared 2 weeks after upgrading to 2.2.6. The annoying
thing was that these panics did not occur during the 2.2.6 testing
phase that we DID do, but that just means we didn't run it long
enough, I suppose.

> I'll also admit that we were looking at the DPT solution for our next
> FreeBSD box... I think we'll either wait a while now (or I'll just keep
> quiet and add a few extra weeks to the testing phase for the machine - I'd
> prefer the latter for obvious reasons... ;-)

Smart move :)

I'll certainly release whatever improvements to the driver we come
up with. I'm currently under some pressure to implement a complete
alternative (SCSI-to-SCSI or another OS entirely) but I'm pushing
for finding a way to stabilize our current solution. I respect the
DPT HARDWARE, and have high hopes that some positive changes can be
made to the driver.

-Troy Cobb
Circle Net, Inc.
http://www.circle.net