From owner-freebsd-scsi Fri May 29 11:13:40 1998
Return-Path:
Received: (from majordom@localhost)
	by hub.freebsd.org (8.8.8/8.8.8) id LAA29215
	for freebsd-scsi-outgoing; Fri, 29 May 1998 11:13:40 -0700 (PDT)
	(envelope-from owner-freebsd-scsi@FreeBSD.ORG)
Received: from sendero.simon-shapiro.org (sendero.simon-shapiro.org.142.69.207.in-addr.arpa [207.69.142.25] (may be forged))
	by hub.freebsd.org (8.8.8/8.8.8) with SMTP id LAA29185
	for ; Fri, 29 May 1998 11:13:21 -0700 (PDT)
	(envelope-from shimon@sendero.simon-shapiro.org)
Received: (qmail 417 invoked by uid 1000); 29 May 1998 19:14:42 -0000
Message-ID:
X-Mailer: XFMail 1.3 [p0] on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <509A2986E5C5D111B7DD0060082F32A402FABC@freya.circle.net>
Date: Fri, 29 May 1998 15:14:42 -0400 (EDT)
Reply-To: shimon@simon-shapiro.org
Organization: The Simon Shapiro Foundation
From: Simon Shapiro
To: tcobb
Subject: RE: DPT driver fails and panics with Degraded Array
Cc: "simon@simon-shapiro.org", "freebsd-scsi@freebsd.org", "freebsd-current@freebsd.org"
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On 28-May-98 tcobb wrote:
> I have a DPT3344UW/2 running an external 24GB array. OS is FreeBSD
> CURRENT circa 5/18/98. I'm running the latest available firmware flash
> for the card, all on a P5-233MMX with 128MB RAM.

What version is ``latest''? I have several ``latest'' here.

> Recently I lost a hard drive in my 24GB RAID5 array. The array was
> configured with a HOT SPARE which should have allowed it to rebuild
> completely online, with no interruption in service (except some minor
> slowdowns, perhaps). While the HARDWARE worked well, the DPT DRIVER
> failed miserably.

The DPT driver HAS NOTHING TO DO with the RAID array. The array is seen
strictly as a disk.

> When my array went into degraded mode, the DPT DRIVER froze access to
> the partitions.
> Upon reboot, during device probe, the DPT DRIVER
> returned a 1 SECTOR (0 MB) sense for the array, despite the fact that
> the array was operating properly (though degraded). After this, the
> kernel panic'd before completing the boot process with a "Page Fault in
> Supervisor Mode" error, and continued to panic this way until the DPT
> Array was COMPLETELY REBUILT OFFLINE (requiring me to boot into DOS and
> do it - doing the rebuild of that size RAID5 array takes more than an
> hour). After a complete rebuild, the DPT DRIVER showed the array sizes
> correctly.

This is strange. I routinely (although rarely voluntarily) run into
degraded mode. The size reported by the DPT driver is the size reported
to the driver by the DPT firmware. If it shows as ZERO, either it is ZERO
or the array is more than degraded (dead).

> During this process, booting into DOS revealed the array to be fine,
> even while the array was degraded -- it also wasn't confused by degraded
> mode and showed correct partition information.

So, was it fine, or was it degraded?

> I believe that the DPT DRIVER is not correctly sensing that the array is
> okay, even though it is in degraded mode, and incorrectly returns
> sector/MB values which panic the kernel. I don't recommend depending on
> the proper operation of this driver for your High-Availability needs.

I beg to differ. The DPT driver does not do any sensing at all. The SCSI
layer calls for SENSE commands. The DPT driver is simply a protocol
translator. I do not even look at the commands, nor their
results/contents.

Extending your recommendation, I'll repeat what has been said here
endlessly: Do NOT use 3.0-CURRENT for any mission-critical software.
Extending it further, do not use any computer software for
mission-critical work under any conditions. All systems fail, except
those with the power off.

> HISTORY
> I've used DPT in FreeBSD since last November, first with the hacked
> 2.2.2 driver.
> I upgraded to 2.2.6 to fix an MBUF leak that was crashing
> me about once per week. As of 2.2.6, the MBUF leak disappeared and was
> replaced with a once every 2-3 day panic which it appeared was not going
> to get fixed by anyone (biodone: buffer not busy). So, I bit the bullet
> and upgraded recently to 3.0, which seemed to fix both of these prior
> panics only to reveal that the supposedly "high availability" software
> driver for my HA hardware is miserable during the most critical times.

It may help, in the future, if you contact me for help.

From your description, you have a marginal disk subsystem: bad cabling,
bad power, or a bad controller. None of your symptoms is relevant to the
DPT driver.

Simon

---

Sincerely Yours, Simon Shapiro
Shimon@Simon-Shapiro.ORG 770.265.7340