Date: Fri, 29 May 1998 16:16:18 -0400 (EDT)
From: Simon Shapiro <shimon@simon-shapiro.org>
To: tcobb <tcobb@staff.circle.net>
Cc: "freebsd-scsi@freebsd.org" <freebsd-scsi@FreeBSD.ORG>, "freebsd-current@freebsd.org" <freebsd-current@FreeBSD.ORG>, Mike Smith <mike@smith.net.au>
Subject: RE: DPT driver fails and panics with Degraded Array
Message-ID: <XFMail.980529161618.shimon@simon-shapiro.org>
In-Reply-To: <509A2986E5C5D111B7DD0060082F32A402FAC3@freya.circle.net>

On 29-May-98 tcobb wrote:

...

> My problem report (most of which you snipped) pointed out a deficiency
> in the DPT driver code which renders it useless in HA applications. I
> believe that this deficiency is likely to be present in ALL VERSIONS of
> this code, unless suddenly, people are putting the newest code in the
> oldest versions of the OS.

We all trust that you use FreeBSD 3.0-current and a DPT controller in said system. What you have failed to demonstrate is that the FreeBSD DRIVER is at fault.

I am very anxious to discover and repair ANY bug in this driver, as many people use this setup for mission critical work. Since I have read the driver code a few times and am somewhat familiar with it, I cannot find support for your claim that the failure you report (and we trust there is a failure) is induced in the driver.

I am very interested in helping you solve your problem, but going into the driver and tearing up the code, searching for unknown and unlikely breakage, is not an efficient use of my time and will most likely not advance our common goal of getting your system up and running and the problem eradicated.

As you can observe elsewhere, a FreeBSD SCSI HBA driver in general (and the DPT driver in particular) does not involve itself with the contents of the SCSI commands passed to it from the kernel. Nor does it concern itself with the results of these commands.

In the FreeBSD driver, similar to other DPT drivers (but not identical), I perform certain checks with the DPT hardware. These happen at boot time, BEFORE any of the kernel boot prompts you see. None of these has anything to do with what devices are attached to the bus, but strictly with the controller: Who are you? How are you doing? etc. But never anything to do with attached devices.

A reported array size of zero (or one) is most likely caused by the array being DEAD, not degraded. Have you run the dptmgr verify function against the entire array? DOS does not perform much analysis and can be misleading.
Unless you explicitly instruct the DPTMGR software to access data on a disk (or array), only the first sector is accessed. Thus, it is entirely possible that the array is inaccessible beyond one sector, if that.

Your symptoms can have many causes, all of them within the realm of the DPT hardware and attached devices.

I really do not know the purpose of your messages on this subject. You really have not asked for help fixing the problem. Neither did you offer any diagnostic data to me, or to the group as a whole. If your purpose is to create an accusation, it is well written, with the minor flaw of being inaccurate and quite wrong in its conclusion.

If your purpose is to solve the problem, please send me the following (with a copy to the group, if you care):

a. Exact Configuration; What CPU, what memory, DPT card model, type of cache memory, amount of cache, exact firmware version, exact BIOS version (NOT the same thing!). What disks are in which array, etc. I'll also need to know how the disks are programmed into the array, their bus and target IDs, etc. I need this information from both the hardware configuration and the logical view screens.

b. Exact Setup; These disks, what brand and model? What firmware version on each disk? How are these disks mounted and in what? The type of cables you use, the type of terminators you use, etc.

c. Is the system bootable now? Is it on the net? Can I have a root login on it for a while?

d. Are you going to be available to run dptmgr for me and be my eyes and fingers, while in DOS or SU mode?

e. Have you run all the DPT diagnostics to assure that the arrays are really healthy and accessible? Have you wiggled the wires to every device while the tests are running? Have you printed out the DPT error log for the controller (from DPTMGR)? Have you run the statistics, to see the error counts and rates?

Please provide me with this data, so I can try and help you.

Simon

---

Sincerely Yours,

Simon Shapiro
Shimon@Simon-Shapiro.ORG
770.265.7340