Date: Tue, 29 Jun 1999 20:58:59 -0500 (CDT)
From: Joe Greco <jgreco@ns.sol.net>
To: ken@plutotech.com (Kenneth D. Merry)
Cc: scsi@freebsd.org
Subject: Re: FreeBSD panics with Mylex DAC960SX
Message-ID: <199906300159.UAA13916@aurora.sol.net>
In-Reply-To: <199906292300.RAA29666@panzer.kdm.org> from "Kenneth D. Merry" at "Jun 29, 1999 5: 0:50 pm"
> > but during all of these crash-boots, the third line is
> >
> > da1: <MYLEX DAC960SX138928B5 4332> Fixed Direct Access SCSI-2 device
> > da1: 40.0MB/s transfers (20.0MHz, offset 16, 16bit), Tagged Queueing Enabled
> > da1: A
>
> That should probably read "Attempt to query device size failed ...."
>
> You may be losing characters over the serial console or something.

No. When done on a VGA console, it shows a graphic character or two. It
does not interleave characters from the "changing root device..." though. :-)

> > If I can provide further information to assist in tracking down this bug,
> > please let me know.
>
> My first guess is that it's happening during the open() routine, for some
> reason. That's why fsck seems to cause the problem.
>
> You're probably right about the device returning a size of zero. It isn't
> immediately clear to me why the open routine would cause a panic, *unless*
> the Mylex unit returns good status for the read capacity command, but
> returns a capacity of 0.
>
> It would be helpful to get a stack trace from the machine, if you can.
> Enabling DDB at least will give us a DDB stack trace.

Okay. Alas, I must go physically bop the power on the machine to cause the
Mylex to reset; once it is up and running it is _very_ happy. So I may not
get to this for the next day or so.

> > Also, I was wondering more generally about what the proper way to deal with
> > a device such as this is. Assuming FreeBSD didn't actually crash when
> > trying to access the device, it is still possible to attempt booting when
> > the DAC controller is not ready, which will result - presumably - in fsck
> > exiting and complaining about that filesystem. What is the "correct" way
> > to wait for something like this to become ready? Is there a "correct" way,
> > even?
>
> Well, it really depends on how the device behaves. Here's what happens
> after the initial probe phase:
>
> - the da driver sends a read capacity to the disk, with a retry count of 4
>   and a timeout of 5 seconds.
>
>   1. The read capacity succeeds, and the probe continues normally.
>   2. The read capacity fails, and one of a few things happen:
>
>      1. If the error has an associated error recovery action,
>         we may send a start unit to the disk, or one TUR every
>         half second for a minute. Then we retry the original
>         command.
>      2. If the error has no associated error recovery action,
>         we just retry it until the retry count is exhausted.
>
> My guess is that the error returned by the Mylex unit may not be an
> error with an associated recovery action. So we just retry it four times
> and then report the "Attempt to query device size failed ..." where ... is
> the error.
>
> Unfortunately, you're not getting the error printout, probably because of
> serial console weirdness. Could you try booting with -v? That will cause
> the full sense information for the error to get printed out, and maybe
> we'll have a better chance of figuring out what the error is.
>
> Also, once you boot up in single user mode, you might try the following
> camcontrol command:
>
> camcontrol cmd -n da -u 1 -v -c "25 0 0 0 0 0 0 0 0 0" -i 8 "i4 i4"
>
> That will issue a read capacity command to da1, and print out the total
> number of blocks in the disk and the block size. The -v will tell
> camcontrol to print out sense information.

I will be delighted to. :-) Unfortunately, I will probably have to putz
with it a bit, because the Mylex generally becomes ready within a minute of
me making it to single user mode. Sigh. I'll also see if it is any
different if I break the array, which also causes a panic (but might result
in different specifics).

...
Joe

-------------------------------------------------------------------------------
Joe Greco - Systems Administrator                             jgreco@ns.sol.net
Solaria Public Access UNIX - Milwaukee, WI                         414/342-4847

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-scsi" in the body of the message
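
The camcontrol command quoted in the message asks for 8 bytes of READ
CAPACITY data and decodes them as two 4-byte integers ("i4 i4"). As a rough
illustration only, that decoding could be sketched in Python as below. The
names parse_read_capacity10 and looks_unready are hypothetical, and treating
a zero reply as "not ready" is an assumption based on the guess in the
thread, not confirmed DAC960SX behaviour. In SCSI READ CAPACITY(10) the first
field is the address of the last block, so the block count is that value
plus one.

```python
import struct


def parse_read_capacity10(data: bytes):
    """Decode the 8-byte READ CAPACITY(10) reply.

    Returns (block_count, block_length). Both fields are big-endian
    32-bit integers; the first is the address of the LAST block.
    """
    if len(data) < 8:
        raise ValueError("READ CAPACITY(10) data must be at least 8 bytes")
    last_lba, block_length = struct.unpack(">II", data[:8])
    return last_lba + 1, block_length


def looks_unready(data: bytes) -> bool:
    """Hypothetical check: flag a reply that reports no usable capacity.

    Assumption: a controller that is still starting up might return good
    status with a last-block address or block length of zero.
    """
    last_lba, block_length = struct.unpack(">II", data[:8])
    return last_lba == 0 or block_length == 0


if __name__ == "__main__":
    # Made-up sample reply: last LBA 0x0010F3FF, 512-byte blocks.
    sample = bytes.fromhex("0010f3ff00000200")
    blocks, block_length = parse_read_capacity10(sample)
    print(f"{blocks} blocks of {block_length} bytes")
    print("suspicious reply:", looks_unready(bytes(8)))
```

A check like looks_unready would only help diagnose the situation from
userland; it says nothing about why the kernel panics in the open path.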