From owner-freebsd-scsi Fri Mar 16 15:52:39 2001
Delivered-To: freebsd-scsi@freebsd.org
Received: from mail.wolves.k12.mo.us (mail.wolves.k12.mo.us [207.160.214.1])
	by hub.freebsd.org (Postfix) with ESMTP
	id BB3C237B718; Fri, 16 Mar 2001 15:52:34 -0800 (PST)
	(envelope-from cdillon@wolves.k12.mo.us)
Received: from mail.wolves.k12.mo.us (cdillon@mail.wolves.k12.mo.us [207.160.214.1])
	by mail.wolves.k12.mo.us (8.9.3/8.9.3) with ESMTP id RAA28052;
	Fri, 16 Mar 2001 17:52:32 -0600 (CST)
	(envelope-from cdillon@wolves.k12.mo.us)
Date: Fri, 16 Mar 2001 17:52:32 -0600 (CST)
From: Chris Dillon
To: James FitzGibbon
Cc: ,
Subject: Re: Mylex eXtremeRAID 2000 timeout/hang
In-Reply-To: <20010316173716.E11769@ehlo.com>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Fri, 16 Mar 2001, James FitzGibbon wrote:

> We are trying to install a Mylex eXtreme 2000 card with a Dell
> Powervault 12 drive SCA housing. The drives in the array are
> numbered 0-5 and 8-13. The backplane of the array is id 15.
>
> During the kernel probe, we see the message
>
> mly0: drive at 03:15 not responding
>
> five times after the "waiting 15 seconds for SCSI devices to spin
> up" message, and then nothing else. The system doesn't hang, but
> it never goes anywhere from there. This is with F/W 6.00-00 and
> BIOS 6.00-01.

How long did you wait? I have a similar problem with an AcceleRAID 170
in -STABLE where I have to wait several minutes before it finally gets
around to booting.
I get the following, with about 30 to 40 seconds in between each
"error":

Waiting 7 seconds for SCSI devices to settle
mly0: physical device 0:6 sense data received
mly0: sense key 5 asc 00 ascq 00
mly0: info 00000000 csi 00000000
mly0: physical device 0:6 sense data received
mly0: sense key 5 asc 00 ascq 00
mly0: info 00000000 csi 00000000
mly0: physical device 0:6 sense data received
mly0: sense key 5 asc 00 ascq 00
mly0: info 00000000 csi 00000000
mly0: physical device 0:6 sense data received
mly0: sense key 5 asc 00 ascq 00
mly0: info 00000000 csi 00000000
mly0: physical device 0:6 sense data received
mly0: sense key 5 asc 00 ascq 00
mly0: info 00000000 csi 00000000
da0 at mly0 bus 1 target 0 lun 0
da0: Fixed Direct Access SCSI-3 device
da0: 17480MB (35799040 512 byte sectors: 255H 63S/T 2228C)
Mounting root from ufs:/dev/da0s1a

Device 0:6:0 is the enclosure management device (the "backplane", I
guess) in the SuperMicro SuperServer 6040, which I think is available
separately as their CSE-031 drive enclosure. IIRC, the actual
enclosure management device is the QLogic GEM.

I also had a problem after a recent upgrade to 4.3-BETA where if I
left my extra drive in the chassis, I would get the following error
just before the 0:6:0 errors:

(probe32:mly0:0:2:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe32:mly0:0:2:0): error code 0

Device 0:2:0 was a spare drive (not configured as a hot spare, IIRC,
just sitting there completely unconfigured, waiting for me to do
something with it, or sacrifice itself as a warm spare if another
drive died).
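For anyone reading these messages by hand, the sense data and the INQUIRY CDB quoted above can be decoded mechanically. The following is a minimal Python sketch, not anything taken from the mly driver: the function names `decode_sense_line` and `decode_inquiry_cdb` are hypothetical, the sense key names are the standard SCSI ones, and it assumes the driver prints the sense key in decimal and the asc/ascq and CDB bytes in hex.

```python
import re

# Standard SCSI sense key names (subset of the 4-bit sense key field).
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x1: "RECOVERED ERROR",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
    0x7: "DATA PROTECT",
    0xB: "ABORTED COMMAND",
}

# Assumed layout of the log line: decimal key, two-digit hex asc and ascq.
SENSE_RE = re.compile(
    r"sense key\s+(\d+)\s+asc\s+([0-9a-fA-F]{2})\s+ascq\s+([0-9a-fA-F]{2})"
)


def decode_sense_line(line):
    """Return the decoded sense key/asc/ascq from one log line, or None."""
    m = SENSE_RE.search(line)
    if m is None:
        return None
    key = int(m.group(1))
    asc = int(m.group(2), 16)
    ascq = int(m.group(3), 16)
    return {
        "key": key,
        "key_name": SENSE_KEYS.get(key, "UNKNOWN"),
        "asc": asc,
        "ascq": ascq,
        # asc 00 / ascq 00 means "no additional sense information".
        "no_additional_sense": asc == 0 and ascq == 0,
    }


def decode_inquiry_cdb(text):
    """Decode a 6-byte INQUIRY CDB given as space-separated hex bytes."""
    cdb = [int(b, 16) for b in text.split()]
    if len(cdb) != 6 or cdb[0] != 0x12:
        raise ValueError("not a 6-byte INQUIRY CDB")
    return {
        "opcode": cdb[0],
        "evpd": bool(cdb[1] & 0x01),
        "page_code": cdb[2],
        # Bytes 3-4 hold the allocation length; older SCSI-2 devices use
        # byte 4 only, with byte 3 reserved as zero.
        "allocation_length": (cdb[3] << 8) | cdb[4],
        "control": cdb[5],
    }


if __name__ == "__main__":
    print(decode_sense_line("mly0: sense key 5 asc 00 ascq 00"))
    print(decode_inquiry_cdb("12 0 0 0 24 0"))
```

Run against the lines quoted above, this reports sense key 5 as ILLEGAL REQUEST with no additional sense information, and shows the failing probe was a plain INQUIRY asking for 36 (0x24) bytes of data.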
After waiting for the previously mentioned device 0:6:0 errors to go
by, the system would panic immediately afterwards:

Fatal trap 18: integer divide fault while in kernel mode
mp_lock = 00000002; cpuid = 0; lapic.id = 00000000
instruction pointer = 0x8:0xc01477f1
stack pointer = 0x10:0xff806e04
frame pointer = 0x10:0xff806e1c
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = Idle
interrupt mask = <- SMP: XXX
trap number = 18
panic: integer divide fault
mp_lock = 00000002; cpuid = 0; lapic.id = 00000000
boot() called on cpu#0

syncing disks... done
Uptime: 4m23s
mly0: flushing cache...done

On a hunch I removed the unconfigured disk from the drive enclosure
and the system booted just fine... Mike? :-)

P.S.: I'm getting stuff in my dmesg buffer from _previous_ boots...
I've never seen a system do that. That's how I could cut/paste that
panic. :-) Is that a bug, or a feature? It's a nice feature (which my
other 4.2-STABLE boxes don't seem to have). Some of the information
seems to get corrupted (mixed-up might be a better explanation) around
the time of a panic, for example:

[...snip...]
mly0: physical device 0:6 sense data received
mly0: secuous mode disabled

Fatal trap 18: integer divide fault while in kernel mode
[...snip...]

All new dmesg info is just fine, of course, and all "normal" reboot
sequences (without a powerdown) seem to preserve the old dmesg info
perfectly. Too bad more of my boxes don't exhibit this feature. :-)

-- 
Chris Dillon - cdillon@wolves.k12.mo.us - cdillon@inter-linc.net
FreeBSD: The fastest and most stable server OS on the planet.
For IA32 and Alpha architectures. IA64, PPC, and ARM under development.
http://www.freebsd.org

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-scsi" in the body of the message