Date:      Tue, 29 Jun 1999 13:50:13 -0500 (CDT)
From:      Joe Greco <jgreco@ns.sol.net>
To:        scsi@freebsd.org
Subject:   FreeBSD panics with Mylex DAC960SX
Message-ID:  <199906291850.NAA83997@aurora.sol.net>

Hello,

First, cool stuff in 3.X!  Hats off to you guys.

I have one minor issue that I am hoping is a simple fix.

I'm using Mylex DAC960SX SCSI-to-SCSI RAID controllers on an ASUS P2B-DS
motherboard, attached to the onboard SCSI controller.  This is a neat gadget
that makes a bunch of drives look like a single SCSI target.

Now...  here's the problem.  The unit takes a while (~60s) to start up
after power-on, and until it reports "STARTUP COMPLETE", FreeBSD panics
when it tries to access it.

In particular, when the Mylex freaks out and thinks half its disks are
dead (duh, I forgot to power them on), the startup sequence never completes,
and FreeBSD will sit there doing boot-panic-boot-panic-etc.  This is not
very graceful, and is a bit irritating since the serial console I need in
order to talk to the Mylex is on that same box...

So, my _real_ issue is the following panic:

Booting [kernel]...               
/kernel text=0x104404 data=0x15cac+0x1bf4c syms=[0x4+0x1daa0+0x4+0x1ed04]
Copyright (c) 1992-1999 FreeBSD Inc.
Copyright (c) 1982, 1986, 1989, 1991, 1993
        The Regents of the University of California. All rights reserved.
FreeBSD 3.1-RELEASE #3: Mon Apr 12 06:06:16 CDT 1999
    root@xxxxxx:/usr/src/sys/compile/SPOOL
Timecounter "i8254"  frequency 1193182 Hz
CPU: Pentium II/Xeon/Celeron (686-class CPU)
  Origin = "GenuineIntel"  Id = 0x652  Stepping=2
  Features=0x183fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,<b24>>
real memory  = 536870912 (524288K bytes)
avail memory = 520056832 (507868K bytes)
Programming 24 pins in IOAPIC #0
FreeBSD/SMP: Multiprocessor motherboard
 cpu0 (BSP): apic id:  1, version: 0x00040011, at 0xfee00000
 cpu1 (AP):  apic id:  0, version: 0x00040011, at 0xfee00000
 io0 (APIC): apic id:  2, version: 0x00170011, at 0xfec00000
Preloaded elf kernel "kernel" at 0xf0275000.
Probing for devices on PCI bus 0:
chip0: <Intel 82443BX host to PCI bridge> rev 0x03 on pci0.0.0
chip1: <Intel 82443BX host to AGP bridge> rev 0x03 on pci0.1.0
chip2: <Intel 82371AB PCI to ISA bridge> rev 0x02 on pci0.4.0
chip3: <Intel 82371AB Power management controller> rev 0x02 on pci0.4.3
ahc0: <Adaptec aic7890/91 Ultra2 SCSI adapter> rev 0x00 int a irq 19 on pci0.6.0
ahc0: aic7890/91 Wide Channel A, SCSI Id=7, 16/255 SCBs
chip4: <PCI to PCI bridge (vendor=1011 device=0024)> rev 0x03 on pci0.10.0
Probing for devices on PCI bus 1:
Probing for devices on PCI bus 2:
de0: <Digital 21140A Fast Ethernet> rev 0x22 int a irq 18 on pci2.4.0
de0: SMC 9332BDT 21140A [10-100Mb/s] pass 2.2
de0: address 00:e0:29:2b:e1:08
de1: <Digital 21140A Fast Ethernet> rev 0x22 int a irq 19 on pci2.5.0
de1: SMC 9332BDT 21140A [10-100Mb/s] pass 2.2
de1: address 00:e0:29:2b:e1:09
Probing for devices on the ISA bus:
sc0 on isa
sc0: VGA color <16 virtual consoles, flags=0x0>
ed0 not found at 0x280
atkbdc0 at 0x60-0x6f on motherboard
atkbd0 irq 1 on isa
psm0 not found
sio0 at 0x3f8-0x3ff irq 4 flags 0x10 on isa
sio0: type 16550A, console
sio1 at 0x2f8-0x2ff irq 3 on isa
sio1: type 16550A
sio2: configured irq 5 not in bitmap of probed irqs 0
sio2 not found at 0x3e8
sio3: configured irq 9 not in bitmap of probed irqs 0
sio3 not found at 0x2e8
fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa
fdc0: FIFO enabled, 8 bytes threshold
fd0: 1.44MB 3.5in
ppc0 at 0x378 irq 7 on isa
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/9 bytes threshold
nlpt0: <generic printer> on ppbus 0
nlpt0: Interrupt-driven port
ppi0: <generic parallel i/o> on ppbus 0
plip0: <PLIP network interface> on ppbus 0
vga0 at 0x3b0-0x3df maddr 0xa0000 msize 131072 on isa
npx0 on motherboard
npx0: INT 16 interface
we0 at 0x2e8 on isa
we0: kernel is keeping watchdog alive
APIC_IO: Testing 8254 interrupt delivery
APIC_IO: routing 8254 via pin 2
IP packet filtering initialized, divert disabled, rule-based forwarding disabled, logging limited to 100 packets/entry
Waiting 2 seconds for SCSI devices to settle
SMP: AP CPU #1 Launched!
de1: enabling 100baseTX port
changing root device tda0 at ahc0 bus 0 target 0 lun 0
da0: <SEAGATE ST34371W 0484> Fixed Direct Access SCSI-2 device 
da0: 40.0MB/s transfers (20.0MHz, offset 15, 16bit), Tagged Queueing Enabled
da0: 4148MB (8496884 512 byte sectors: 255H 63S/T 528C)
o da0s1a
da1 at ahc0 bus 0 target 1 lun 0
da1: <MYLEX DAC960SX138928B5 4332> Fixed Direct Access SCSI-2 device 
da1: 40.0MB/s transfers (20.0MHz, offset 16, 16bit), Tagged Queueing Enabled
da1: A
de0: autosense failed: cable problem?
swapon: adding /dev/da0s1b as swap device
Automatic reboot in progress...
/dev/rda0s1a: FILESYSTEM CLEAN^M; SKIPPING CHECK
S
^M/dev/rda0s1a: 
clean, 138968 frFee (296 frags, 1a7334 blocks, 0.2t% fragmentation)a
l trap 18: integer divide fault while in kernel mode
mp_lock = 01000002; cpuid = 1; lapic.id = 00000000
instruction pointer     = 0x8:0xf014a681
stack pointer           = 0x10:0xfa66b9d8
frame pointer           = 0x10:0xfa66ba00
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 18 (fsck)
interrupt mask          =  <- SMP: XXX
trap number             = 18
panic: integer divide fault
mp_lock = 01000002; cpuid = 1; lapic.id = 00000000
boot() called on cpu#1

syncing disks... done
(da1:ahc0:0:1:0): SYNCHRONIZE CACHE. CDB: 35 0 0 0 0 0 0 0 0 0 
(da1:ahc0:0:1:0): NOT READY
Automatic reboot in 15 seconds - press a key on the console to abort
Rebooting...
cpu_reset called on cpu#1
cpu_reset: Stopping other CPUs
cpu_reset: Restarting BSP
cpu_reset_proxy: Grabbed mp lock for BSP
cpu_reset_proxy: Stopped CPU 1

I apologize for not reproducing this on a 3.2R box, but I assure you that
it also panics in fsck on 3.2R in what appears to be an identical manner.
The panic does seem to be triggered by fsck - I can enter single user mode
just fine.

My guess is that the integer divide fault results from the device reporting
a size of zero (strictly a guess though!).  Normally, size is reported as

da1: <MYLEX DAC960SX138928B5 4332> Fixed Direct Access SCSI-2 device 
da1: 40.0MB/s transfers (20.0MHz, offset 16, 16bit), Tagged Queueing Enabled
da1: 138928MB (284524544 512 byte sectors: 255H 63S/T 17710C)

but during all of these crash-boots, the third line comes out as just "da1: A":

da1: <MYLEX DAC960SX138928B5 4332> Fixed Direct Access SCSI-2 device 
da1: 40.0MB/s transfers (20.0MHz, offset 16, 16bit), Tagged Queueing Enabled
da1: A
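
For what it's worth, the numbers in the normal size line are consistent with
plain integer division on the reported sector count, which is why a zeroed
reply looks suspicious to me.  The following is only a sketch of that
arithmetic, not the actual da driver or fsck code: the struct and function
names are made up for illustration, and where the real divide happens is
still a guess.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of what the device reports; not a kernel structure. */
struct disk_params {
    uint32_t sectors;         /* total sectors reported by the device */
    uint32_t sector_size;     /* bytes per sector */
    uint32_t heads;           /* translated geometry: heads */
    uint32_t secs_per_track;  /* translated geometry: sectors per track */
};

/*
 * Compute the size in MB and the cylinder count the way the probe line
 * appears to.  Returns 0 on success, or -1 if any divisor would be zero,
 * which is the case an unguarded version would fault on.
 */
int disk_summary(const struct disk_params *p, uint32_t *mb, uint32_t *cyls)
{
    assert(p != NULL && mb != NULL && cyls != NULL);

    /* A not-yet-ready device might hand back all zeros (assumption). */
    if (p->sector_size == 0 || p->heads == 0 || p->secs_per_track == 0)
        return -1;

    uint32_t secs_per_mb = (1024u * 1024u) / p->sector_size;
    if (secs_per_mb == 0)     /* sector larger than 1MB: cannot divide */
        return -1;

    /* 64-bit product so a large geometry cannot wrap around to zero. */
    uint64_t secs_per_cyl = (uint64_t)p->heads * p->secs_per_track;

    *mb = p->sectors / secs_per_mb;
    *cyls = (uint32_t)(p->sectors / secs_per_cyl);
    return 0;
}
```

With the values from the good boot (284524544 sectors of 512 bytes, 255
heads, 63 sectors/track) this gives 138928 MB and 17710 cylinders, matching
the normal line above; with an all-zero reply it returns -1 instead of
dividing.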

If I can provide further information to assist in tracking down this bug,
please let me know.

Also, more generally, I was wondering what the proper way is to deal with
a device such as this.  Even if FreeBSD didn't crash when trying to access
the device, it would still be possible to attempt booting while the DAC
controller is not ready, which would presumably result in fsck exiting and
complaining about that filesystem.  What is the "correct" way to wait for
something like this to become ready?  Is there even a "correct" way?
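
To make the question concrete, the kind of thing I have in mind is a bounded
poll before fsck runs.  This is only a userland sketch under assumptions: the
function names are hypothetical, the "ready" test (can one 512-byte sector be
read?) is my own guess at a reasonable probe, and it presumes that touching
the device while it is not ready returns an error rather than panicking.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* A probe returns nonzero once the device looks ready. */
typedef int (*probe_fn)(void *ctx);

/*
 * Call probe up to max_tries times, sleeping delay_secs between attempts.
 * Returns the number of probes it took, or -1 if the device never came up.
 */
int wait_until_ready(probe_fn probe, void *ctx,
                     unsigned max_tries, unsigned delay_secs)
{
    assert(probe != NULL);
    for (unsigned i = 0; i < max_tries; i++) {
        if (probe(ctx))
            return (int)i + 1;
        if (i + 1 < max_tries && delay_secs > 0)
            sleep(delay_secs);
    }
    return -1;
}

/* Probe: ctx is a device path; ready means one full sector can be read. */
int can_read_first_sector(void *ctx)
{
    const char *path = ctx;
    char buf[512];
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return 0;
    ssize_t n = read(fd, buf, sizeof buf);
    close(fd);
    return n == (ssize_t)sizeof buf;
}

/* Stand-in probe for dry runs without hardware: ctx points to a counter,
 * and the probe reports ready once that counter has reached zero. */
int ready_after_countdown(void *ctx)
{
    unsigned *remaining = ctx;
    if (*remaining == 0)
        return 1;
    (*remaining)--;
    return 0;
}
```

Something like wait_until_ready(can_read_first_sector, "/dev/rda1", 12, 10)
would give the Mylex about two minutes to finish starting up; the path, the
retry count and the delay are all just example values.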

Thanks,

... Joe

-------------------------------------------------------------------------------
Joe Greco - Systems Administrator			      jgreco@ns.sol.net
Solaria Public Access UNIX - Milwaukee, WI			   414/342-4847


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-scsi" in the body of the message



