Date: Tue, 17 Sep 1996 19:13:16 -0700 (PDT)
From: asami@freebsd.org (Satoshi Asami)
To: gibbs@freebsd.org
Cc: scsi@freebsd.org
Subject: A couple of SCSI problems
Message-ID: <199609180213.TAA09645@silvia.HIP.Berkeley.EDU>
Justin,
(1) A bus reset arriving at an unfortunate time can crash the system.
This is what happened when I rebooted a machine connected to
another via a 2-controller SCSI chain:
===
## gdb -k kernel.11 vmcore.11
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.13 (i386-unknown-freebsd),
Copyright 1994 Free Software Foundation, Inc...(no debugging symbols found)...
IdlePTD 201000
current pcb at 1e81f0
panic: %s: Timed-out command times out again
#0 0xf0113197 in boot ()
(kgdb) bt
#0 0xf0113197 in boot ()
#1 0xf0113456 in panic ()
#2 0xf01d82e5 in ahc_timeout ()
#3 0xf010b17c in softclock ()
#4 0xf01b5e0c in splz_swi ()
#5 0xf012d72f in biowait ()
#6 0xf012bb3f in bread ()
#7 0xf019bc45 in ffs_update ()
#8 0xf019f25c in ffs_fsync ()
#9 0xf019de90 in ffs_sync ()
#10 0xf01328bb in sync ()
#11 0xf011306d in boot ()
#12 0xf0113456 in panic ()
#13 0xf01d82e5 in ahc_timeout ()
#14 0xf010b17c in softclock ()
#15 0xf01b5d87 in doreti_swi ()
#16 0xcf985 in ?? ()
#17 0xd1ae3 in ?? ()
#18 0xd5249 in ?? ()
#19 0xd5076 in ?? ()
#20 0x2f538 in ?? ()
#21 0xa6e1 in ?? ()
#22 0x1571b in ?? ()
#23 0x2e93b in ?? ()
#24 0x30e26 in ?? ()
#25 0x107e in ?? ()
===
The crashed system was running a single "make" on a 35-disk CCD.
(2) The ahc probe sometimes gets stuck right after it prints out
"target foo refuses wide negotiation, using 8-bit transfers".
This can happen for a variety of reasons: loose cabling (on my
home computer a while ago, when the narrow SCSI connector on the
CDROM was loose), unknown causes (the same symptom since last
week, except the cable is not loose this time and I currently run
the machine without the CDROM), and bus resets. About bus resets:
as I wrote above, one of the machines crashed when the other sent
a bus reset as part of its boot process. The crashed machine in
turn sent a bus reset, which screwed up the boot of the other
machine; it hung after printing out this message and some other
stuff (mostly incoherent, like "size: 0MB"). There are no narrow
SCSI devices on this system. It booted fine when I power-cycled
it.
Whatever the reason, it looks like there is a minor race
condition: the symptom is almost exactly what used to happen
before you came over to our lab and fixed a bug.
What do you think, Justin?
Satoshi
