From owner-freebsd-scsi Tue Sep 17 19:13:24 1996
Return-Path: owner-freebsd-scsi
Received: (from root@localhost)
          by freefall.freebsd.org (8.7.5/8.7.3) id TAA15479
          for freebsd-scsi-outgoing; Tue, 17 Sep 1996 19:13:24 -0700 (PDT)
Received: from silvia.HIP.Berkeley.EDU (silvia.HIP.Berkeley.EDU [136.152.64.181])
          by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id TAA15438;
          Tue, 17 Sep 1996 19:13:18 -0700 (PDT)
Received: (from asami@localhost)
          by silvia.HIP.Berkeley.EDU (8.7.5/8.6.9) id TAA09645;
          Tue, 17 Sep 1996 19:13:16 -0700 (PDT)
Date: Tue, 17 Sep 1996 19:13:16 -0700 (PDT)
Message-Id: <199609180213.TAA09645@silvia.HIP.Berkeley.EDU>
To: gibbs@freebsd.org
CC: scsi@freebsd.org
Subject: A couple of SCSI problems
From: asami@freebsd.org (Satoshi Asami)
Sender: owner-freebsd-scsi@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Justin,

(1) A bus reset arriving at an unfortunate time can crash the system.
    This is what happened when I rebooted a machine connected to
    another via a 2-controller SCSI chain:

===
## gdb -k kernel.11 vmcore.11
GDB is free software and you are welcome to distribute copies of it
 under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.13 (i386-unknown-freebsd), Copyright 1994 Free Software Foundation, Inc...(no debugging symbols found)...
IdlePTD 201000
current pcb at 1e81f0
panic: %s: Timed-out command times out again
#0  0xf0113197 in boot ()
(kgdb) bt
#0  0xf0113197 in boot ()
#1  0xf0113456 in panic ()
#2  0xf01d82e5 in ahc_timeout ()
#3  0xf010b17c in softclock ()
#4  0xf01b5e0c in splz_swi ()
#5  0xf012d72f in biowait ()
#6  0xf012bb3f in bread ()
#7  0xf019bc45 in ffs_update ()
#8  0xf019f25c in ffs_fsync ()
#9  0xf019de90 in ffs_sync ()
#10 0xf01328bb in sync ()
#11 0xf011306d in boot ()
#12 0xf0113456 in panic ()
#13 0xf01d82e5 in ahc_timeout ()
#14 0xf010b17c in softclock ()
#15 0xf01b5d87 in doreti_swi ()
#16 0xcf985 in ?? ()
#17 0xd1ae3 in ?? ()
#18 0xd5249 in ?? ()
#19 0xd5076 in ?? ()
#20 0x2f538 in ?? ()
#21 0xa6e1 in ?? ()
#22 0x1571b in ?? ()
#23 0x2e93b in ?? ()
#24 0x30e26 in ?? ()
#25 0x107e in ?? ()
===

    The crashed system was running a single "make" on a 35-disk CCD.

(2) The ahc probe sometimes gets stuck right after it prints out
    "target foo refuses wide negotiation, using 8-bit transfers".

    This can happen for a variety of reasons, from loose cabling (on
    my home computer a while ago when the narrow SCSI connector on the
    CDROM was loose) to mystery (same symptom since last week, except
    the cable is not loose this time and I currently run the machine
    without CDROM) to bus resets.

    About bus resets: as I wrote above, one of the machines crashed
    when the other sent a bus reset as part of the boot process --
    well, the crashed machine in turn sent a bus reset, which screwed
    up the booting process of the other machine, which hung, printing
    out this message and some other stuff (mostly incoherent, like
    "size: 0MB").  There are no narrow SCSI devices on this system.
    It booted fine when I power-cycled it.

    Whatever the reason, it seems like there is a minor race
    condition, as the symptom seems almost exactly like what used to
    happen before you came over to our lab and fixed a bug.

What do you think, Justin?

Satoshi