Date: Thu, 6 Aug 1998 22:43:51 -0600 (MDT)
From: "Justin T. Gibbs" <gibbs@narnia.plutotech.com>
To: Klaus Werner Krygier <krygier@kph.uni-mainz.de>
Cc: current@FreeBSD.ORG
Subject: Re: system lock-up (SMP, AHC_TAGENABLE, softupdates)
Message-ID: <199808070443.WAA15719@narnia.plutotech.com>
In-Reply-To: <Pine.BSF.3.96.980806202106.13657D-100000@majestix.kph.uni-mainz.de>

> My last hope was CAM. Here (3.0CAM-19980712-SNAP) the system does not
> freeze completely but becomes very slow on disk access and is also not
> usable. One positive aspect: I get kernel messages. Every few seconds
> the following message is printed out for ever:
>
> Timedout SCB handled by another timeout
> Timedout SCB handled by another timeout
> last message repeated 4 times
> last message repeated 13 times
> last message repeated 30 times
> Timedout SCB handled by another timeout
> Timedout SCB handled by another timeout
>
> etc.

You are not the only person to report this problem, but I'm at a loss
(at least hardware-wise) to reproduce it. Mark Murry has also been able
to make this happen, but only if he is running SMP. This makes me
believe that there is a problem in our clock/timeout code that fails
under SMP. Unfortunately, I don't have any SMP hardware.

Essentially, what the messages are saying is that the timeouts scheduled
for SCSI I/O are not being canceled when the I/O completes. The timeout
handler notices this and complains. This simply shouldn't happen.

If you are interested in helping debug this problem, the strategy I
would take is to turn the printf for the above message into a panic, and
then go grovel around in the timeout data structures looking for
corruption.

--
Justin

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message
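
The failure mode described above (a timeout that is armed when an I/O
starts, should be canceled when the I/O completes, and instead still
fires) can be illustrated with a small user-space model. This is only a
sketch under stated assumptions, not the driver or kernel timeout code:
struct request, start_io(), complete_io(), run_clock(), and the
cancel_ok flag are all hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NREQ 4

/* Hypothetical stand-in for a SCSI control block; not the driver's struct. */
struct request {
    bool active;         /* I/O is still outstanding */
    bool timeout_armed;  /* a timeout is scheduled for this request */
    int  deadline;       /* tick at which the timeout fires */
};

static struct request reqs[NREQ];
static int stale_timeouts;  /* timeouts that fired for already-finished I/O */

/* Start an I/O and arm its timeout `ticks` ticks from `now`. */
static void start_io(int id, int now, int ticks)
{
    assert(id >= 0 && id < NREQ);
    reqs[id].active = true;
    reqs[id].timeout_armed = true;
    reqs[id].deadline = now + ticks;
}

/*
 * Complete an I/O. Normally this also cancels the timeout. cancel_ok is an
 * assumption of this model: false simulates a cancel that silently fails.
 */
static void complete_io(int id, bool cancel_ok)
{
    assert(id >= 0 && id < NREQ);
    reqs[id].active = false;
    if (cancel_ok)
        reqs[id].timeout_armed = false;
}

/* Timeout handler: complains if the request it was armed for is finished. */
static void timeout_handler(int id)
{
    if (!reqs[id].active) {
        /* In a debugging build this is where one would stop and inspect. */
        stale_timeouts++;
        printf("request %d: timeout fired after completion\n", id);
        return;
    }
    printf("request %d: timed out, starting recovery\n", id);
    reqs[id].active = false;
}

/* Advance the clock to `now` and fire every timeout that is due. */
static void run_clock(int now)
{
    for (int id = 0; id < NREQ; id++) {
        if (reqs[id].timeout_armed && reqs[id].deadline <= now) {
            reqs[id].timeout_armed = false;
            timeout_handler(id);
        }
    }
}
```

Calling start_io(0, 0, 5), complete_io(0, true), run_clock(10) leaves
stale_timeouts at 0, while the same sequence with complete_io(1, false)
increments it. Replacing the printf in the stale branch with abort()
mirrors, in user space, the printf-to-panic change suggested above: the
program stops at the first stale timeout so the state can be examined.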