Date:      Thu, 10 Apr 97 21:52:02 +0200
From:      cracauer@wavehh.hanse.de (Martin Cracauer)
To:        jdp@polstra.COM
Cc:        freebsd-current@freebsd.org
Subject:   Re: ahc crashes 2.2-STABLE machine during dump to DAT tape
Message-ID:  <9704101952.AA01412@wavehh.hanse.de>
References:  <199704101429.HAA00725@austin.polstra.com>

John Polstra wrote:

>In an up-to-date RELENG_2_2 kernel (CVSupped last night: 9 April,
>20:45 PDT), the system cannot make it thru a backup without crashing.
>It hangs with the following messages on the screen:

>ahc0:A:0: no active SCB for reconnecting target - issuing ABORT
>SAVED_TCL == 0x0
>ahc0:A:0: Warning - unknown message received from target (0xff). Rejecting
>ahc_intr: seqint, intstat == 0xc1, scsisigi = 0xf6
>ahc0:A:0: Target did not send an IDENTIFY message. LASTPHASE = 0x0, SAVED_TCL == 0x0
>ahc0: Issued Channel A Bus Reset.  4 SCBs aborted

>At that point the machine is unresponsive to its keyboard, and I have to
>press the Big Red Button.

The same happened to me twice with 2.2.1-RELEASE.

>This problem is not new with last night's kernel -- I've been seeing
>it for a couple of weeks, maybe longer.  (I haven't been trying
>backups very often lately.)  Before I upgraded to 2.2, this problem
>never appeared on the machine.

Same here.

>The kernel has DDB configured into it, and it's built with debugging
>symbols.  But since it doesn't panic, I haven't been able to get into
>the debugger to get a stack trace.

My kernel doesn't have DDB or -g.

>I should mention that the CD-ROM drive has been added in the past
>few weeks.  It's a replacement for one that crapped out.  It seems
>to work fine.  I can't remember whether these crashes were happening
>before I put it in or not.  Maybe it's a clue.  Generally when these
>crashes occur, the machine is idle except for the dump in progress.

These crashes appeared in 2.2 (or 2.2.1? not sure) without any
changes to the SCSI setup, not even reconnecting drives.

I have:
2x ahc (one older, one "ultra") narrow
1x "NEC D3825 0410" type 0 fixed SCSI 2
2x (one on each ahc) "Quantum XP34300 81HB" (Atlas 1, 4 GB)

>I'm attaching my dmesg.boot and config files.  I'd be happy to run
>some tests if there's anything that would help to track it down.

As a hint, I'm pretty sure the problem is load/timing-related. My two
hangs appeared when I had the disks very active on an asynchronously
mounted ccd over the two Atlas drives (no, I didn't lose any data :-),
driven by a PPRO-200.

I'll file a proper PR if the problem can't be fixed right away; I
just wanted to add to John's report.

Martin
-- 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Martin_Cracauer@wavehh.hanse.de http://cracauer.cons.org  Fax.: +4940 5228536
"As far as I'm concerned,  if something is so complicated that you can't ex-
 plain it in 10 seconds, then it's probably not worth knowing anyway"- Calvin
