Date:      Wed, 19 Aug 1998 06:11:48 +0900
From:      Tetsuro FURUYA <ht5t-fry@asahi-net.or.jp>
To:        hausen@punkt.de
Cc:        Tetsuro FURUYA <tfu@ff.iij4u.or.jp>
Subject:   Re: Weird SCSI errors
Message-ID:  <199808182111.GAA06786@galois.tf.or.jp>
In-Reply-To: Your message of "Tue, 18 Aug 1998 17:42:01 +0200 (CEST)"
References:  <199808181542.RAA00725@hugo10.ka.punkt.de>

Hi, I'm T.Furuya.

In Message-ID: <199808181542.RAA00725@hugo10.ka.punkt.de>
"Patrick M. Hausen" <hausen@punkt.de> wrote:

> Hi all!
> 
> In the last couple of months our central file&print&everything server
> crashed a couple of times. These crashes seem to happen more and more
> often.
> 
> The system just comes to a halt, it doesn't even panic or write an
> entry into /var/adm/messages.

This is clearly strange.
I once had an EIDE drive with broken sectors, where access to the bad
sectors caused errors.
But in that case the system displayed error messages like "XXXX i/o error",
and the affected process died but the system recovered.
(At first my system halted like yours does, and I fixed that defect.)
So I suspect your driver code is not working well, probably in its timeout
routine, for example ahc_timeout.
But I haven't read your driver code, so I cannot say anything further.
Sorry, perhaps I should not speculate here.
Would a debug option in the driver code help you?

> 
> The console is filled with error messages like this:
> 
> SEQADDR=0x6 SCSISEQ=0x12 SSTAT0=0x5 SSTAT1=0xa
> sd0(ahc0:0:0) SCB 7: Flags=0x1
> sd0(ahc0:0:0) no longer in timeout
> ahc0: Issue Channel A Bus Reset
> 2 SCBs aborted
> swap-pager: indefinite wait_buffer: device: 1025 blkno(xxx) size=xxx
>                                        These change ----^---------^


This message is printed by /usr/src/sys/vm/swap_pager.c.
Because one process cannot access the disk, the other processes that
need the disk cannot access it either and are left waiting.


> sd0(ahc0:0:0) SCB 0x7 timed out while idle
> LASTPHASE=0x1 SCSISIGI=0x0
> 
> This is from writing it down with pencil and paper, so there may be
> mistakes.
> 
> My dmesg is this:

> FreeBSD 2.2.7-RELEASE #0: Fri Aug 14 18:45:16 CEST 1998
>     root@:/usr/src/sys/compile/HUGO
> CPU: Pentium/P54C (100.23-MHz 586-class CPU)
>   Origin = "GenuineIntel"  Id = 0x525  Stepping=5

> ahc0 <Adaptec 2940 SCSI host adapter> rev 0 int a irq 11 on pci0:13:0
> ahc0: aic7870 Single Channel, SCSI Id=7, 16 SCBs
> (ahc0:0:0): "Quantum XP34300W L912" type 0 fixed SCSI 2
> sd0(ahc0:0:0): Direct-Access 4101MB (8399520 512 byte sectors)
> (ahc0:1:0): "FUJITSU M2934S-512 0122" type 0 fixed SCSI 2
> sd1(ahc0:1:0): Direct-Access 4153MB (8506782 512 byte sectors)
> (ahc0:2:0): "FUJITSU M2934S-512 0122" type 0 fixed SCSI 2
> sd2(ahc0:2:0): Direct-Access 4153MB (8506782 512 byte sectors)
> (ahc0:3:0): "FUJITSU M2694ES-512 8139" type 0 fixed SCSI 2
> sd3(ahc0:3:0): Direct-Access 1033MB (2117025 512 byte sectors)


Building the kernel debugger DDB into your kernel may help you a little.
When the kernel hangs, invoke DDB by typing Ctrl-Alt-Esc on the system
console. Don't use X at that time; X ignores Ctrl-Alt-Esc.
Force the timeout to occur, and once the disk seek has ended, type
'continue'. If several attempts fail, type 'panic'; this will reboot the
system safely.
Then run fsck.

At least, this procedure worked for me when the EIDE wd driver was
misbehaving.
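
For the DDB part, here is a minimal sketch of the kernel config change.
I am assuming your config file is the HUGO one shown in your dmesg and
that it lives in the usual /usr/src/sys/i386/conf directory. I have not
seen your file, so the path and the comment text are only my guess:

```
# /usr/src/sys/i386/conf/HUGO  (assumed location of your config file)
options		DDB		# compile the DDB kernel debugger into the kernel
```

After adding the line you would need to run config on that file, rebuild
and install the kernel, and reboot before the console key sequence will
drop you into DDB.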

                               Tetsuro, Furuya. ht5t-fry@asahi-net.or.jp

========================================================================
TEL: 048-852-3520    FAX: 048-858-1597			    ||      
E-Mail:							   8==------
     ht5t-fry@asahi-net.or.jp , tfu@ff.iij4u.or.jp     	*   ||
pgp-fingerprint:				       \|/
     pub  Tetsuro FURUYA <ht5t-fry@asahi-net.or.jp>
      Key fingerprint = F1 BA 5F C1 C2 48 1D C7  AE 5F 16 ED 12 17 75 38
=========================================================================



