Date:      Tue, 31 Aug 1999 12:53:30 -0600 (MDT)
From:      "Kenneth D. Merry" <ken@kdm.org>
To:        maex-lists-freebsd-scsi@Space.Net (Markus Stumpf)
Cc:        scsi@FreeBSD.ORG
Subject:   Re: 3.2-STABLE: SCB 0x7 - timed out in datain phase, SEQADDR == 0x10e
Message-ID:  <199908311853.MAA19242@panzer.kdm.org>
In-Reply-To: <19990831191827.F21474@space.net> from Markus Stumpf at "Aug 31, 1999 07:18:27 pm"

Markus Stumpf wrote...
> On Tue, Aug 31, 1999 at 09:37:27AM -0600, Kenneth D. Merry wrote:
> > It usually indicates a cabling or termination problem.  If the message is
> > "timed out in {datain,dataout,command} phase", the cause is often the same
> > -- cabling or termination.
> 
> I have a box I want to use as a web cache (squid2).
> 

[ ... ]

> Each controller has two
>     <IBM DNES-318350Y SA30> Fixed Direct Access SCSI-3 device 
>     80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
>     17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)
> and one
>     <IBM DNES-309170Y S80K> Fixed Direct Access SCSI-3 device 
>     80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
>     8748MB (17916240 512 byte sectors: 255H 63S/T 1115C)
> connected.

[ ... ]

> Cabling and termination checked and ok (AFAIK).
> No other SCSI devices present (CD rom is on isa)
> 
> When doing stress tests (simulating 300 parallel clients to the squid
> cache) I also get from time to time messages like these:
> 
> Aug 31 18:16:28 zuse /kernel: (da1:ahc0:0:1:0): SCB 0x6d - timed out in dataout phase, SEQADDR == 0x5d
> Aug 31 18:16:35 zuse /kernel: (da1:ahc0:0:1:0): BDR message in message buffer
> Aug 31 18:16:35 zuse /kernel: (da1:ahc0:0:1:0): SCB 0x1c - timed out in dataout phase, SEQADDR == 0x5d
> Aug 31 18:16:35 zuse /kernel: (da1:ahc0:0:1:0): no longer in timeout, status = 34b
> Aug 31 18:16:35 zuse /kernel: ahc0: Issued Channel A Bus Reset. 65 SCBs aborted
> 
> It happens on both controllers and with all disks used for the cache
> (high IO rates).
> 
> Any help welcome!
> 
> 	\Maex
> 
> (P.S. I think there are currently only "dataout phase" problems, as with
>   the stress tests I am also migrating data from the old cache to the new one,
>   so the hit ratio is zero, as all data has to be fetched from the parent
>   (the old cache) and is stored on disk.)


There are several things to point out here.  You're probably aware of some
of them, but they often catch people off guard:

 - When running drives at LVD speeds, you need to have one of the "twisty"
   LVD SCSI cables.

 - LVD drives don't have terminators, so you have to use a twisty cable
   with a terminator block on the end.  (All the twisty cables I've seen
   have terminators.)

 - Justin has had problems before with bent pins on LVD cables.  They seem
   to be a bit fragile sometimes.  A bent pin can easily cause problems
   like the one above.

 - At LVD data rates you can sometimes run into trouble if your SCSI
   cables run too close to a power supply.  Make sure your cables are
   routed away from it.

Anyway, those are the most common causes of timeouts.  The power supply
interference problem seems to crop up more often at LVD speeds than on
lower-speed buses.

Timeout problems like the one above are also more likely to occur under
high load.  (Of course, some cabling/termination problems will show up
under any load.  But "timed out in {datain,dataout,command} phase" problems
are more likely to show up when the bus is heavily loaded.)
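
If you want to see which controllers, devices and phases are affected while
you swap cables or reroute them, a small script can tally the timeout
messages.  The following is only a rough sketch in Python: the regular
expression is modeled on the log lines quoted above and may need adjusting
for other message variants, and the name count_timeouts is made up for
illustration.

```python
import re
import sys
from collections import Counter

# Modeled on the quoted kernel messages, for example:
# (da1:ahc0:0:1:0): SCB 0x6d - timed out in dataout phase, SEQADDR == 0x5d
# Assumption: other timeout messages follow the same layout.
TIMEOUT_RE = re.compile(
    r"\((?P<periph>\w+):(?P<ctrl>\w+):\d+:\d+:\d+\): "
    r"SCB 0x[0-9a-fA-F]+ - timed out in (?P<phase>\w+) phase"
)


def count_timeouts(lines):
    """Return a Counter keyed by (controller, device, phase)."""
    counts = Counter()
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if m:
            key = (m.group("ctrl"), m.group("periph"), m.group("phase"))
            counts[key] += 1
    return counts


if __name__ == "__main__":
    # Reads syslog output on standard input and prints one line per
    # (controller, device, phase) combination that timed out.
    for (ctrl, periph, phase), n in sorted(count_timeouts(sys.stdin).items()):
        print(f"{ctrl} {periph} {phase}: {n}")
```

Feed it whatever file your syslog writes kernel messages to.  If the counts
cluster on one controller or one drive, that points at a particular cable or
connector; if they are spread evenly, cable routing or termination is the
more likely suspect.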


Ken
-- 
Kenneth Merry
ken@kdm.org

