Date:      Sun, 29 Mar 1998 21:08:27 -0600
From:      David Kelly <dkelly@hiwaay.net>
To:        freebsd-scsi@FreeBSD.ORG
Subject:   Re: Signal 11's.. (and SCB problems) 
Message-ID:  <199803300308.VAA03628@nospam.hiwaay.net>
In-Reply-To: Message from The Hermit Hacker <scrappy@hub.org>  of "Sun, 29 Mar 1998 03:56:01 -0400." <Pine.BSF.3.96.980329035245.1536D-100000@thelab.hub.org> 

The Hermit Hacker writes:
> 
> 	If so, would SCB explain why, with an adaptec 2940UW controller
> and SCB enabled, I'm seeing what *feel* like SCSI bus hangs?  Where I can
> telnet to the machine, get the beginnings of the 'login', but no login
> prompt?  Machine is still pingable and everything?
> 
> 	My configure has the following lines:
> 
> controller      ahc0
> options AHC_SCBPAGING_ENABLE
> options AHC_TAGENABLE
> options AHC_ALLOW_MEMIO
> 
> 	And the system is running STABLE...
> 
> 	IF the SCB should be removed, what about the other two?  Any
> similar problems known with those?

I've had 3 similar incidents in the past several weeks, also with STABLE
(a.k.a. RELENG_2_2, right?). In my situation I have root and swap on a
2G drive on a 2940 (notice this is an old 2940), plus more swap and my
CVS repository on a 9G drive connected to an '875:

ahc0 <Adaptec 2940 SCSI host adapter> rev 0 int a irq 9 on pci0:9:0
ahc0: aic7870 Single Channel, SCSI Id=7, 16 SCBs
ahc0 waiting for scsi devices to settle
(ahc0:0:0): "SEAGATE ST32550N 0021" type 0 fixed SCSI 2
sd0(ahc0:0:0): Direct-Access 2047MB (4194058 512 byte sectors)
[...]
ncr0 <ncr 53c875 fast20 wide scsi> rev 3 int a irq 11 on pci0:11:0
ncr0 waiting for scsi devices to settle
(ncr0:0:0): WIDE SCSI (16 bit) enabled(ncr0:0:0): 10.0 MB/s (200 ns, offset 15)
(ncr0:0:0): "IBM OEM DCHS09W 2222" type 0 fixed SCSI 2
sd1(ncr0:0:0): Direct-Access 
sd1(ncr0:0:0): WIDE SCSI (16 bit) enabled
sd1(ncr0:0:0): 20.0 MB/s (100 ns, offset 15)
8689MB (17796077 512 byte sectors)

Initially I had FAILSAFE commented out of my config. That yielded a 
spectacular semi-hang which managed to produce the following message 
and let me log in on a couple of vty's before finally failing to 
respond at all:

swap_pager: indefinite wait buffer: device 197641, blkno 1472 size 8192
... a repeating list with 7 different block numbers and 6 unique sizes 
ranging from 4096 to 28672.

I posted this on 3/20 to -stable and didn't get any replies. A good 
start in diagnosis would be, "what is device 197641?"
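
One way to attack that question is to split the number into its major
and minor parts. The sketch below does that arithmetic. It ASSUMES the
bit layout of the major()/minor() and dkunit()/dkslice()/dkpart()
macros as I recall them from 2.2-era sys/types.h and sys/disklabel.h;
the function name decode_dev is made up for illustration. Check the
headers on your own system before trusting the result.

```python
# Sketch: split a printed device number into major/minor and disk fields.
# ASSUMPTION: the bit positions below mirror the 2.2-era major()/minor()
# and dkunit()/dkslice()/dkpart() macros as recalled; they are not taken
# from the message itself. decode_dev is a hypothetical helper name.

def decode_dev(dev):
    major = (dev >> 8) & 0xFF        # index into the driver switch table
    minor = dev & 0xFFFF00FF         # everything except the major byte
    part = minor & 0x7               # partition index: 0 = a, 1 = b, ...
    unit = (minor >> 3) & 0x1F       # low bits of the drive unit number
    slice_no = (minor >> 16) & 0x1F  # slice field, raw value
    return {
        "major": major,
        "unit": unit,
        "slice": slice_no,
        "partition": "abcdefgh"[part],
    }

print(decode_dev(197641))
# prints {'major': 4, 'unit': 1, 'slice': 3, 'partition': 'b'}
```

197641 is 0x30409, so under those assumptions it is major 4, unit 1,
raw slice 3, partition b. If block major 4 is the sd driver and raw
slice values start counting real slices at 2 (both are assumptions
here, not something the kernel message says), that would point at the
b partition of a slice on sd1, i.e. the swap on the 9G drive rather
than the one on the 2G drive.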

Other hangs have happened since, but only the first yielded any
messages. Every time, I had kernel pppd running, Netscape 3.01 (US
version manually replaced export version installed via ports), and 
"cd /usr && cvs -q update src" running.

I am not certain about every time, but most times at least the CVS 
repository partition was mounted async. Async was added after the fact 
with "mount -u -o async /r/usr". At least once I had also applied async 
to my /usr partition containing /usr/src.

I have not been able to reproduce the problem at will. I have recently 
removed the swap partition on the 2G disk from /etc/fstab to see 
whether that has any effect (I don't know which disk device 197641 is).

--
David Kelly N4HHE, dkelly@nospam.hiwaay.net
=====================================================================
The human mind ordinarily operates at only ten percent of its
capacity -- the rest is overhead for the operating system.





