Date:      Fri, 20 Mar 1998 20:01:40 -0600
From:      David Kelly <dkelly@hiwaay.net>
To:        freebsd-stable@FreeBSD.ORG
Subject:   swap_pager: indefinite wait buffer...
Message-ID:  <199803210201.UAA03850@nospam.hiwaay.net>

Last night I had the most spectacular FreeBSD failure I've seen in 3
years. After about 40 days of uptime, with exmh2, Netscape 3.01, a kernel
pppd connection, and a "cvs update" all running, the following message
(retyped, not copied) started spewing out:

swap_pager: indefinite wait buffer: device 197641, blkno 1472 size 8192
... a repeating list with 7 different block numbers and 6 unique sizes 
ranging from 4096 to 28672.

The system allowed me to log in a couple of times on the vtys before
blocking. CTL-ALT-DEL wouldn't reboot; a hard reset was required. I could
flip between the vtys with ease, even into the X vty. In X I forget
whether I could move windows around, but I think I could. I just couldn't
do anything inside one. I'm pretty sure I could raise some to the front,
but nothing inside the window got redrawn, only the borders. I was using
twm and the Mach32 driver.

System is a PPro-166/512k overclocked to 200 MHz. Has been running solid
overclocked the past 5 months. Ran it a half hour at 233 initially (no
problems, built the kernel several times while running rc564) just to
see what the limit was before backing down. MB is an Asus P6NP5.

nospam: {1004} swapinfo
Device      1K-blocks     Used    Avail Capacity  Type
/dev/sd0s2b     65536       56    65416     0%    Interleaved
/dev/sd1s2b    131072       32   130976     0%    Interleaved
Total          196480       88   196392     0%
nospam: {1005} uname -a
FreeBSD nospam.hiwaay.net 2.2.5-STABLE FreeBSD 2.2.5-STABLE #0: Wed Feb 11 23:26:40 CST 1998     root@nospam.hiwaay.net:/usr/src/sys/compile/PPRO166  i386
nospam: {1006} 

The kernel was built shortly after a "make world", which was done 
shortly after updating my source tree via cvsup.

dmesg says:

ahc0 <Adaptec 2940 SCSI host adapter> rev 0 int a irq 9 on pci0:9
ahc0: aic7870 Single Channel, SCSI Id=7, 16 SCBs
ahc0 waiting for scsi devices to settle
(ahc0:0:0): "SEAGATE ST32550N 0021" type 0 fixed SCSI 2
sd0(ahc0:0:0): Direct-Access 2047MB (4194058 512 byte sectors)

ncr0 <ncr 53c875 fast20 wide scsi> rev 3 int a irq 11 on pci0:11
ncr0 waiting for scsi devices to settle
(ncr0:0:0): WIDE SCSI (16 bit) enabled(ncr0:0:0): 10.0 MB/s (200 ns, offset 15)
(ncr0:0:0): "IBM OEM DCHS09W 2222" type 0 fixed SCSI 2
sd1(ncr0:0:0): Direct-Access 
sd1(ncr0:0:0): WIDE SCSI (16 bit) enabled
sd1(ncr0:0:0): 20.0 MB/s (100 ns, offset 15)
8689MB (17796077 512 byte sectors)

I have 64M of swap on the Adaptec 2G Seagate and 128M on the '875 9G
IBM. Not sure what the swap status was prior to the problem but the 
month prior there was a fairly constant 6M swapped.

When the problem happened, "cvs update" was copying from /home/ncvs on 
the 9G drive and writing to /usr/src on the 2G drive. The source 
partition was mounted async.

None of AHC_TAGENABLE, AHC_SCBPAGING_ENABLE, or AHC_ALLOW_MEMIO are
enabled in my kernel. However, FAILSAFE *is* commented out, which
apparently enables the equivalent of AHC_TAGENABLE for the NCR driver.

So which device is "197641"?
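
One rough way to pick that number apart is sketched below. It is only a
sketch under assumptions: the bit layout (major in bits 8-15, the rest as
minor, with unit, slice and partition packed into the minor) is recalled
from the 2.2-era macros in sys/disklabel.h and should be checked against
the headers in your own source tree. The function names decode_dev and
dev_name, and the one-entry table mapping block major 4 to "sd", are
hypothetical and exist only for illustration.

```python
# Sketch only: assumes the FreeBSD 2.2-era dev_t layout (not verified
# against this exact kernel). decode_dev/dev_name are made-up names.

# Assumption: block major 4 is the SCSI disk driver "sd".
ASSUMED_BLOCK_MAJORS = {4: "sd"}


def decode_dev(dev):
    """Split a dev_t into (major, unit, slice, partition)."""
    major = (dev >> 8) & 0xFF                 # bits 8-15
    minor = dev & 0xFFFF00FF                  # everything except the major
    part = minor & 0x7                        # low 3 bits: partition a-h
    unit = ((minor >> 3) & 0x1F) | ((minor >> 16) & 0x1E0)
    slice_no = (minor >> 16) & 0x1F           # 0 = compat, 1 = whole disk
    return major, unit, slice_no, part


def dev_name(dev):
    """Render the decoded fields in the usual sdNsMx style."""
    major, unit, slice_no, part = decode_dev(dev)
    driver = ASSUMED_BLOCK_MAJORS.get(major, "major%d_" % major)
    # Assumption: slice numbers 2 and up correspond to s1, s2, ...
    slice_str = "s%d" % (slice_no - 1) if slice_no >= 2 else ""
    return "%s%d%s%s" % (driver, unit, slice_str, "abcdefgh"[part])


if __name__ == "__main__":
    print(hex(197641))        # 0x30409
    print(dev_name(197641))   # sd1s2b, if the assumed layout is right
```

If those assumptions hold, 197641 (0x30409) would decode to sd1s2b, which
is the swap partition on the 9G IBM drive in the swapinfo listing above.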

A quick read-only scan of the disk blocks didn't yield any problems.
But that doesn't really mean anything.

Is this a fluke? Should I worry about it or is there something I should 
be doing to prevent it from happening again?

--
David Kelly N4HHE, dkelly@nospam.hiwaay.net
=====================================================================
The human mind ordinarily operates at only ten percent of its
capacity -- the rest is overhead for the operating system.





