From owner-freebsd-current Sun Mar 5 18:24:44 1995
Return-Path: current-owner
Received: (from majordom@localhost) by freefall.cdrom.com (8.6.10/8.6.6) id SAA18128 for current-outgoing; Sun, 5 Mar 1995 18:24:44 -0800
Received: from sbstark.cs.sunysb.edu (sbstark.cs.sunysb.edu [130.245.1.47]) by freefall.cdrom.com (8.6.10/8.6.6) with ESMTP id SAA18122 for ; Sun, 5 Mar 1995 18:24:43 -0800
Received: from starkhome.UUCP (root@localhost) by sbstark.cs.sunysb.edu (8.6.9/8.6.9) with UUCP id VAA17917 for current@freebsd.org; Sun, 5 Mar 1995 21:24:32 -0500
Received: by starkhome.cs.sunysb.edu (8.6.10/1.34) id VAA01305; Sun, 5 Mar 1995 21:20:14 -0500
Date: Sun, 5 Mar 1995 21:20:14 -0500
From: starkhome!gene@sbstark.cs.sunysb.edu (Gene Stark)
Message-Id: <199503060220.VAA01305@starkhome.cs.sunysb.edu>
To: current@FreeBSD.org
Subject: Page fault panics during make world in -current
Sender: current-owner@FreeBSD.org
Precedence: bulk

As I mentioned in a previous message to "current", I am getting page fault panics with a -current kernel during a make world. Specifically, this seems to occur during the library install phase. I have also had problems during shutdown and reboot, though I don't know if it's the same thing.

I spent a couple of hours looking over a crash dump just now, and here is what I am seeing.

A call originates from vm_fault_additional_pages() via vnode_pager_haspage() to perform VM paging I/O on a file at offset 0x24000, which with a filesystem bsize of 8192 corresponds to logical block number 18. This being beyond the number of direct blocks (12) in the file, the first indirect block is required. This indirect block has logical block number -12, according to the coding scheme being used for the file metadata. The call goes through ufs_bmap() to ufs_bmaparray(), where it is determined that the desired data lives at disk address 65896, so a call is made to getblk() at line 180.
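
The block-number arithmetic above can be checked with a short sketch. It is only an illustration, not kernel code: the helper names are hypothetical, the values 8192 and 12 are taken from the message, and the 32-bit unsigned offset width is an assumption made to show what a negative block number might turn into if it were scaled to a byte offset.

```python
# Illustrative sketch only; these helpers are hypothetical, not kernel functions.

NDADDR = 12    # number of direct blocks, as stated in the message
BSIZE = 8192   # filesystem block size, as stated in the message


def offset_to_lbn(offset, bsize=BSIZE):
    """Logical block number that contains the given byte offset."""
    return offset // bsize


def needs_indirect(lbn, ndaddr=NDADDR):
    """True when the block lies beyond the direct blocks."""
    return lbn >= ndaddr


def first_indirect_lbn(ndaddr=NDADDR):
    """Number of the first indirect block under the negative-number
    coding scheme described in the message."""
    return -ndaddr


def as_unsigned32(value):
    """Reinterpret a signed value as an unsigned 32-bit quantity
    (assumption: offsets are 32-bit unsigned)."""
    return value & 0xFFFFFFFF


if __name__ == "__main__":
    lbn = offset_to_lbn(0x24000)
    print("offset 0x24000 -> lbn", lbn)
    print("needs indirect block:", needs_indirect(lbn))
    print("first indirect block lbn:", first_indirect_lbn())
    # A negative block number scaled by the block size wraps to a huge offset.
    print("lbn -12 as byte offset: 0x%08X" % as_unsigned32(first_indirect_lbn() * BSIZE))
```

Under these assumptions, block -12 scaled by the block size wraps around to an offset near the top of the 32-bit range, which is the kind of value that would be meaningless as an offset into a VM object.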
In getblk(), the flag "doingvmio" is set to 1, and a call is made to allocbuf(), which apparently then goes bananas, eventually triggering a panic, which may actually occur from within the call to vm_page_lookup() at line 1005.

It appears to me that the problem is that the negative logical block number is filtering down to the call to allocbuf(). From the way the logical block numbers are used in allocbuf() to compute offsets into VM objects (which are unsigned, at least they used to be, and it is hard to imagine things changing that substantially), it looks like allocbuf() is not supposed to be called with a negative logical block number and vmio=1. Is this right?

My current theory is that somehow the setting of doingvmio to 1 in getblk() is wrong, or some sort of synchronization problem is causing this setting to become inappropriate before the call to allocbuf() is made. It could be associated with paging from/to a shared library that is being removed or updated.

I can provide more information from the crash dump, if that will help.

- Gene