From owner-freebsd-current@FreeBSD.ORG Sun Aug 24 17:34:33 2003
Date: Mon, 25 Aug 2003 01:34:28 +0100 (BST)
From: Gavin Atkinson
To: Robert Watson
cc: kuriyama@imgsrc.co.jp
cc: current@freebsd.org
Subject: Re: sysinstall spec_getpages panic (with VM overtones)
Message-ID: <20030825011106.L23215-100000@ury.york.ac.uk>

On Wed, 20 Aug 2003, Robert Watson wrote:
> On Wed, 20 Aug 2003, Gavin Atkinson wrote:
> > _mtx_lock_flags(0,0,c0529513,300,ffffffff) at _mtx_lock_flags+0x43
> > spec_getpages(cce33b3c,54,0,cce33b2c,0) at spec_getpages+0x26c
> > ffs_getpages(cce33b80,0,c05459de,274,c05c63e0) at ffs_getpages+0x5f6
> > vnode_pager_getpages(c0bebafc,cce33c70,1,0,cce33c20) at vnode_pager_getpages+0x73
> > vm_fault(c1259900,819b000,1,0,c12534c0) at vm_fault+0x8e2
> > trap_pfault(cce33d48,1,819b004,200,819b004) at trap_pfault+0x109
> > trap(2f,2f,2f,82e533c,0) at trap+0x1fc
> > calltrap() at calltrap+0x5
> >
> > *c0529513 = "/usr/src/sys/fs/specfs/spec_vnops.c", line 0x300 is line 768:
> >
> > 766             gotreqpage = 0;
> > 767             VM_OBJECT_LOCK(vp->v_object);
> > 768             vm_page_lock_queues();
> > 769             for (i = 0, toff = 0; i < pcount; i++, toff = nextoff) {
>
> Is it ap->a_vp that's NULL, or vp->v_object that's NULL?  vp is
> dereferenced several times before that in the code, so if vp is really
> NULL at line 767, we're probably talking about memory corruption.  But if
> vp->v_object is NULL, then it could be we're not creating a VM object
> along some code path.

Although this panic is 100% reproducible during the initial install
through sysinstall, I have tried hard but cannot reproduce it once the
system is installed and running multiuser, even by performing the same
actions within sysinstall.

I have also tried, without success, to get a crash dump of the panic.
However, after a fair bit of head scratching and a grep of the source
code, it looks like the "dumpdev" loader variable documented in loader(8)
is not yet implemented, and as far as I can tell there is no other way to
get the installer booted from CD to generate a dump.

I'm trying to make a release with extra debugging info, but won't be able
to test this until at least Wednesday or Thursday. What extra debugging
info would be useful? Who would be the best person to discuss this with?

From what kuriyama said, it appears that it is indeed vp->v_object that is
NULL, so I have added the following to spec_vnops.c just before the lock
that fails:

	if (vp->v_object == NULL)
		panic("vp->v_object is null in %s, rdev=%s", __func__,
		    devtoname(vp->v_rdev));

Hopefully that will help diagnose the cause a little further, but I'm
really working blind here - this is not an area of the kernel I understand
at all.
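
For anyone who wants to see the shape of that check without building a
kernel, here is a rough user-space mock. It is only a sketch under
assumptions, not the code in the tree: struct vnode and struct vm_object
are cut-down stand-ins, v_devname is a hypothetical field standing in for
devtoname(vp->v_rdev), the device name "example0" is made up, and the
function fills a buffer and returns -1 where the kernel version would
call panic().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; the real kernel structures are much larger. */
struct vm_object {
	int dummy;
};

struct vnode {
	struct vm_object *v_object;	/* NULL if no VM object was created */
	const char *v_devname;		/* stand-in for devtoname(vp->v_rdev) */
};

/*
 * Mimic the check placed just before VM_OBJECT_LOCK(vp->v_object).
 * Returns 0 and leaves buf empty when v_object is set.
 * Returns -1 and writes the would-be panic message when it is NULL.
 */
int
check_v_object(const struct vnode *vp, const char *func, char *buf,
    size_t len)
{
	assert(vp != NULL && buf != NULL && len > 0);

	buf[0] = '\0';
	if (vp->v_object == NULL) {
		/* Same format string as the panic() call in the mail. */
		snprintf(buf, len, "vp->v_object is null in %s, rdev=%s",
		    func, vp->v_devname);
		return (-1);
	}
	return (0);
}
```

Calling check_v_object() on a vnode whose v_object is NULL, with
"spec_getpages" as the function name, leaves the message that would
appear on the console in buf. The point of including the device name is
that it should show which device node was opened without a VM object.
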

If there is any other debugging info I can provide that may be useful, I'm
happy to have a go.

Kuriyama, if you have any spare time before I am able to do it, maybe you
could add the panic() check above to spec_getpages and find out what
message it panics with?

Gavin