Date: Sun, 24 Aug 2003 21:04:08 -0400 (EDT)
From: Robert Watson <rwatson@freebsd.org>
To: Gavin Atkinson <gavin@ury.york.ac.uk>
Cc: current@freebsd.org
Subject: Re: sysinstall spec_getpages panic (with VM overtones)
Message-ID: <Pine.NEB.3.96L.1030824210055.96248A-100000@fledge.watson.org>
In-Reply-To: <20030825011106.L23215-100000@ury.york.ac.uk>

On Mon, 25 Aug 2003, Gavin Atkinson wrote:

> On Wed, 20 Aug 2003, Robert Watson wrote:
> > On Wed, 20 Aug 2003, Gavin Atkinson wrote:
> > > _mtx_lock_flags(0,0,c0529513,300,ffffffff) at _mtx_lock_flags+0x43
> > > spec_getpages(cce33b3c,54,0,cce33b2c,0) at spec_getpages+0x26c
> > > ffs_getpages(cce33b80,0,c05459de,274,c05c63e0) at ffs_getpages+0x5f6
> > > vnode_pager_getpages(c0bebafc,cce33c70,1,0,cce33c20) at vnode_pager_getpages+0x73
> > > vm_fault(c1259900,819b000,1,0,c12534c0) at vm_fault+0x8e2
> > > trap_pfault(cce33d48,1,819b004,200,819b004) at trap_pfault+0x109
> > > trap(2f,2f,2f,82e533c,0) at trap+0x1fc
> > > calltrap() at calltrap+0x5
> > >
> > > *c0529513 = "/usr/src/sys/fs/specfs/spec_vnops.c", line 0x300 is line 768:
> > >
> > > 766         gotreqpage = 0;
> > > 767         VM_OBJECT_LOCK(vp->v_object);
> > > 768         vm_page_lock_queues();
> > > 769         for (i = 0, toff = 0; i < pcount; i++, toff = nextoff) {
> >
> > Is it ap->a_vp that's NULL, or vp->v_object that's NULL?  vp is
> > dereferenced several times before that in the code, so if vp is really
> > NULL at line 767, we're probably talking about memory corruption.  But if
> > vp->v_object is NULL, then it could be we're not creating a VM object
> > along some code path.
>
> Although this panic is 100% reproducible during the initial install
> through sysinstall, I have tried hard but can not reproduce this once
> the system is installed and running multiuser, even by performing the
> same actions within sysinstall. I have also tried without success
> to get a crash dump of the panic, however after a fair bit of head
> scratching it looks from a grep of the source code like the "dumpdev"
> loader variable documented in loader(8) is not yet implemented... and as
> far as I can tell there is no other way I can get the installer off CD
> to generate a dump.
>
> I'm trying to make a release with extra debugging info, but won't be
> able to test this until at least Wednesday or Thursday. What extra
> debugging info would be useful? Who would be the best person to discuss
> this with? From what kuriyama said, it appears that it is indeed
> vp->v_object that is null, so I have added the following to
> specfs_vnops.c just before the lock that fails:
>
>         if (vp->v_object == NULL)
>                 panic("vp->v_object is null in %s, rdev=%s", __func__,
>                     devtoname(vp->v_rdev));
>
> Hopefully that will help diagnose the cause a little further, but I'm
> really working blind here - this is not an area of the kernel I
> understand at all. If there is any other debugging info I can provide
> that may be useful, I'm happy to have a go. Kuriyama, if you have any
> spare time before I am able to do it, maybe you could add the above code
> and find out what message it panics with?

Alan Cox just made a commit a couple of days ago that seems to resolve the
problem for us.  Here's the commit message so you can give it a try.

alc         2003/08/22 10:50:32 PDT

  FreeBSD src repository

  Modified files:
    sys/fs/specfs        spec_vnops.c
  Log:
  Use the requested page's object field instead of the vnode's.  In some
  cases, the vnode's object field is not initialized leading to a NULL
  pointer dereference when the object is locked.

  Tested by:      rwatson

  Revision  Changes    Path
  1.208     +5 -2      src/sys/fs/specfs/spec_vnops.c
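
For readers unfamiliar with the code, the idea in the commit log can be sketched in user-space C. This is only an illustration under stated assumptions: the struct layouts and the two helper functions (lock_via_vnode, lock_via_page) are hypothetical, simplified stand-ins and are not the kernel's definitions or the actual diff in revision 1.208.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel types. */
struct vm_object { int locked; };
struct vm_page   { struct vm_object *object; };
struct vnode     { struct vm_object *v_object; };

/*
 * Pattern before the fix: reach the object through the vnode.
 * Returns -1 in the case where the kernel would have dereferenced NULL.
 */
static int
lock_via_vnode(struct vnode *vp)
{
	if (vp->v_object == NULL)
		return (-1);
	vp->v_object->locked = 1;
	return (0);
}

/*
 * Pattern described in the commit log: reach the object through the
 * requested page, which is assumed here to always belong to an object.
 */
static int
lock_via_page(struct vm_page *m)
{
	assert(m != NULL && m->object != NULL);
	m->object->locked = 1;
	return (0);
}
```

With a vnode whose v_object was never initialized, the first helper reports the failure, while the second still finds the object through the page it was asked to fill.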