Date: Wed, 20 Aug 2003 17:31:39 -0400 (EDT)
From: Robert Watson
To: Gavin Atkinson
cc: kuriyama@imgsrc.co.jp
cc: current@freebsd.org
In-Reply-To: <20030820164153.S21216-100000@ury.york.ac.uk>
Subject: Re: sysinstall spec_getpages panic (with VM overtones)

On Wed, 20 Aug 2003, Gavin Atkinson wrote:

> On the 8th August kuriyama@imgsrc.co.jp mentioned he was getting a panic
> with FreeBSD inside VMware where _mtx_lock is being called with a NULL
> mutex from spec_getpages. I'm also seeing this, 100% reproducible, on
> real hardware.
> (see message ID XFMail.20030808154731.jhb@FreeBSD.org for
> the original poster's email and jhb's reply) For me, Sysinstall panics
> during the extraction of the base package:
>
> (note that I do not get to see a register dump)
> kernel: type 12 trap, code=0
>
> _mtx_lock_flags(0,0,c0529513,300,ffffffff) at _mtx_lock_flags+0x43
> spec_getpages(cce33b3c,54,0,cce33b2c,0) at spec_getpages+0x26c
> ffs_getpages(cce33b80,0,c05459de,274,c05c63e0) at ffs_getpages+0x5f6
> vnode_pager_getpages(c0bebafc,cce33c70,1,0,cce33c20) at vnode_pager_getpages+0x73
> vm_fault(c1259900,819b000,1,0,c12534c0) at vm_fault+0x8e2
> trap_pfault(cce33d48,1,819b004,200,819b004) at trap_pfault+0x109
> trap(2f,2f,2f,82e533c,0) at trap+0x1fc
> calltrap() at calltrap+0x5

I've been getting similar reports locally from our trustedbsd_sebsd branch.
We originally thought it was a local merge problem we introduced due to
some inconsistent merging of specfs changes, but I think we have now
eliminated that. I suppose I'm relieved... (?)

> I first noticed this with the 20030811 JPSNAP, but have tried with the
> 9th July 2003 JPSNAP, and yesterday's snapshot, and see the same result
> on both. I see the same panic whether installing over the net or from
> CD. With 64 meg of ram, it panics halfway through reading the chunks
> that make up the base package; upping the ram to 256 allows it to read
> all of the chunks before panicking.

Sounds identical.

> *c0529513 = "/usr/src/sys/fs/specfs/spec_vnops.c", line 0x300 is line 768:
>
> 766 gotreqpage = 0;
> 767 VM_OBJECT_LOCK(vp->v_object);
> 768 vm_page_lock_queues();
> 769 for (i = 0, toff = 0; i < pcount; i++, toff = nextoff) {
>
> so ap->a_vp is null. I'm afraid that's the limit of my ddb ability.
>
> Any suggestions as to where I should go from here? I don't really have
> the facility at the moment to make release to test patches but will try
> to if necessary.

Is it ap->a_vp that's NULL, or vp->v_object that's NULL?
vp is dereferenced several times before that in the code, so if vp is
really NULL at line 767, we're probably talking about memory corruption.
But if vp->v_object is NULL, then it could be that we're not creating a VM
object along some code path.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Network Associates Laboratories