From: Don Lewis
To: shoesoft@gmx.net
Cc: current@FreeBSD.org
Date: Sun, 30 Nov 2003 18:35:44 -0800 (PST)
Subject: Re: 5.2-BETA panic: page fault
Message-Id: <200312010235.hB12ZieF025215@gw.catspoiler.org>
In-Reply-To: <1070238961.5827.17.camel@shoeserv.freebsd>
List-Id: Discussions about the use of FreeBSD-current

On  1 Dec, Stefan Ehmann wrote:
> On Mon, 2003-12-01 at 01:10, Don Lewis wrote:
>> Can you reproduce this problem without bktr?
>>
>
>> You are getting a double panic, with the second happening during the
>> file system sync.  The code seems to be tripping over the same mount
>> list entry each time.  Maybe the mount list is getting corrupted.  Are
>> you using amd?  Print *lkp in the lockmgr() stack frame.
>>
>>
>> You might want to add
>> 	KASSERT(mp->mnt_lock.lk_interlock != NULL,
>> 	    ("vfs_busy: NULL mount pointer interlock"));
>> at the top of vfs_busy() and right before the lockmgr() call.
>
> No, I'm not using amd.
>
> (kgdb) print *lkp
> $1 = {lk_interlock = 0x0, lk_flags = 0, lk_sharecount = 0, lk_waitcount = 0,
>   lk_exclusivecount = 0, lk_prio = 0, lk_wmesg = 0x0, lk_timo = 0,
>   lk_lockholder = 0x0, lk_newlock = 0x0}
>
> This is indeed just NULLs.

Not good.  Nothing should be writing to lk_interlock once it has been
initialized.  Either something is stomping on an active struct mount,
we're still using it after it has been put on the free list, or
dp->v_mountedhere is pointing somewhere bogus.  I don't suspect the
latter because the second panic() in the sync() code doesn't follow this
path to get to the struct mount.

> I haven't tried without bktr yet but I hope I'll have time for that (and
> the KASSERT) tomorrow.
>
> The panic only seems to happen when accessing my read-only mounted ext2
> partition. Today I tried not to access any data there and uptime is
> 14h30min now. The panic always happened after a few hours. So this is
> probably the core of the problem.

That sounds like a possibility.  I might be able to try that here when I
have some idle time on my -CURRENT box.

Can you print *dp->v_mountedhere in the lookup() frame?  That should
show the mount point information and might show if anything else in
struct mount is damaged.
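
For illustration only, here is a minimal userland sketch of the condition the
suggested KASSERT is meant to catch.  It is not kernel code.  The field names
are copied from the kgdb output above, but the types are simplified guesses,
and struct fake_lock, struct fake_mount, and both helper functions are
hypothetical names made up for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the kernel lock structure.  Field names follow
 * the kgdb dump; the types are simplified assumptions, not the real layout.
 */
struct fake_lock {
	void		*lk_interlock;
	unsigned int	 lk_flags;
	int		 lk_sharecount;
	int		 lk_waitcount;
	int		 lk_exclusivecount;
	int		 lk_prio;
	const char	*lk_wmesg;
	int		 lk_timo;
	void		*lk_lockholder;
	void		*lk_newlock;
};

/* Hypothetical stand-in for struct mount; only the lock is modelled. */
struct fake_mount {
	struct fake_lock mnt_lock;
};

/*
 * The same predicate the proposed KASSERT tests: an initialized mount lock
 * should never have a NULL interlock pointer.
 */
bool
mount_lock_looks_sane(const struct fake_mount *mp)
{
	return (mp != NULL && mp->mnt_lock.lk_interlock != NULL);
}

/*
 * True when every field is zero, as in the kgdb dump.  That pattern is what
 * one would expect from memory that was zeroed or never initialized, rather
 * than from a single stray write.
 */
bool
lock_is_all_zero(const struct fake_lock *lkp)
{
	return (lkp->lk_interlock == NULL && lkp->lk_flags == 0 &&
	    lkp->lk_sharecount == 0 && lkp->lk_waitcount == 0 &&
	    lkp->lk_exclusivecount == 0 && lkp->lk_prio == 0 &&
	    lkp->lk_wmesg == NULL && lkp->lk_timo == 0 &&
	    lkp->lk_lockholder == NULL && lkp->lk_newlock == NULL);
}
```

A zero-filled struct fake_mount fails the first check and passes the second,
which matches the dump.  If only lk_interlock were NULL while the other
fields held plausible values, that would point more toward a stray write
than toward a freed or zeroed struct mount.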