From owner-freebsd-current Mon Sep 2 0:30:46 2002
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125])
        by hub.freebsd.org (Postfix) with ESMTP
        id 10A6937B401; Mon, 2 Sep 2002 00:30:43 -0700 (PDT)
Received: from mail.chesapeake.net (chesapeake.net [205.130.220.14])
        by mx1.FreeBSD.org (Postfix) with ESMTP
        id 0882C43E4A; Mon, 2 Sep 2002 00:30:41 -0700 (PDT)
        (envelope-from jroberson@chesapeake.net)
Received: from localhost (jroberson@localhost)
        by mail.chesapeake.net (8.11.6/8.11.6) with ESMTP
        id g81Btl416807; Sun, 1 Sep 2002 07:55:53 -0400 (EDT)
        (envelope-from jroberson@chesapeake.net)
Date: Sun, 1 Sep 2002 07:55:47 -0400 (EDT)
From: Jeff Roberson
To: bde@zeta.org.au
Cc: kris@obsecurity.org, , ,
Subject: Re: [bde@zeta.org.au: Re: Page faults from bento cluster (Re: Problems reading vmcores)]
In-Reply-To: <20020901110330.GL86074@elvis.mu.org>
Message-ID: <20020901074935.N9517-100000@mail.chesapeake.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

> > As near as I can tell the panic is happening in VOP_GETATTR(). It looks
> > to me like it would be possible for the vnode to be recycled between the
> > time when it passes the vp->v_mount test at the top of the loop and the
> > time when vn_lock() succeeds. Shouldn't we bump the vnode reference
> > count by calling vref() at the top of the loop and add the appropriate
> > calls to vrele()?
>
> Rev.1.395 made some changes that I didn't like much here. The
> VOP_GETATTR() is now done unconditionally. This pessimizes vflush()
> and enlarges any race windows. I think WRITECLOSE is only used for
> mount -u from rw to ro, so the pessimization exercises code that was
> rarely used before.
>
> Rev.1.394 called VOP_GETATTR() with the interlock held.
> This was wrong but probably reduced race windows. The window seems to
> have been opened before rev.1.394 by releasing mntvnode_slock before
> acquiring the interlock. RELENG_4 doesn't release mntvnode_slock at
> that point (it holds both locks across the VOP_GETATTR()).
>
> Bruce
>
>

I have patches that fix the locking behavior in vflush() in my current
VFS SMP patch. It's not quite complete, but it has most of struct vnode
locked down. The patch also moves the VOP_GETATTR() call back into the
conditional path, which may fix the behavior here.

Again, the patch touches more than vflush(), but I didn't want to
separate that part out and test it before going to bed. If this fixes
the problem I can commit the relevant part of this patch soon.

http://www.chesapeake.net/~jroberson/VFSsmp.patch

Cheers,
Jeff
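
The race described in the quoted text, and the vref()/vrele() style fix
proposed for it, can be illustrated with a small userland model. This is
only a sketch under stated assumptions: struct vnode_model, hold(),
release(), try_recycle() and scan_one() are hypothetical names invented
for illustration. They are not the kernel's struct vnode, vref(),
vrele() or vflush(), and nothing here is taken from VFSsmp.patch. The
model is single-threaded and simply runs the "recycler" at the point
where the scanner would be sleeping in vn_lock().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a vnode; NOT the kernel's struct vnode. */
struct vnode_model {
    int mount_id;    /* mount the node currently belongs to */
    int usecount;    /* references that keep the node from being recycled */
    int generation;  /* bumped every time the node is recycled */
};

enum scan_result { SCAN_SKIPPED, SCAN_OK, SCAN_STALE };

/* Take a reference (plays the role vref() has in the discussion). */
static void hold(struct vnode_model *vp)
{
    vp->usecount++;
}

/* Drop a reference (plays the role of vrele()). */
static void release(struct vnode_model *vp)
{
    assert(vp->usecount > 0);
    vp->usecount--;
}

/* Models the recycler: it may only reuse a node nobody references. */
static bool try_recycle(struct vnode_model *vp, int new_mount)
{
    if (vp->usecount > 0)
        return false;
    vp->mount_id = new_mount;
    vp->generation++;
    return true;
}

/*
 * One iteration of a vflush-like scan over a single node.
 * With use_hold == false the node is unprotected while the scanner
 * "sleeps"; with use_hold == true a reference is taken first.
 * Assumption: the mount check and the hold are treated as atomic here;
 * in a real kernel they would have to happen under a lock.
 */
static enum scan_result scan_one(struct vnode_model *vp, int mount_id,
                                 bool use_hold)
{
    if (vp->mount_id != mount_id)       /* top-of-loop test */
        return SCAN_SKIPPED;

    int gen = vp->generation;
    if (use_hold)
        hold(vp);

    /* The window: the recycler runs while the scanner waits for the lock. */
    (void)try_recycle(vp, mount_id + 1);

    /* This is where the getattr-like call would touch the node. */
    bool same = (vp->generation == gen && vp->mount_id == mount_id);

    if (use_hold)
        release(vp);
    return same ? SCAN_OK : SCAN_STALE;
}
```

Calling scan_one() on a fresh node with use_hold set to false returns
SCAN_STALE, because the node changes identity inside the window; with
use_hold set to true the recycler is refused and the result is SCAN_OK.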