From owner-freebsd-current Sun Jun 27 15:47:20 1999
To: Matthew Dillon
Cc: current@FreeBSD.ORG, mckusick@mckusick.com
Subject: Re: BUF_LOCK() related panic..
In-reply-to: Your message of "Sun, 27 Jun 1999 13:06:13 MST." <199906272006.NAA15499@apollo.backplane.com>
Date: Mon, 28 Jun 1999 06:47:13 +0800
From: Peter Wemm
Message-Id: <19990627224713.A61D781@overcee.netplex.com.au>

Matthew Dillon wrote:
> :
> :But that doesn't fix the UP problem where cluster_wbuild() tries to
> :recursively re-lock a buf that the current process already owns.  I have a
> :few ideas about that one though, I just don't understand the clustering
> :well enough yet to fix it.
>
> Ok, I just hit this testing the lockmgr changes.
>
> I think the problem is that cluster_wbuild's algorithm was polluted
> a little by Kirk's commit.
>
> Previously it tested for B_BUSY to determine if it could then lock it
> to include in the cluster.

Yep, I was aware of this before, but I didn't rip it out since I didn't
know what Kirk's intentions were.  I'm assuming he's got his experience
with making the BSD/OS vfs reentrant in mind, so I don't want to break
anything that gets closer to that.  A separate test and then lock would
not be reentrant.  (Sure, there are far bigger problems than this, but
every bit helps when we get there.)

> Kirk changed this to actually attempt a lock, and then include it
> if the lock succeeded and not include it if the lock failed.
>
> The problem is that if the buffer was already locked by the same process,
> this change results in a panic instead of a simple failure to obtain the
> lock.
>
> The solution is to re-tool the code to use the original algorithm ( test
> the lock before trying to get it, rather than simply trying to get it ),
> but with the new locks.  I do not have time today to do this but I believe
> I have given sufficient information for Peter, Kirk, or Alan to make the
> fix.

Actually, I think there is another set of missing BUF_KERNPROC() calls.
cluster_callback() frees buffers, so all buffers submitted with it had
better be reassigned.  This is (I think) part of the problem that
cluster_wbuild() is hitting: things were supposed to have been reassigned
but are still hanging onto the current process.

> I believe there are two or three areas in Kirk's patchset where he
> replaced an explicit test with an attempt to actually gain the lock where
> this sort of panic can occur.  I think Kirk was trying to optimize the
> code :-)  Heh heh.  Just goes to show that combining functional
> replacements with optimizations all in one go does not always work.
>
> -Matt
> Matthew Dillon
>
>
> :Cheers,
> :-Peter
> :--
> :Peter Wemm - peter@FreeBSD.org; peter@yahoo-inc.com; peter@netplex.com.au

Cheers,
-Peter
--
Peter Wemm - peter@FreeBSD.org; peter@yahoo-inc.com; peter@netplex.com.au
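
For readers following the thread, here is a minimal user-space sketch of
the two approaches discussed above: attempting the lock directly versus
testing before locking. It is only a toy model. The names toy_buf,
toy_trylock(), toy_islocked() and toy_test_then_lock() are hypothetical
and are not the kernel's BUF_LOCK() interface, and the TOY_PANIC result
merely stands in for the panic on a recursive lock attempt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not the FreeBSD buf lock API. */
enum toy_result { TOY_LOCKED, TOY_BUSY, TOY_PANIC };

struct toy_buf {
    int owner;              /* 0 = unlocked, otherwise id of the owner */
};

/*
 * Non-blocking attempt on a non-recursive lock.  A second attempt by the
 * current owner is reported as TOY_PANIC, standing in for the kernel
 * panic described in the thread.
 */
static enum toy_result
toy_trylock(struct toy_buf *bp, int pid)
{
    assert(bp != NULL && pid != 0);
    if (bp->owner == pid)
        return TOY_PANIC;   /* recursive lock by the same process */
    if (bp->owner != 0)
        return TOY_BUSY;    /* held by someone else: simple failure */
    bp->owner = pid;
    return TOY_LOCKED;
}

/* Explicit test, playing the role of the old B_BUSY check. */
static bool
toy_islocked(const struct toy_buf *bp)
{
    return bp->owner != 0;
}

/*
 * "Original algorithm" with the new lock: test first, and only attempt
 * the lock when the buffer looks free.  A buffer already held by the
 * caller is skipped instead of triggering the panic path.
 */
static enum toy_result
toy_test_then_lock(struct toy_buf *bp, int pid)
{
    if (toy_islocked(bp))
        return TOY_BUSY;
    return toy_trylock(bp, pid);
}
```

In this model, calling toy_trylock() twice with the same pid hits the
TOY_PANIC path, while toy_test_then_lock() reports TOY_BUSY and lets the
caller leave the buffer out of the cluster.  The gap between the test and
the lock attempt is the non-reentrant window mentioned earlier in this
message; the sketch does not try to close it.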