From owner-freebsd-current  Sun Jun 27 13: 6:24 1999
Delivered-To: freebsd-current@freebsd.org
Received: from apollo.backplane.com (apollo.backplane.com [209.157.86.2])
	by hub.freebsd.org (Postfix) with ESMTP id 8D4DA14DAA
	for <current@FreeBSD.ORG>; Sun, 27 Jun 1999 13:06:21 -0700 (PDT)
	(envelope-from dillon@apollo.backplane.com)
Received: (from dillon@localhost)
	by apollo.backplane.com (8.9.3/8.9.1) id NAA15499;
	Sun, 27 Jun 1999 13:06:13 -0700 (PDT)
	(envelope-from dillon)
Date: Sun, 27 Jun 1999 13:06:13 -0700 (PDT)
From: Matthew Dillon <dillon@apollo.backplane.com>
Message-Id: <199906272006.NAA15499@apollo.backplane.com>
To: Peter Wemm <peter@netplex.com.au>
Cc: current@FreeBSD.ORG, mckusick@mckusick.com
Subject: Re: BUF_LOCK() related panic.. 
References:  <19990627084414.24D4D81@overcee.netplex.com.au>
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

:
:But that doesn't fix the UP problem where cluster_wbuild() tries to
:recursively re-lock a buf that the current process already owns.  I have a
:few ideas about that one though, I just don't understand the clustering
:well enough yet to fix it.

    Ok, I just hit this testing the lockmgr changes.

    I think the problem is that cluster_wbuild's algorithm was polluted
    a little by Kirk's commit.

    Previously it tested B_BUSY to determine whether it could lock the
    buffer and include it in the cluster.

    Kirk changed this to actually attempt a lock, and then include it
    if the lock succeeded and not include it if the lock failed.

    The problem is that if the buffer was already locked by the same process,
    this change results in a panic instead of a simple failure to obtain the
    lock.

    The solution is to re-tool the code to use the original algorithm (test
    the lock before trying to get it, rather than simply trying to get it),
    but with the new locks.  I do not have time today to do this but I believe
    I have given sufficient information for Peter, Kirk, or Alan to make the
    fix.
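
    To make the difference between the two approaches concrete, here is a
    rough userland model, not kernel code.  Every name in it (model_buf,
    model_trylock, model_islocked, include_by_test, include_by_trylock,
    model_panicked) is made up for illustration, and the "panic" is only a
    flag.  It assumes a non-blocking lock attempt that treats re-locking
    by the current owner as fatal, which is the behavior described above.

```c
#include <assert.h>
#include <stddef.h>

/* Userland model only; none of these names exist in the kernel. */
#define NO_OWNER (-1)

struct model_buf {
    int owner;                  /* pid holding the lock, or NO_OWNER */
};

static int model_panicked;      /* stands in for a kernel panic() */

/* Non-blocking lock attempt that treats recursion as fatal. */
static int
model_trylock(struct model_buf *bp, int pid)
{
    assert(bp != NULL);
    if (bp->owner == pid) {
        model_panicked = 1;     /* re-lock by the owner: "panic" */
        return (-1);
    }
    if (bp->owner != NO_OWNER)
        return (-1);            /* held by someone else: plain failure */
    bp->owner = pid;
    return (0);
}

/* Pure test; never takes the lock. */
static int
model_islocked(const struct model_buf *bp)
{
    return (bp->owner != NO_OWNER);
}

/* New style: just try the lock.  Returns 1 if the buf joins the cluster. */
static int
include_by_trylock(struct model_buf *bp, int pid)
{
    return (model_trylock(bp, pid) == 0);
}

/* Original style: test first, lock only if the buf looked free. */
static int
include_by_test(struct model_buf *bp, int pid)
{
    if (model_islocked(bp))
        return (0);             /* skip it, whoever holds it */
    return (model_trylock(bp, pid) == 0);
}
```

    With a buf already owned by the calling pid, include_by_test() just
    declines to include it, while include_by_trylock() reaches the
    recursion check and sets model_panicked.  With an unlocked buf both
    paths take the lock and include it.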

    I believe there are two or three places in Kirk's patchset where he
    replaced an explicit test with an attempt to actually gain the lock, and
    where this sort of panic can occur.  I think Kirk was trying to optimize
    the code :-)  Heh heh.  Just goes to show that combining functional
    replacements with optimizations all in one go does not always work.

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>

:Cheers,
:-Peter
:--
:Peter Wemm - peter@FreeBSD.org; peter@yahoo-inc.com; peter@netplex.com.au

