From owner-freebsd-hackers Tue May 14 17:43:46 2002
Delivered-To: freebsd-hackers@freebsd.org
Received: from ns.aus.com (adsl-64-175-245-157.dsl.sntc01.pacbell.net [64.175.245.157]) by hub.freebsd.org (Postfix) with ESMTP id 2342A37B406 for ; Tue, 14 May 2002 17:43:25 -0700 (PDT)
Received: from localhost (rsharpe@localhost) by ns.aus.com (8.11.6/8.11.6) with ESMTP id g4F1qjF05479; Wed, 15 May 2002 11:22:45 +0930
Date: Wed, 15 May 2002 11:22:45 +0930 (CST)
From: Richard Sharpe
To: Terry Lambert
Cc:
Subject: Re: File locking, closes and performance in a distributed file system
In-Reply-To: <3CE1A9A0.A15C42B9@mindspring.com>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Tue, 14 May 2002, Terry Lambert wrote:

Hmmm, I wasn't very clear ...

What I am proposing is a 'simple' fix that simply changes

    p->p_flag |= P_ADVLOCK;

to

    fp->l_flag |= P_ADVLOCK;

and never resets it, and then in closef:

    if ((fp->l_flag & P_ADVLOCK) && fp->f_type == DTYPE_VNODE) {
        lf.l_whence = SEEK_SET;
        lf.l_start = 0;
        lf.l_len = 0;
        lf.l_type = F_UNLCK;
        vp = (struct vnode *)fp->f_data;
        (void) VOP_ADVLOCK(vp, (caddr_t)p->p_leader, F_UNLCK, &lf, F_POSIX);
    }

This still implements the correct functionality, but we only try to unlock files that have been locked before, or where we are sharing a file struct with another (related) process and one of them has locked the file.

> Richard Sharpe wrote:
> > I might be way off base here, but we have run into what looks like a
> > performance issue with locking and file closes.
>
> [ ... ]
>
> > This seems to mean that once a process locks a file, every close after
> > that will pay the penalty of calling the underlying vnode unlock call.
> > In a distributed file system, with a simple implementation, that could
> > mean an RPC to the lock manager.
>
> Yes. This is pretty much required by the POSIX locking semantics, which
> require that the first close remove all locks. Unfortunately, you can't
> know on a per-process basis that there are no locks remaining on *any*
> vnode for a given process, so the overhead is sticky.
>
> > Now, there seem to be a few ways to mitigate this:
> >
> > 1. Keep (more) state at the vnode layer that allows us to not issue a
> > network-traversing unlock if the file was not locked. This means that
> > any process that has opened the file will have to issue the
> > network-traversing unlock request once the flag is set on the vnode.
> >
> > 2. Place a flag in the struct file structure that keeps the state of
> > any locks on the file. This means that any processes that share the
> > struct (those related by fork) will need to issue unlock requests if
> > one of them locks the file.
> >
> > 3. Change the file descriptor table that hangs off the process
> > structure so that it includes state about whether or not this process
> > has locked the file.
> >
> > It seems that each of these reduces the performance penalty that
> > processes that might be sharing the file, but which have not locked
> > it, would otherwise have to pay.
> >
> > Option 2 looks easy.
> >
> > Are there any comments?
>
> #3 is really unreasonable. It implies non-coalescing. I know that CIFS
> requires this, and so does NFSv4, so it's not an unreasonable thing to
> do eventually (historical behaviour can be maintained by removing all
> locks in the overlap region on an unlock, yielding logical coalescing).
> The number of things that would need to be touched, though, means it's
> probably not worth doing now.
> In reality, for remote FSes, you want to assert the lock locally before
> transiting the network anyway, in case there is a local conflict, in
> which case you avoid propagating the request over the network. For
> union mounts of local and remote FSes, where there is a local lock
> against the local FS by another process that doesn't respect the union
> (a legitimate thing to have happen), it's actually a requirement, since
> the remote system may promote or coalesce locks, which means there is
> no reverse process for a remote success followed by a local failure.
>
> This is basically a twist on #1:
>
> a) Assert the lock locally before asserting it remotely; if the local
>    assertion fails, then you have avoided a network operation which is
>    doomed to failure (the RPC call you are trying to avoid is similar).
>
> b) When unlocking, verify that the lock exists locally before
>    attempting to deassert it remotely. This means there is still the
>    same local overhead as there always was, but at least you avoid the
>    RPC in the case where there are no outstanding locks that will be
>    cleared by the call.
>
> I've actually wanted VOP_ADVLOCK to be veto-based for going on 6 years
> now, to avoid precisely the type of problems you are now facing. If the
> upper-layer code did local assertion on vnodes, and called the
> lower-layer code only in the success cases, then the implementation
> would already be done for you.
>
> -- Terry

--
Regards
-----
Richard Sharpe, rsharpe@ns.aus.com, rsharpe@samba.org, sharpe@ethereal.com

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message