From owner-freebsd-hackers Tue May 14 17:43:46 2002
Delivered-To: freebsd-hackers@freebsd.org
Received: from ns.aus.com (adsl-64-175-245-157.dsl.sntc01.pacbell.net [64.175.245.157]) by hub.freebsd.org (Postfix) with ESMTP id 2342A37B406 for ; Tue, 14 May 2002 17:43:25 -0700 (PDT)
Received: from localhost (rsharpe@localhost) by ns.aus.com (8.11.6/8.11.6) with ESMTP id g4F1qjF05479; Wed, 15 May 2002 11:22:45 +0930
Date: Wed, 15 May 2002 11:22:45 +0930 (CST)
From: Richard Sharpe
To: Terry Lambert
Cc:
Subject: Re: File locking, closes and performance in a distributed file system
In-Reply-To: <3CE1A9A0.A15C42B9@mindspring.com>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Tue, 14 May 2002, Terry Lambert wrote:

Hmmm, I wasn't very clear ...

What I am proposing is a 'simple' fix that simply changes

    p->p_flag |= P_ADVLOCK;

to

    fp->l_flag |= P_ADVLOCK;

and never resets it, and then in closef:

    if ((fp->l_flag & P_ADVLOCK) && fp->f_type == DTYPE_VNODE) {
        lf.l_whence = SEEK_SET;
        lf.l_start = 0;
        lf.l_len = 0;
        lf.l_type = F_UNLCK;
        vp = (struct vnode *)fp->f_data;
        (void) VOP_ADVLOCK(vp, (caddr_t)p->p_leader, F_UNLCK, &lf, F_POSIX);
    }

This still implements the correct functionality, but we only try to unlock files that have been locked before, or where we are sharing a file struct with another (related) process and one of them has locked the file.

> Richard Sharpe wrote:
> > I might be way off base here, but we have run into what looks like a
> > performance issue with locking and file closes.
>
> [ ... ]
>
> > This seems to mean that once a process locks a file, every close after
> > that will pay the penalty of calling the underlying vnode unlock call.
> > In a distributed file system, with a simple implementation, that could
> > mean an RPC to the lock manager.
>
> Yes. This is pretty much required by the POSIX locking semantics, which
> require that the first close remove all locks. Unfortunately, you can't
> know on a per-process basis that there are no locks remaining on *any*
> vnode for a given process, so the overhead is sticky.
>
> > Now, there seem to be a few ways to mitigate this:
> >
> > 1. Keep (more) state at the vnode layer that allows us to not issue a
> > network-traversing unlock if the file was not locked. This means that
> > any process that has opened the file will have to issue the
> > network-traversing unlock request once the flag is set on the vnode.
> >
> > 2. Place a flag in the struct file structure that keeps the state of
> > any locks on the file. This means that any processes that share the
> > struct (those related by fork) will need to issue unlock requests if
> > one of them locks the file.
> >
> > 3. Change the file descriptor table that hangs off the process
> > structure so that it includes state about whether or not this process
> > has locked the file.
> >
> > It seems that each of these reduces the performance penalty that
> > processes that might be sharing the file, but which have not locked
> > it, would otherwise have to pay.
> >
> > Option 2 looks easy.
> >
> > Are there any comments?
>
> #3 is really unreasonable. It implies non-coalescing. I know that CIFS
> requires this, and so does NFSv4, so it's not an unreasonable thing to
> do eventually (historical behaviour can be maintained by removing all
> locks in the overlap region on an unlock, yielding logical coalescing).
> The number of things that would need to be touched, though, means it's
> probably not worth doing now.
> In reality, for remote FSes, you want to assert the lock locally before
> transiting the network anyway, in case there is a local conflict, in
> which case you avoid propagating the request over the network. For
> union mounts of local and remote FSes, where there is a local lock
> against the local FS by another process that doesn't respect the union
> (a legitimate thing to have happen), it's actually a requirement, since
> the remote system may promote or coalesce locks, which means there is
> no reverse process for a remote success followed by a local failure.
>
> This is basically a twist on #1:
>
> a) Assert the lock locally before asserting it remotely; if the local
>    assertion fails, then you have avoided a network operation which is
>    doomed to failure (the RPC call you are trying to avoid is similar).
>
> b) When unlocking, verify that the lock exists locally before
>    attempting to deassert it remotely. This means there is still the
>    same local overhead as there always was, but at least you avoid the
>    RPC in the case where there are no outstanding locks that will be
>    cleared by the call.
>
> I've actually wanted VOP_ADVLOCK to be veto-based for going on 6 years
> now, to avoid precisely the type of problems you are now facing. If the
> upper-layer code did local assertion on vnodes, and called the
> lower-layer code only in the success cases, then the implementation
> would already be done for you.
>
> -- Terry

--
Regards
-----
Richard Sharpe, rsharpe@ns.aus.com, rsharpe@samba.org, sharpe@ethereal.com

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message