Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 03 Oct 2001 12:23:00 +0100
From:      Ian Dowse <iedowse@maths.tcd.ie>
To:        Matt Dillon <dillon@earth.backplane.com>
Cc:        Peter Wemm <peter@wemm.org>, Yevgeniy Aleynikov <eugenea@infospace.com>, ache@FreeBSD.ORG, mckusick@mckusick.com, Ken Pizzini <kenp@infospace.com>, hackers@FreeBSD.ORG
Subject:   Re: patch #3 (was Re: bleh. Re: ufs_rename panic) 
Message-ID:   <200110031223.aa26174@salmon.maths.tcd.ie>
In-Reply-To: Your message of "Tue, 02 Oct 2001 19:24:18 PDT." <200110030224.f932OIO62159@earth.backplane.com> 

next in thread | previous in thread | raw e-mail | index | archive | help
In message <200110030224.f932OIO62159@earth.backplane.com>, Matt Dillon writes:
>
>:This seems rather large compared to Ian Dowse's version..  Are you sure that
>:you're doing this the right way?  Adding a whole new locking mechanism
>:when the simple VRENAME flag to be enough seems like a bit of overkill..

Matt addresses the problem more completely than my patch does, so
the differences in patch size and files touched are to be expected.
In particular, the NFS server and unionfs code need to be changed
in the same way as the syscalls, and the IN_RENAME flag can be
removed from the ufs code, both of which are included in Matt's
patch.

>    Ian's doesn't fix any of the filesystem semantics bugs, it only prevents
>    the panic from occuring.

This is certainly correct, though the IN_RENAME flag in the UFS
code currently has a few such semantics bugs where EINVAL can be
returned in cases that would succeed if rename() was atomic. When
a vnode cannot be renamed/unlinked/rmdir'd because it is being
renamed, the operation should be retried until it succeeds, sleeping
as necessary. As I understand it, this is mostly dealt with by
Matt's patch, but not at all by mine.

>    If you remove the filesystem semantics fixes from my patch you 
>    essentially get Ian's patch except that I integrated the vnode flag
>    in namei/lookup whereas Ian handles it manually in the syscall code.

The addition of the SOFTLOCKLEAF code is quite a major change, so
it would be very useful if you could describe exactly what it does,
what its semantics are, and how it fits into the rename problem.

My understanding of the problem is that VOP_RENAME is quite unique
in that it is the only VOP that must modify entries in two separate
directories. To avoid deadlock, it is not possible (very hard anyway)
to lock all 4 vnodes (source node, source parent, target node,
target parent) before calling VOP_RENAME. Instead, the approach
taken is to lock only the target node and its parent, and have the
VOP_RENAME implementation jump back and forth between locking the
source and locking the target as necessary. Hence VOP_RENAME is
the only VOP that must modify a node that is passed in unlocked.

Because the source node and parent are not locked, there is the
possibility that the source node could be renamed or removed at
any time before VOP_RENAME finally gets around to locking it and
removing it. Something needs to protect the source node against
being renamed/removed between the point that the source node is
initially looked up and the point that it is finally locked. Both
Matt's SOFTLOCKLEAF and the VRENAME flag are there to provide this
protection.

It is the fact that this problem is entirely unique to VOP_RENAME
that leads me to think that adding the generic SOFTLOCKLEAF code
is overkill. The following fragment also suggests that maybe the
approach doesn't actually fit in that well:

	fromnd.ni_cnd.cn_flags &= ~SOFTLOCKLEAF;	/* XXX hack */
	error = VOP_RENAME(fromnd.ni_dvp, fromnd.ni_vp, &fromnd.ni_cnd,
	    tond.ni_dvp, tond.ni_vp, &tond.ni_cnd);
	fromnd.ni_cnd.cn_flags |= SOFTLOCKLEAF;
	NDFREE(&fromnd, NDF_ONLY_PNBUF & NDF_ONLY_SOFTLOCKLEAF);

The way that vclearsoftlock() is used to clear a flag in an unlocked
vnode is also not ideal. This should probably be protected at least
by v_interlock as other flags are.


The syscalls that need to be changed (rename, unlink, rmdir) could
possibly use vn_* style wrapper functions to reduce the amount of
code that must understand the new locking mechanism, although I'm
not sure if this is practical for the NFS case. It might also be
a good time to remove the WILLRELE from VOP_RENAME, which would
simplify some of the surrounding code.

Ian

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi? <200110031223.aa26174>