Date: Wed, 03 Oct 2001 12:23:00 +0100 From: Ian Dowse <iedowse@maths.tcd.ie> To: Matt Dillon <dillon@earth.backplane.com> Cc: Peter Wemm <peter@wemm.org>, Yevgeniy Aleynikov <eugenea@infospace.com>, ache@FreeBSD.ORG, mckusick@mckusick.com, Ken Pizzini <kenp@infospace.com>, hackers@FreeBSD.ORG Subject: Re: patch #3 (was Re: bleh. Re: ufs_rename panic) Message-ID: <200110031223.aa26174@salmon.maths.tcd.ie> In-Reply-To: Your message of "Tue, 02 Oct 2001 19:24:18 PDT." <200110030224.f932OIO62159@earth.backplane.com>
next in thread | previous in thread | raw e-mail | index | archive | help
In message <200110030224.f932OIO62159@earth.backplane.com>, Matt Dillon writes: > >:This seems rather large compared to Ian Dowse's version.. Are you sure that >:you're doing this the right way? Adding a whole new locking mechanism >:when the simple VRENAME flag to be enough seems like a bit of overkill.. Matt addresses the problem more completely than my patch does, so the differences in patch size and files touched are to be expected. In particular, the NFS server and unionfs code need to be changed in the same way as the syscalls, and the IN_RENAME flag can be removed from the ufs code, both of which are included in Matt's patch. > Ian's doesn't fix any of the filesystem semantics bugs, it only prevents > the panic from occuring. This is certainly correct, though the IN_RENAME flag in the UFS code currently has a few such semantics bugs where EINVAL can be returned in cases that would succeed if rename() was atomic. When a vnode cannot be renamed/unlinked/rmdir'd because it is being renamed, the operation should be retried until it succeeds, sleeping as necessary. As I understand it, this is mostly dealt with by Matt's patch, but not at all by mine. > If you remove the filesystem semantics fixes from my patch you > essentially get Ian's patch except that I integrated the vnode flag > in namei/lookup whereas Ian handles it manually in the syscall code. The addition of the SOFTLOCKLEAF code is quite a major change, so it would be very useful if you could describe exactly what it does, what its semantics are, and how it fits into the rename problem. My understanding of the problem is that VOP_RENAME is quite unique in that it is the only VOP that must modify entries in two separate directories. To avoid deadlock, it is not possible (very hard anyway) to lock all 4 vnodes (source node, source parent, target node, target parent) before calling VOP_RENAME. Instead, the approach taken is to lock only the target node and its parent, and have the VOP_RENAME implementation jump back and forth between locking the source and locking the target as necessary. Hence VOP_RENAME is the only VOP that must modify a node that is passed in unlocked. Because the source node and parent are not locked, there is the possibility that the source node could be renamed or removed at any time before VOP_RENAME finally gets around to locking it and removing it. Something needs to protect the source node against being renamed/removed between the point that the source node is initially looked up and the point that it is finally locked. Both Matt's SOFTLOCKLEAF and the VRENAME flag are there to provide this protection. It is the fact that this problem is entirely unique to VOP_RENAME that leads me to think that adding the generic SOFTLOCKLEAF code is overkill. The following fragment also suggests that maybe the approach doesn't actually fit in that well: fromnd.ni_cnd.cn_flags &= ~SOFTLOCKLEAF; /* XXX hack */ error = VOP_RENAME(fromnd.ni_dvp, fromnd.ni_vp, &fromnd.ni_cnd, tond.ni_dvp, tond.ni_vp, &tond.ni_cnd); fromnd.ni_cnd.cn_flags |= SOFTLOCKLEAF; NDFREE(&fromnd, NDF_ONLY_PNBUF & NDF_ONLY_SOFTLOCKLEAF); The way that vclearsoftlock() is used to clear a flag in an unlocked vnode is also not ideal. This should probably be protected at least by v_interlock as other flags are. The syscalls that need to be changed (rename, unlink, rmdir) could possibly use vn_* style wrapper functions to reduce the amount of code that must understand the new locking mechanism, although I'm not sure if this is practical for the NFS case. It might also be a good time to remove the WILLRELE from VOP_RENAME, which would simplify some of the surrounding code. Ian To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-hackers" in the body of the message
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi? <200110031223.aa26174>