Date: Tue, 27 Aug 2013 10:59:27 +0200 From: Davide Italiano <davide@freebsd.org> To: Andriy Gapon <avg@freebsd.org> Cc: src-committers@freebsd.org, Xin LI <d@delphij.net>, svn-src-all@freebsd.org, Xin LI <delphij@freebsd.org>, svn-src-head@freebsd.org, Konstantin Belousov <kib@freebsd.org>, Xin Li <delphij@delphij.net> Subject: Re: svn commit: r254585 - head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs Message-ID: <CACYV=-G0i3%2BU3ubWnPSGQpEvKsuJShYJZOMvKQnjBA1aLftJsA@mail.gmail.com> In-Reply-To: <521C5CAC.2060400@FreeBSD.org> References: <201308202231.r7KMVERi068300@svn.freebsd.org> <20130825221517.GM24767@caravan.chchile.org> <521B75CE.70103@FreeBSD.org> <521BDEAC.9080909@delphij.net> <521C5CAC.2060400@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, Aug 27, 2013 at 10:00 AM, Andriy Gapon <avg@freebsd.org> wrote: > on 27/08/2013 02:03 Xin Li said the following: >> On 08/26/13 08:35, Andriy Gapon wrote: >>> on 26/08/2013 01:15 Jeremie Le Hen said the following: >>>> Hi Xin, >>>> >>>> On Tue, Aug 20, 2013 at 10:31:14PM +0000, Xin LI wrote: > [snip] >>>>> @@ zfs_rename(vnode_t *sdvp, char *snm, vno if >>>>> (VOP_REALVP(tdvp, &realvp, ct) == 0) tdvp = realvp; >>>>> >>>>> - if (tdvp->v_vfsp != sdvp->v_vfsp || zfsctl_is_node(tdvp)) { + >>>>> tdzp = VTOZ(tdvp); >> >>> The problem with this change, at least on FreeBSD, is that tdvp may >>> not belong to ZFS. In that case VTOZ(tdvp) does not make any sense >>> and would produce garbage or trigger an assert. Previously >>> tdvp->v_vfsp != sdvp->v_vfsp check would catch that situations. Two >>> side notes: - v_vfsp is actually v_mount on FreeBSD >> >> Ah that's good point. Any objection in putting a change to their >> _freebsd_ counterpart (zfs_freebsd_rename and zfs_freebsd_link) as >> attached? > > I think that at least the change to zfs_freebsd_rename as it is now would break > locking and reference counting semantics for the vnodes involved -- vreles and > vputs have to be done even in the error case. > > Also, look at the check at the start of ufs_rename, it also considers tvp when > it's not NULL. I am not sure if it is possible to have a situation where fvp > and tdvp are from the same fs, but tvp is from a different one. > > I've been also tempted to place the check in the common code in kern_renameat. > That should cover the system calls. But could be insufficient for direct calls > to VOP_RENAME in other places. > > diff --git a/sys/kern/vfs_syscalls.c b/sys/kern/vfs_syscalls.c > index a7cb87a..cfa4d93 100644 > --- a/sys/kern/vfs_syscalls.c > +++ b/sys/kern/vfs_syscalls.c > @@ -3608,6 +3608,14 @@ kern_renameat(struct thread *td, int oldfd, char *old, > int newfd, char *new, > error = EINVAL; > goto out; > } > + > + /* Check for cross-device rename. */ > + if ((fvp->v_mount != tdvp->v_mount) || > + (tvp && (fvp->v_mount != tvp->v_mount))) { > + error = EXDEV; > + goto out; > + } > + > /* > * If the source is the same as the destination (that is, if they > * are links to the same vnode), then there is nothing to do. > >>> - VOP_REALVP is a glorified nop on FreeBSD >> >> It's not clear to me what was the intention for having a macro instead >> of just ifdef'ing the code out -- maybe to prevent unwanted conflict >> with upstream? These two VOP's are the only consumers of VOP_REALVP >> in tree. > > Yes. Personally I would just ifdef out the calls. > >>> Another unrelated problem that existed before this change and has >>> been noted by Davide Italiano is that sdvp is not locked and so it >>> can potentially be recycled before ZFS_ENTER() enter and for that >>> reason sdzp and zfsvfs could be invalid. Because sdvp is >>> referenced, this problem can currently occur only if a forced >>> unmount runs concurrently to zfs_rename. tdvp should be locked and >>> thus could be used instead of sdvp as a source for zfsvfs. >> >> I think this would need more work, I'll take a look after we have this >> regression fixed. > > Thank you. > > -- > Andriy Gapon I've written a patch that attempts to fix the second problem. http://people.freebsd.org/~davide/review/zfsrename_recycle.diff The culprit is that we're dereferencing sdvp without the vnode lock held, so data-stability is not guaranteed. tdvp, instead, is locked at the entry point of VOP_RENAME() (at least according to vop_rename_pre()) so it can be safely used. As pointed out by Andriy ZFS has some different mechanisms wrt other file systems that allow the vnode to not be recycled when it's referenced (ZFS_ENTER), so once we get zfsvfs_t * from tdzp we can call ZFS_ENTER and dereference sdvp in a safe manner. Thanks, -- Davide "There are no solved problems; there are only problems that are more or less solved" -- Henri Poincare
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CACYV=-G0i3%2BU3ubWnPSGQpEvKsuJShYJZOMvKQnjBA1aLftJsA>