Date: Wed, 6 Jul 2005 16:09:51 -0400 From: Suleiman Souhlal <refugee@astraldream.net> To: Bruce Evans <bde@zeta.org.au> Cc: freebsd-fs@freebsd.org, ken@freebsd.org, Stephan Uphoff <ups@tree.com> Subject: Re: [saturnero@freesbie.org: Weird behaviour of mount_unionfs with executables] Message-ID: <D188AED6-81E9-4EAD-8EB5-991489EB99BD@astraldream.net> In-Reply-To: <20050705030015.N790@epsplex.bde.org> References: <20050703181616.GC89744@cvs.freesbie.org> <42C83643.4010506@samsco.org> <20050703201621.GD89744@cvs.freesbie.org> <1120425831.77984.37993.camel@palm> <42C87CAE.7080802@samsco.org> <1120436351.77984.38195.camel@palm> <42C88121.8010602@samsco.org> <1120437780.77984.38252.camel@palm> <20050704095431.GA31030@cvs.freesbie.org> <20050705030015.N790@epsplex.bde.org>
next in thread | previous in thread | raw e-mail | index | archive | help
Hi, On Jul 4, 2005, at 2:52 PM, Bruce Evans wrote: > On Mon, 4 Jul 2005, Dario Freni wrote: > > >> On Sun, Jul 03, 2005 at 08:43:01PM -0400, Stephan Uphoff wrote: >> >>> YES - and it looks like just specifying noatime for the union mount >>> should fix the copy problem for FreeSBIE. >>> >> >> Correct. Using -o noatime when mounting the unionfs resolves all >> above >> problems. Thank you very much. >> > > Hmm. This shouldn't work, but helps avoid the copy due to a layering > violation. In kern_execve(), the test of MNT_NOATIME is supposed to > be just an optimization to avoid making null VOP_SETATTR(), but it > only looks at the mount flag for the top layer so it doesn't work > right if the mount flag is different in lower layers. This gives the > bizarre behaviour that the noatime option for unionsfs "works" for > exec() but not for read(). > > I could only see the duplication problem for mount_unionfs -b. There > are lots of bugs in that case: > = no copy is made for the modification caused by read(), despite > mount_unionfs(8) claiming that: > > Any other operation which would ultimately require > modification to the > ^^^ > lower layer fails with EROFS. > > - the error for failure is often not EROFS (but this may be a side > effect). > - exec() makes a bad copy. It fails to copy the access times and > doesn't > even copy the permissions; in particular, it loses the x bits so the > file becomes non-executable. > - touch(1) makes a bad copy like exec() and then fails to actually > touch > the file except for the clobbered times in the non-copy being > current. > (touch tries utimes(2) twice, first with the current times and > then with > a null times pointer. Both calls fail with EACCES. I think a > bad copy > is made on the first call ...). Subsequent touch(1)'s work normally > (... touch tries utimes(2) only once, only with the current > times, and > since this works there seems to be a race for the earlier calls > to fail > (I tested as root, so there should be no permissions problems > with either > call, and since the middle utimes(path, NULL) call failed the bug > can't > be just a stale vp after making the copy in the first call).) > > The comment in union_setattr() only claims to make a copy to handle > truncation, but a copy seems to be made when any attribute is set: > > % static int > % union_setattr(ap) > % ... > % /* > % * Handle case of truncating lower object to zero size > % * by creating a zero length upper object. This is to > % * handle the case of open with O_TRUNC and O_CREAT. > % */ > % if (un->un_uppervp == NULLVP && (un->un_lowervp->v_type == > VREG)) { > % error = union_copyup(un, (ap->a_vap->va_size != 0), > % ap->a_cred, ap->a_td); > % if (error) > % return (error); > % } > > Copying on any change to an attribute seems right, but it doesn't > actually handle implicit changes that don't go through setattr -- > mainly setting of atimes on read(). Copying on read() wouldn't be > right. > > % % /* > % * Try to set attributes in upper layer, > % * otherwise return read-only filesystem error. > % */ > % error = EROFS; > % if ((uppervp = union_lock_upper(un, td)) != NULLVP) { > % error = VOP_SETATTR(un->un_uppervp, ap->a_vap, > % ap->a_cred, ap->a_td); > % if ((error == 0) && (ap->a_vap->va_size != VNOVAL)) > % union_newsize(ap->a_vp, ap->a_vap->va_size, VNOVAL); > % union_unlock_upper(uppervp, td); > % } > % return (error); > % } > > The second clause in the comment doesn't match the code here -- the > error > is very rarely EROFS since it is usually the error returned by > VOP_SETATTR() > and we have avoided getting this far in most cases where VOP_SETATTR() > would return EROFS. The comment was closer to matching the code in > rev.1.1 > where decideding EROFS was left to lower layers. > > Bruce The vnode locking is also wrong for several cases: - According sys/kern/vnode_if.src we should return from VOP_OPEN() with vp locked, but union_open() unlocks it before returning it. - Same thing in union_access(). - We are not making sure we are calling VOP_CLOSE() with the vnode exclusively locked in union_close(). - We don't lock the upper and lower vnodes when doing VOP_GETATTR() on them in union_getattr(). - We don't lock othervp when doing VOP_STRATEGY() in union_strategy(). I think there are also other problems, but haven't had time to look for them yet.
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?D188AED6-81E9-4EAD-8EB5-991489EB99BD>