FreeBSD Mail Archives

Date:      Wed, 6 Jul 2005 16:09:51 -0400
From:      Suleiman Souhlal <refugee@astraldream.net>
To:        Bruce Evans <bde@zeta.org.au>
Cc:        freebsd-fs@freebsd.org, ken@freebsd.org, Stephan Uphoff <ups@tree.com>
Subject:   Re: [saturnero@freesbie.org: Weird behaviour of mount_unionfs with executables]
Message-ID:  <D188AED6-81E9-4EAD-8EB5-991489EB99BD@astraldream.net>
In-Reply-To: <20050705030015.N790@epsplex.bde.org>
References:  <20050703181616.GC89744@cvs.freesbie.org> <42C83643.4010506@samsco.org> <20050703201621.GD89744@cvs.freesbie.org> <1120425831.77984.37993.camel@palm> <42C87CAE.7080802@samsco.org> <1120436351.77984.38195.camel@palm> <42C88121.8010602@samsco.org> <1120437780.77984.38252.camel@palm> <20050704095431.GA31030@cvs.freesbie.org> <20050705030015.N790@epsplex.bde.org>

Hi,

On Jul 4, 2005, at 2:52 PM, Bruce Evans wrote:

> On Mon, 4 Jul 2005, Dario Freni wrote:
>
>
>> On Sun, Jul 03, 2005 at 08:43:01PM -0400, Stephan Uphoff wrote:
>>
>>> YES - and it looks like just specifying noatime for the union mount
>>> should fix the copy problem for FreeSBIE.
>>>
>>
>> Correct. Using -o noatime when mounting the unionfs resolves all  
>> above
>> problems. Thank you very much.
>>
>
> Hmm.  This shouldn't work, but helps avoid the copy due to a layering
> violation.  In kern_execve(), the test of MNT_NOATIME is supposed to
> be just an optimization to avoid making null VOP_SETATTR(), but it
> only looks at the mount flag for the top layer so it doesn't work
> right if the mount flag is different in lower layers.  This gives the
> bizarre behaviour that the noatime option for unionsfs "works" for
> exec() but not for read().
>
> I could only see the duplication problem for mount_unionfs -b.  There
> are lots of bugs in that case:
> = no copy is made for the modification caused by read(), despite
>   mount_unionfs(8) claiming that:
>
>      Any other operation which would ultimately require  
> modification to the
>      ^^^
>      lower layer fails with EROFS.
>
> - the error for failure is often not EROFS (but this may be a side  
> effect).
> - exec() makes a bad copy. It fails to copy the access times and  
> doesn't
>   even copy the permissions; in particular, it loses the x bits so the
>   file becomes non-executable.
> - touch(1) makes a bad copy like exec() and then fails to actually  
> touch
>   the file except for the clobbered times in the non-copy being  
> current.
>   (touch tries utimes(2) twice, first with the current times and  
> then with
>   a null times pointer.  Both calls fail with EACCES.  I think a  
> bad copy
>   is made on the first call ...).  Subsequent touch(1)'s work normally
>   (... touch tries utimes(2) only once, only with the current  
> times, and
>   since this works there seems to be a race for the earlier calls  
> to fail
>   (I tested as root, so there should be no permissions problems  
> with either
>   call, and since the middle utimes(path, NULL) call failed the bug  
> can't
>   be just a stale vp after making the copy in the first call).)
>
> The comment in union_setattr() only claims to make a copy to handle
> truncation, but a copy seems to be made when any attribute is set:
>
> % static int
> % union_setattr(ap)
> % ...
> %     /*
> %      * Handle case of truncating lower object to zero size
> %      * by creating a zero length upper object.  This is to
> %      * handle the case of open with O_TRUNC and O_CREAT.
> %      */
> %     if (un->un_uppervp == NULLVP && (un->un_lowervp->v_type ==  
> VREG)) {
> %         error = union_copyup(un, (ap->a_vap->va_size != 0),
> %                 ap->a_cred, ap->a_td);
> %         if (error)
> %             return (error);
> %     }
>
> Copying on any change to an attribute seems right, but it doesn't
> actually handle implicit changes that don't go through setattr --
> mainly setting of atimes on read().  Copying on read() wouldn't be
> right.
>
> % %     /*
> %      * Try to set attributes in upper layer,
> %      * otherwise return read-only filesystem error.
> %      */
> %     error = EROFS;
> %     if ((uppervp = union_lock_upper(un, td)) != NULLVP) {
> %         error = VOP_SETATTR(un->un_uppervp, ap->a_vap,
> %                     ap->a_cred, ap->a_td);
> %         if ((error == 0) && (ap->a_vap->va_size != VNOVAL))
> %             union_newsize(ap->a_vp, ap->a_vap->va_size, VNOVAL);
> %         union_unlock_upper(uppervp, td);
> %     }
> %     return (error);
> % }
>
> The second clause in the comment doesn't match the code here -- the  
> error
> is very rarely EROFS since it is usually the error returned by  
> VOP_SETATTR()
> and we have avoided getting this far in most cases where VOP_SETATTR()
> would return EROFS.  The comment was closer to matching the code in  
> rev.1.1
> where decideding EROFS was left to lower layers.
>
> Bruce

The vnode locking is also wrong for several cases:
- According sys/kern/vnode_if.src we should return from VOP_OPEN()  
with vp locked, but union_open() unlocks it before returning it.
- Same thing in union_access().
- We are not making sure we are calling VOP_CLOSE() with the vnode  
exclusively locked in union_close().
- We don't lock the upper and lower vnodes when doing VOP_GETATTR()  
on them in union_getattr().
- We don't lock othervp when doing VOP_STRATEGY() in union_strategy().

I think there are also other problems, but haven't had time to look  
for them yet.

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?D188AED6-81E9-4EAD-8EB5-991489EB99BD>

Header And Logo

Peripheral Links

Site Navigation

Header And Logo

Peripheral Links

Search

Site Navigation