From owner-freebsd-fs@FreeBSD.ORG Mon Jul 4 18:52:57 2005 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B9F8B16A41C; Mon, 4 Jul 2005 18:52:57 +0000 (GMT) (envelope-from bde@zeta.org.au) Received: from mailout2.pacific.net.au (mailout2.pacific.net.au [61.8.0.115]) by mx1.FreeBSD.org (Postfix) with ESMTP id 437E543D53; Mon, 4 Jul 2005 18:52:57 +0000 (GMT) (envelope-from bde@zeta.org.au) Received: from mailproxy2.pacific.net.au (mailproxy2.pacific.net.au [61.8.0.87]) by mailout2.pacific.net.au (8.13.4/8.13.4/Debian-3) with ESMTP id j64Iqtfp004118; Tue, 5 Jul 2005 04:52:55 +1000 Received: from epsplex.bde.org (katana.zip.com.au [61.8.7.246]) by mailproxy2.pacific.net.au (8.13.4/8.13.4/Debian-3) with ESMTP id j64IqqFs027041; Tue, 5 Jul 2005 04:52:53 +1000 Date: Tue, 5 Jul 2005 04:52:52 +1000 (EST) From: Bruce Evans X-X-Sender: bde@epsplex.bde.org To: Dario Freni In-Reply-To: <20050704095431.GA31030@cvs.freesbie.org> Message-ID: <20050705030015.N790@epsplex.bde.org> References: <20050703181616.GC89744@cvs.freesbie.org> <42C83643.4010506@samsco.org> <20050703201621.GD89744@cvs.freesbie.org> <1120425831.77984.37993.camel@palm> <42C87CAE.7080802@samsco.org> <1120436351.77984.38195.camel@palm> <42C88121.8010602@samsco.org> <1120437780.77984.38252.camel@palm> <20050704095431.GA31030@cvs.freesbie.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@freebsd.org, ken@freebsd.org, Stephan Uphoff Subject: Re: [saturnero@freesbie.org: Weird behaviour of mount_unionfs with executables] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Jul 2005 18:52:57 -0000 On Mon, 4 Jul 2005, Dario Freni wrote: > On Sun, Jul 03, 2005 at 08:43:01PM -0400, Stephan Uphoff wrote: >> YES - and it looks like just specifying noatime for the union mount >> should fix the copy problem for FreeSBIE. > > Correct. Using -o noatime when mounting the unionfs resolves all above > problems. Thank you very much. Hmm. This shouldn't work, but helps avoid the copy due to a layering violation. In kern_execve(), the test of MNT_NOATIME is supposed to be just an optimization to avoid making null VOP_SETATTR(), but it only looks at the mount flag for the top layer so it doesn't work right if the mount flag is different in lower layers. This gives the bizarre behaviour that the noatime option for unionsfs "works" for exec() but not for read(). I could only see the duplication problem for mount_unionfs -b. There are lots of bugs in that case: = no copy is made for the modification caused by read(), despite mount_unionfs(8) claiming that: Any other operation which would ultimately require modification to the ^^^ lower layer fails with EROFS. - the error for failure is often not EROFS (but this may be a side effect). - exec() makes a bad copy. It fails to copy the access times and doesn't even copy the permissions; in particular, it loses the x bits so the file becomes non-executable. - touch(1) makes a bad copy like exec() and then fails to actually touch the file except for the clobbered times in the non-copy being current. (touch tries utimes(2) twice, first with the current times and then with a null times pointer. Both calls fail with EACCES. I think a bad copy is made on the first call ...). Subsequent touch(1)'s work normally (... touch tries utimes(2) only once, only with the current times, and since this works there seems to be a race for the earlier calls to fail (I tested as root, so there should be no permissions problems with either call, and since the middle utimes(path, NULL) call failed the bug can't be just a stale vp after making the copy in the first call).) The comment in union_setattr() only claims to make a copy to handle truncation, but a copy seems to be made when any attribute is set: % static int % union_setattr(ap) % ... % /* % * Handle case of truncating lower object to zero size % * by creating a zero length upper object. This is to % * handle the case of open with O_TRUNC and O_CREAT. % */ % if (un->un_uppervp == NULLVP && (un->un_lowervp->v_type == VREG)) { % error = union_copyup(un, (ap->a_vap->va_size != 0), % ap->a_cred, ap->a_td); % if (error) % return (error); % } Copying on any change to an attribute seems right, but it doesn't actually handle implicit changes that don't go through setattr -- mainly setting of atimes on read(). Copying on read() wouldn't be right. % % /* % * Try to set attributes in upper layer, % * otherwise return read-only filesystem error. % */ % error = EROFS; % if ((uppervp = union_lock_upper(un, td)) != NULLVP) { % error = VOP_SETATTR(un->un_uppervp, ap->a_vap, % ap->a_cred, ap->a_td); % if ((error == 0) && (ap->a_vap->va_size != VNOVAL)) % union_newsize(ap->a_vp, ap->a_vap->va_size, VNOVAL); % union_unlock_upper(uppervp, td); % } % return (error); % } The second clause in the comment doesn't match the code here -- the error is very rarely EROFS since it is usually the error returned by VOP_SETATTR() and we have avoided getting this far in most cases where VOP_SETATTR() would return EROFS. The comment was closer to matching the code in rev.1.1 where decideding EROFS was left to lower layers. Bruce