From: Kirk McKusick
To: Andriy Gapon
cc: freebsd-arch@FreeBSD.org, freebsd-fs
Subject: Re: mount / unmount and mountcheckdirs()
Date: Sat, 21 May 2016 23:40:49 -0700
Message-Id: <201605220640.u4M6enEo017327@chez.mckusick.com>
In-reply-to: <5c01bf62-b7b2-2e1d-bca5-859e6bf1f0e5@FreeBSD.org>
List-Id: Discussion related to FreeBSD architecture

> To: freebsd-arch@FreeBSD.org, freebsd-fs
> From: Andriy Gapon
> Subject: mount / unmount and mountcheckdirs()
> Date: Sun, 15 May 2016 16:37:05 +0300
>
> I am curious about the purpose of mountcheckdirs() called when mounting and
> unmounting a filesystem.
>
> The function is described as such:
> /*
>  * Scan all active processes and prisons to see if any of them have a current
>  * or root directory of `olddp'. If so, replace them with the new mount point.
>  */
> and it seems to be used to "lift" processes and jails to a root of a new
> filesystem when it is mounted and to "lower" them onto a covered vnode (if any)
> when a filesystem is unmounted.
>
> What's the purpose of those actions?
> It's strange that the machinations are done at all, but it is stranger that they
> are applied only to processes and jails at exactly a covered vnode and a root
> vnode.  Anything below in a filesystem's tree is left alone.  Is there anything
> so very special about being at exactly those points?
>
> IMO, the machinations can have unexpected security consequences.
>
> A little bit of history.
> mountcheckdirs() was first added in r22521 (circa 1997) as checkdirs with a
> rather non-specific commit message.  Initially it was used only when a
> filesystem was mounted.
> Then, in r73241 (circa 2002) the function was added to dounmount():
>   The checkdirs() function is called at mount time to find any process
>   fd_cdir or fd_rdir pointers referencing the covered mountpoint
>   vnode. It transfers these to point at the root of the new filesystem.
>   However, this process was not reversed at unmount time, so processes
>   with a cwd/root at a mount point would unexpectedly lose their
>   cwd/root following a mount-unmount cycle at that mountpoint.
>   ...
>   Dounmount() now undoes the actions
>   taken by checkdirs() at mount time; any process cdir/rdir pointers
>   that reference the root vnode of the unmounted filesystem are
>   transferred to the now-uncovered vnode.
>
>
> --
> Andriy Gapon

I added the checkdirs functionality in the mount direction only
(I actually did it in 4.4BSD-Lite and it got swept in with commit 22521).
The reason is that when a directory that is not empty is mounted on,
the expectation is that the entries in that directory should no longer
be present; rather they should be replaced by the entries in the newly
mounted directory. Thus all processes sitting in the mounted-on directory
should see the newly mounted directory as if they had come to it using
a lookup after the mount had been done. If a process had proceeded
through the mounted-on directory into one of its other entries, then
it is left alone until such time as it does a chdir back into the
mount point directory through "..", at which time it will be passed up
to the mounted directory using the same mechanism that would put it
there if it traversed into the mount point from above it in the tree.
I believe this is the correct behavior, is not a security threat, and
should be left alone.

I was not aware that the functionality had been added at unmount time,
and I do not believe that it should have been done. Normally an unmount
will not succeed if any vnodes are busy (for example, if any directory
in the filesystem is a current directory). The only way that it can
succeed in such a case is if a forcible unmount is done. The forcible
unmount will effectively do a revoke(2) on all current directory vnodes
in the unmounted filesystem. Further attempts to access them will fail
with "." not found errors. The only way to get a valid current directory
is to chdir to an absolute pathname. Gratuitously fixing this if you
happen to be in the former root of the filesystem is wrong. And, as you
note, it can lead to unintentionally giving an escape path from a prison.
So I concur with your removing this added functionality.

	Kirk McKusick
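
For readers who want to see the mount-time behavior described above in concrete terms, here is a minimal userland sketch in C. It is not the kernel's mountcheckdirs(): struct dir, struct proc_dirs, and checkdirs_model() are hypothetical names invented for this illustration, a plain integer stands in for vnode reference counting, and prisons and all locking are left out. It only models the rule that processes sitting exactly on the covered directory are moved to the root of the new filesystem, while processes elsewhere are left alone.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a vnode: a label and a reference count. */
struct dir {
	const char *name;
	int refs;
};

/* Hypothetical stand-in for a process's fd_cdir / fd_rdir pointers. */
struct proc_dirs {
	struct dir *cdir;	/* current directory */
	struct dir *rdir;	/* root directory */
};

/*
 * Model of the mount-time scan: any process whose current or root
 * directory is exactly `olddp' (the covered directory) is switched to
 * `newdp' (the root of the newly mounted filesystem).  Processes that
 * sit anywhere else, including below `olddp', are not touched.
 * Returns the number of pointers that were switched.
 */
int
checkdirs_model(struct proc_dirs *procs, size_t nprocs,
    struct dir *olddp, struct dir *newdp)
{
	int moved = 0;

	assert(olddp != NULL && newdp != NULL);
	for (size_t i = 0; i < nprocs; i++) {
		if (procs[i].cdir == olddp) {
			olddp->refs--;		/* drop the old reference */
			newdp->refs++;		/* take one on the new root */
			procs[i].cdir = newdp;
			moved++;
		}
		if (procs[i].rdir == olddp) {
			olddp->refs--;
			newdp->refs++;
			procs[i].rdir = newdp;
			moved++;
		}
	}
	return (moved);
}
```

In this model a process whose current directory is a subdirectory of the covered directory keeps its pointer unchanged, which matches the description above: it only reaches the new filesystem later, when it walks back through "..".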