From owner-freebsd-arch@freebsd.org  Sun May 22 06:40:50 2016
Return-Path: <owner-freebsd-arch@freebsd.org>
Delivered-To: freebsd-arch@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 33D47B43F0E;
 Sun, 22 May 2016 06:40:50 +0000 (UTC)
 (envelope-from mckusick@chez.mckusick.com)
Received: from chez.mckusick.com (chez.mckusick.com
 [IPv6:2001:5a8:4:7e72:d250:99ff:fe57:4030])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "chez.mckusick.com", Issuer "chez.mckusick.com" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 033D21D1B;
 Sun, 22 May 2016 06:40:49 +0000 (UTC)
 (envelope-from mckusick@chez.mckusick.com)
Received: from chez.mckusick.com (localhost [IPv6:::1])
 by chez.mckusick.com (8.15.2/8.15.2) with ESMTP id u4M6enEo017327;
 Sat, 21 May 2016 23:40:49 -0700 (PDT)
 (envelope-from mckusick@chez.mckusick.com)
Message-Id: <201605220640.u4M6enEo017327@chez.mckusick.com>
From: Kirk McKusick <mckusick@chez.mckusick.com>
To: Andriy Gapon <avg@FreeBSD.org>
Subject: Re: mount / unmount and mountcheckdirs()
cc: freebsd-arch@FreeBSD.org, freebsd-fs <freebsd-fs@FreeBSD.org>
In-reply-to: <5c01bf62-b7b2-2e1d-bca5-859e6bf1f0e5@FreeBSD.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-ID: <17325.1463899249.1@chez.mckusick.com>
Content-Transfer-Encoding: quoted-printable
Date: Sat, 21 May 2016 23:40:49 -0700
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: Discussion related to FreeBSD architecture <freebsd-arch.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-arch>,
 <mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch/>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
 <mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 22 May 2016 06:40:50 -0000

> To: freebsd-arch@FreeBSD.org, freebsd-fs <freebsd-fs@FreeBSD.org>
> From: Andriy Gapon <avg@FreeBSD.org>
> Subject: mount / unmount and mountcheckdirs()
> Date: Sun, 15 May 2016 16:37:05 +0300
>
> I am curious about the purpose of mountcheckdirs() called when mounting and
> unmounting a filesystem.
>
> The function is described as such:
> /*
>  * Scan all active processes and prisons to see if any of them have a current
>  * or root directory of `olddp'. If so, replace them with the new mount point.
>  */
> and it seems to be used to "lift" processes and jails to a root of a new
> filesystem when it is mounted and to "lower" them onto a covered vnode (if any)
> when a filesystem is unmounted.
>
> What's the purpose of those actions?
> It's strange that the machinations are done at all, but it is stranger that they
> are applied only to processes and jails at exactly a covered vnode and a root
> vnode.  Anything below in a filesystem's tree is left alone.  Is there anything
> so very special about being at exactly those points?
>
> IMO, the machinations can have unexpected security consequences.
>
> A little bit of history.
> mountcheckdirs() was first added in r22521 (circa 1997) as checkdirs with a
> rather non-specific commit message.  Initially it was used only when a
> filesystem was mounted.
> Then, in r73241 (circa 2002) the function was added to dounmount():
>     The checkdirs() function is called at mount time to find any process
>     fd_cdir or fd_rdir pointers referencing the covered mountpoint
>     vnode. It transfers these to point at the root of the new filesystem.
>     However, this process was not reversed at unmount time, so processes
>     with a cwd/root at a mount point would unexpectedly lose their
>     cwd/root following a mount-unmount cycle at that mountpoint.
> ...
>     Dounmount() now undoes the actions
>     taken by checkdirs() at mount time; any process cdir/rdir pointers
>     that reference the root vnode of the unmounted filesystem are
>     transferred to the now-uncovered vnode.
>
>
> --
> Andriy Gapon

I added the checkdirs functionality in the mount direction only
(I actually did it in 4.4BSD-Lite and it got swept in with commit
22521). The reason is that when a directory that is not empty is
mounted on, the expectation is that the entries in that directory
should no longer be present; rather they should be replaced by the
entries in the newly mounted directory. Thus all processes sitting
in the mounted-on directory should see the newly mounted directory
as if they had come to it using a lookup after the mount had been
done. Processes that had proceeded through the mounted-on directory
into one of its other entries are left alone until such
time as they chdir back into the mount point directory through ".."
at which time they will be passed up to the mounted directory using
the same mechanism that would put them there if they traversed into
the mount point from above it in the tree. I believe this is the
correct behavior, is not a security threat, and should be left alone.
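
The mount-time behavior described above can be sketched as a small
user-space model. This is only an illustration, not the kernel code:
the types vnode and proc_dirs and the function model_checkdirs() are
hypothetical stand-ins, and locking, reference counting, and prisons
are all left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel structures; not the real definitions. */
struct vnode {
	const char *name;
};

struct proc_dirs {
	struct vnode *cdir;	/* current directory */
	struct vnode *rdir;	/* root directory */
};

/*
 * Move every current or root directory pointer that references the
 * covered vnode `olddp' to the root of the new filesystem `newdp'.
 * Pointers to anything else (for example a subdirectory of the
 * covered directory) are left untouched.
 * Returns the number of pointers that were moved.
 */
int
model_checkdirs(struct proc_dirs *procs, size_t nprocs,
    struct vnode *olddp, struct vnode *newdp)
{
	int moved = 0;

	assert(procs != NULL || nprocs == 0);
	for (size_t i = 0; i < nprocs; i++) {
		if (procs[i].cdir == olddp) {
			procs[i].cdir = newdp;
			moved++;
		}
		if (procs[i].rdir == olddp) {
			procs[i].rdir = newdp;
			moved++;
		}
	}
	return (moved);
}
```

In this model a process whose cdir is a subdirectory of the covered
directory keeps its pointer; it would only reach the new filesystem
by a later lookup through "..", which the sketch does not model.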

I was not aware that the functionality had been added at unmount
time, and I do not believe that it should have been done. Normally
an unmount will not succeed if any vnodes are busy (for example, if
any directory in the filesystem is a current directory). The only
way that it can succeed in such a case is if a forcible unmount is
done. The forcible unmount will effectively do a revoke(2) on all
current directory vnodes in the unmounted filesystem. Further attempts
to access them will fail with "." not found errors. The only way to
get a valid current directory is to chdir to an absolute pathname.
Gratuitously fixing this if you happen to be in the former root of
the filesystem is wrong. And, as you note, it can lead to unintentionally
giving an escape path from a prison. So I concur with your removing
this added functionality.
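
The forcible-unmount behavior argued for above can be modeled in
user space roughly as follows. All names here (model_vnode,
model_proc, model_force_unmount(), model_lookup_dot(),
model_chdir_absolute()) are hypothetical, and the use of ENOENT for
the '"." not found' failure is an assumption made for the sketch,
not a statement about what the kernel returns.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real kernel structures. */
struct model_vnode {
	int fsid;	/* which filesystem owns this vnode */
	bool revoked;	/* set once a forced unmount invalidates it */
};

struct model_proc {
	struct model_vnode *cdir;	/* current directory */
};

/*
 * Forced unmount: invalidate every vnode of filesystem `fsid',
 * including its root. No process is moved to the uncovered vnode.
 */
void
model_force_unmount(struct model_vnode *vnodes, size_t n, int fsid)
{
	for (size_t i = 0; i < n; i++)
		if (vnodes[i].fsid == fsid)
			vnodes[i].revoked = true;
}

/* A lookup relative to "." fails once the current directory is revoked. */
int
model_lookup_dot(const struct model_proc *p)
{
	assert(p != NULL && p->cdir != NULL);
	return (p->cdir->revoked ? ENOENT : 0);
}

/* chdir to an absolute pathname, modeled as adopting a live vnode. */
int
model_chdir_absolute(struct model_proc *p, struct model_vnode *target)
{
	assert(p != NULL && target != NULL);
	if (target->revoked)
		return (ENOENT);
	p->cdir = target;
	return (0);
}
```

The point of the model is that a process sitting in the former root
of the unmounted filesystem is treated like any other process in
that filesystem: its current directory stops working until it does
a chdir to an absolute pathname.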

	Kirk McKusick