Date:      Sat, 3 Mar 2018 12:46:10 +0000
From:      Alexander Richardson <Alexander.Richardson@cl.cam.ac.uk>
Cc:        "<cl-capsicum-discuss@lists.cam.ac.uk>" <cl-capsicum-discuss@lists.cam.ac.uk>, freebsd-hackers@freebsd.org
Subject:   Re: [capsicum] unlinkfd
Message-ID:  <CAEeofcgLD%2BTjKswPexNDUfeeAxHgUOjsZUdD3g3Jc%2BQuyRu4OQ@mail.gmail.com>
In-Reply-To: <17DE0BFF-42A2-4CD7-B09C-ABA2606C4041@cl.cam.ac.uk>
References:  <20180302183514.GA99279@x-wing> <CAK4o1Wyk54chHobhUkb2PBUtaWOF2rDv6tkX_bFGY6D331xUqw@mail.gmail.com> <17DE0BFF-42A2-4CD7-B09C-ABA2606C4041@cl.cam.ac.uk>

Linux has an unlinkat() system call (https://linux.die.net/man/2/unlinkat)
but it doesn't seem to have a flag that lets you unlink the fd itself.
Possibly pathname == NULL and AT_EMPTY_PATH could mean unlink the fd, but I
haven't tried whether that works.
It also has an AT_REMOVEDIR flag to make it function as rmdirat().
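
For reference, here is a minimal sketch of the two documented modes of
unlinkat(), assuming a POSIX.1-2008 system. The helper names
remove_entries() and unlinkat_demo(), the /tmp template and the entry names
are made up for illustration; only unlinkat() and AT_REMOVEDIR come from the
man page. It does not attempt the untested pathname == NULL idea above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Remove one file and one empty subdirectory, both named relative to dirfd.
 * (Hypothetical helper; the parameter names are illustrative.)
 */
static int remove_entries(int dirfd, const char *file, const char *subdir)
{
    /* flags == 0: same effect as unlink(2), resolved relative to dirfd */
    if (unlinkat(dirfd, file, 0) == -1)
        return -1;
    /* AT_REMOVEDIR: same effect as rmdir(2), resolved relative to dirfd */
    if (unlinkat(dirfd, subdir, AT_REMOVEDIR) == -1)
        return -1;
    return 0;
}

/* Hypothetical demo: returns 0 if both entries were created and removed. */
static int unlinkat_demo(void)
{
    char tmpl[] = "/tmp/unlinkat-demo-XXXXXX";
    if (mkdtemp(tmpl) == NULL)
        return -1;

    int dirfd = open(tmpl, O_RDONLY | O_DIRECTORY);
    assert(dirfd != -1);
    int fd = openat(dirfd, "scratch.txt", O_CREAT | O_EXCL | O_WRONLY, 0600);
    assert(fd != -1);
    close(fd);
    if (mkdirat(dirfd, "sub", 0700) == -1)
        return -1;

    int rc = remove_entries(dirfd, "scratch.txt", "sub");
    /* Neither name should resolve any more. */
    if (rc == 0 && (faccessat(dirfd, "scratch.txt", F_OK, 0) == 0 ||
                    faccessat(dirfd, "sub", F_OK, 0) == 0))
        rc = -1;

    close(dirfd);
    rmdir(tmpl);
    return rc;
}
```

Calling unlinkat_demo() from a main() should return 0. Note that the caller
still has to hold a descriptor for the containing directory, which is the
inconvenience the original proposal is about.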

On 3 March 2018 at 10:41, Robert N. M. Watson <robert.watson@cl.cam.ac.uk>
wrote:

> FWIW, this is part of why we introduced anonymous POSIX shared memory
> objects with Capsicum in FreeBSD -- we allow shm_open(2) to be passed a
> SHM_ANON special name, which causes the creation of a swap-backed, mappable
> file-like object that can have I/O, memory mapping, etc., performed on it
> but never has any persistent state across reboots, even in the event of a
> crash.
>
> With Capsicum you can then refine a file descriptor to the otherwise
> writable object to be read-only for the purposes of delegation. There is
> not, however, a mechanism to "freeze" the state of the object, causing other
> outstanding writable descriptors to become read-only -- certainly something
> could be added, but some care regarding VM semantics would be required --
> in particular, so that faults could not be experienced as a result of a
> memory store performed before the "freeze" but issued to VFS only later.
>
> I certainly have no objection to an unlinkat(2) system call -- it's
> unfortunate that a full suite of the at(2) APIs wasn't introduced in the
> first place. It would be worth checking whether anyone else (e.g., Solaris,
> Mac OS X, Linux) has already added an unlinkat(2) whose API semantics we
> can match. I think I take the view that for truly anonymous objects,
> shm_open(2) without a name (or the Linux equiv) is the right thing -- and
> hence unlinkat(2) is for more conventional use cases where the final
> pathname element is known.
>
> On directories: There, I find myself falling back on a Casper-like
> service, since GC'ing a single anonymous memory object is straightforward,
> but GC'ing a directory hierarchy is a messier business.
>
> Robert
>
> > On 3 Mar 2018, at 09:53, Justin Cormack <justin@specialbusservice.com> wrote:
> >
> > I think it would make sense to have an unlinkfd() that unlinks the file
> > from everywhere, so it does not need a name to be specified. This might
> > be hard to implement.
> >
> > For temporary files, I really like Linux memfd_create(2), which opens an
> > anonymous file without a name. These semantics are really useful. (Linux
> > memfd also has additional options for sealing the file to make it
> > immutable, which are very useful for safely passing files between
> > processes.) Having a way to make unnamed temporary files solves a lot of
> > deletion issues, as the file never needs to be unlinked.
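
A minimal sketch of the memfd_create(2) plus sealing pattern described above,
assuming Linux with a glibc recent enough to expose memfd_create() (this will
not build on FreeBSD as written). The helper names make_sealed_memfd() and
memfd_demo() and the debug name "scratch" are illustrative only.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Create an anonymous, nameless file holding buf, then seal it so that it
 * can no longer be written, resized, or have its seals changed.
 * Returns the descriptor, or -1 on error. (Hypothetical helper.)
 */
static int make_sealed_memfd(const void *buf, size_t len)
{
    /* "scratch" only shows up in /proc for debugging; it is not a path. */
    int fd = memfd_create("scratch", MFD_CLOEXEC | MFD_ALLOW_SEALING);
    if (fd == -1)
        return -1;
    if (write(fd, buf, len) != (ssize_t)len ||
        fcntl(fd, F_ADD_SEALS,
              F_SEAL_WRITE | F_SEAL_SHRINK | F_SEAL_GROW | F_SEAL_SEAL) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical demo: returns 0 if the sealed file is readable but not writable. */
static int memfd_demo(void)
{
    const char msg[] = "hello";
    int fd = make_sealed_memfd(msg, sizeof(msg) - 1);
    assert(fd != -1);

    /* Writes are refused once F_SEAL_WRITE is set. */
    int rc = (write(fd, "x", 1) == -1 && errno == EPERM) ? 0 : -1;

    /* Reads still work, so the descriptor can be handed to another process. */
    char out[8] = {0};
    if (pread(fd, out, sizeof(out) - 1, 0) != (ssize_t)(sizeof(msg) - 1) ||
        strcmp(out, msg) != 0)
        rc = -1;

    close(fd);   /* no name was ever created, so there is nothing to unlink */
    return rc;
}
```

The file disappears when the last descriptor and mapping go away, which is
the "never needs to be unlinked" property mentioned above.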
> >
> >
> > On 2 March 2018 at 18:35, Mariusz Zaborski <oshogbo@freebsd.org> wrote:
> >> Hello,
> >>
> >> Today I would like to propose a new syscall called unlinkfd(2), which came
> >> up during a discussion with Ed Maste.
> >>
> >> Currently in UNIX we can’t remove files safely. If we try to do so, we
> >> always end up in a race condition. For example, when we open a file and
> >> check it with fstat, etc., and then want to unlink(2) it… the file we
> >> are trying to unlink could be a different one than the one we were
> >> fstating just a moment ago.
> >>
> >> Another reason for implementing unlinkfd(2) came to us when we were trying
> >> to sandbox some applications like uudecode/b64decode or bspatch. It
> >> occurred to us that we don’t have a good way of removing single files. Of
> >> course we can try to determine which directory we are in, and then open
> >> this directory and remove a single file.
> >>
> >> It looks even more bizarre if we think about a program which operates on
> >> multiple files. If we analyze a situation with two totally different
> >> directories like `/tmp` and `/home/oshogbo`, we would end up pre-opening
> >> a root directory or keeping open as many directories as we are working in.
> >> All of that effort only to remove two files. This makes it totally
> >> impractical!
> >>
> >> I think that opening directories also presents a wider attack surface,
> >> because we are keeping a descriptor to a directory only to remove one
> >> file. Unfortunately this means that an attacker can remove all files in
> >> that directory.
> >>
> >> I proposed this as well on the last Capsicum call. There was a suggestion
> >> that instead of adding a single syscall maybe we should have a Casper
> >> service that will allow us to remove files. Another idea was that we
> >> should perhaps redesign programs to create some subdirs, work in the
> >> subdirs, and then remove all files in those subdirs. I don’t feel that
> >> creating a Casper service is a good idea because we still have exactly
> >> the same race condition issue. In my opinion creating subdirs is also a
> >> problem for us.
> >>
> >> First, we would need to redesign some of our tools, and I think we should
> >> simplify the Capsicumization of a process instead of making it harder.
> >>
> >> Secondly, we can create a temporary subdirectory, but what will remove it?
> >> We are back to having a fd to the directory in which we just created the
> >> subdir. Another way would be to have a Casper service which would remove
> >> the directory, but with the risk of a race condition.
> >>
> >> In conclusion, I think we need a syscall like unlinkfd(2), which turns out
> >> to be easy to implement. The only downside of this implementation is that
> >> we need to provide not only a fd but also a file path. This is because
> >> neither inodes nor vnodes contain filenames. We compare the vnodes of the
> >> fd and the given path; if they are exactly the same, we remove the file.
> >> In the syscall we are using a fd, so there is no ambient authority,
> >> because we are proving that we already have access to that file. Thanks
> >> to that, the syscall can be safely used with Capsicum. I have already
> >> discussed this with some people and they said
> >> `Hey I already had that idea a while ago…` so let’s do something with
> >> that idea!
> >> If you are interested in the patch you can find it here:
> >> https://reviews.freebsd.org/D14567
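
To make the "compare the fd with the path, then remove" idea concrete, here
is a rough userspace approximation. This is NOT the code from D14567: it
compares st_dev/st_ino from fstat() and fstatat() instead of vnodes, the
names unlink_if_same() and unlink_if_same_demo() are hypothetical, and the
EINVAL errno is an arbitrary choice for the sketch. It also still has a
window between the check and the unlinkat(), which is exactly why the
proposal performs the comparison inside the kernel.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Unlink "name" (relative to dirfd) only if it currently refers to the same
 * file as fd. Hypothetical userspace stand-in for the proposed syscall.
 */
static int unlink_if_same(int fd, int dirfd, const char *name)
{
    struct stat by_fd, by_name;

    if (fstat(fd, &by_fd) == -1)
        return -1;
    if (fstatat(dirfd, name, &by_name, AT_SYMLINK_NOFOLLOW) == -1)
        return -1;
    if (by_fd.st_dev != by_name.st_dev || by_fd.st_ino != by_name.st_ino) {
        errno = EINVAL;   /* arbitrary errno for this sketch */
        return -1;
    }
    /* Race window: "name" could be replaced here. A kernel implementation
     * can do the comparison and the removal without this gap. */
    return unlinkat(dirfd, name, 0);
}

/* Hypothetical demo: returns 0 if a mismatch is refused and a match removed. */
static int unlink_if_same_demo(void)
{
    char tmpl[] = "/tmp/unlinkfd-demo-XXXXXX";
    if (mkdtemp(tmpl) == NULL)
        return -1;

    int dirfd = open(tmpl, O_RDONLY | O_DIRECTORY);
    int fd_a = openat(dirfd, "a", O_CREAT | O_EXCL | O_WRONLY, 0600);
    int fd_b = openat(dirfd, "b", O_CREAT | O_EXCL | O_WRONLY, 0600);
    assert(dirfd != -1 && fd_a != -1 && fd_b != -1);

    /* fd_a does not refer to "b": the call must refuse and leave "b" alone. */
    int refused = unlink_if_same(fd_a, dirfd, "b") == -1 && errno == EINVAL &&
                  faccessat(dirfd, "b", F_OK, 0) == 0;
    /* fd_a does refer to "a": the name is removed. */
    int removed = unlink_if_same(fd_a, dirfd, "a") == 0 &&
                  faccessat(dirfd, "a", F_OK, 0) == -1;

    unlinkat(dirfd, "b", 0);
    close(fd_a);
    close(fd_b);
    close(dirfd);
    rmdir(tmpl);
    return (refused && removed) ? 0 : -1;
}
```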
> >>
> >> Thanks,
> >> --
> >> Mariusz Zaborski
> >> oshogbo//vx             | http://oshogbo.vexillium.org
> >> FreeBSD committer       | https://freebsd.org
> >> Software developer      | http://wheelsystems.com
> >> If it's not broken, let's fix it till it is!!1
> >
>
>
>


