Date:      Sat, 22 Jul 2006 18:16:31 +0300
From:      Kostik Belousov <kostikbel@gmail.com>
To:        Robert Watson <rwatson@freebsd.org>
Cc:        freebsd-arch@freebsd.org
Subject:   Re: mlock(2) for ordinary users
Message-ID:  <20060722151631.GB1217@deviant.kiev.zoral.com.ua>
In-Reply-To: <20060722154606.N54846@fledge.watson.org>
References:  <20060721104044.GB728@turion.vk2pj.dyndns.org> <20060722154606.N54846@fledge.watson.org>

On Sat, Jul 22, 2006 at 03:52:37PM +0100, Robert Watson wrote:
>
> On Fri, 21 Jul 2006, Peter Jeremy wrote:
>
> >Currently mlock() and munlock() are restricted to the root user - which
> >prevents an ordinary user locking their process into RAM to the detriment
> >of the system as a whole.  Whilst this is a valid concern, there are good
> >security reasons for allowing a user to lock small amounts of memory (a
> >few pages) to ensure that sensitive information (private keys, passwords
> >etc) doesn't wind up on swap devices.
> >
> >There is a resource limit for locked pages (RLIMIT_MEMLOCK) and, despite
> >the man page, a quick look at the code implies that it really is honoured.
> >Could someone with more VM-foo please confirm whether the last line of the
> >man page is still correct.
> >
> >I would like to suggest that the suser() tests in mlock() and munlock() be
> >removed and the default RLIMIT_MEMLOCK is reduced from infinity to (say)
> >1. The only gotcha I can see is that lots of sysctl() functions use
> >RLIMIT_MEMLOCK via sysctl_wire_old_buffer() and vslock().
>
> I think I'd like to see the functionality you suggest -- i.e., the ability
> to allocate pinned memory pages to unprivileged processes.  However, I have
> to wonder about whether this isn't already enabled for a reason -- in
> particular, I have to wonder if it works at all.  The whole idea of
> resource limits is that you bill new use to a credential, and credit
> reduced use to a similar credential.  Probably, we're interested only in
> memory pinned at the request of the process, not memory pinned by the
> kernel on its behalf.  The normal questions I'd try to answer about whether
> it works currently are:
>
> - When pages become locked on behalf of a credential, is it correctly billed
>   to the credential?
>
> - When pages become unlocked (or are released), are any credentials that have
>   requested it be locked credited?
>
> - What happens when the credential on a process changes between when memory
>   is locked and unlocked?
>
> - What happens if more than one credential requests the same page of memory be
>   locked and unlocked?
>
> - Is locked memory properly credited back to the credential on process exit
>   and other non-explicit unmapping points?
>
> Note in particular that more than one credential can request that the same
> page be locked -- if two processes map the same page from a file, or one is
> a fork of the other and has inherited a shared mapping, we need to handle
> that "correctly".  And we need to handle cases like setuid -- as with other
> resource limit implementations, the right credential needs to be credited.
> In the case of socket limits, for example, we actually keep a reference to
> the allocating credential in the struct socket so that when the socket is
> freed, we can credit the resources back to the original credential, not to
> the credential of whatever process last references the socket.  Presumably
> something similar would be required here, and a quick glance doesn't
> suggest this is implemented.
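
For reference, the use case in the quoted proposal (locking a page or two
that hold a secret so it cannot be paged out to swap) might look roughly
like this from user space. This is only a sketch: secret_alloc() and
secret_free() are hypothetical helper names, not an existing API, and
whether the mlock() call succeeds for a non-root user depends on the
system's policy and its RLIMIT_MEMLOCK setting.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: allocate one page-aligned page and try to lock it
 * into RAM.  Returns the buffer, or NULL if allocation fails.  *locked is
 * set to 1 if mlock() succeeded and 0 if the kernel refused (for example
 * EPERM for a non-root user, or the RLIMIT_MEMLOCK limit being exceeded).
 */
void *secret_alloc(size_t *len, int *locked)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    void *buf = NULL;

    if (pagesize <= 0)
        return NULL;
    if (posix_memalign(&buf, (size_t)pagesize, (size_t)pagesize) != 0)
        return NULL;

    *len = (size_t)pagesize;
    *locked = (mlock(buf, *len) == 0);
    if (!*locked)
        perror("mlock");        /* caller decides whether to continue */
    return buf;
}

/* Hypothetical helper: wipe the secret, unlock the page and free it. */
void secret_free(void *buf, size_t len, int locked)
{
    volatile unsigned char *p = buf;
    size_t i;

    for (i = 0; i < len; i++)   /* volatile store so the wipe is kept */
        p[i] = 0;
    if (locked)
        munlock(buf, len);
    free(buf);
}
```

A caller would check the locked flag and decide whether it is acceptable
to keep the secret in pageable memory when the lock was refused.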

As far as I remember, RLIMIT_MEMLOCK is per-process rather than per-cred.
As a consequence, allowing mlock() for non-root users would actually allow
such a user to lock up to value-of(RLIMIT_MEMLOCK) * value-of(RLIMIT_NPROC)
of memory in total, by spreading the locked pages across many processes.
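
The product above can be computed for the current process's limits with
getrlimit(2). The following is a minimal sketch, assuming the limit really
is enforced per process; worst_case_locked() and report_worst_case() are
illustrative names only.

```c
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>

/*
 * Upper bound on what one user could lock if RLIMIT_MEMLOCK is enforced
 * per process: (bytes per process) * (processes per user).  An unlimited
 * factor, or a product that would overflow, is reported as RLIM_INFINITY.
 */
rlim_t worst_case_locked(rlim_t memlock, rlim_t nproc)
{
    if (memlock == RLIM_INFINITY || nproc == RLIM_INFINITY)
        return RLIM_INFINITY;
    if (memlock != 0 && nproc > RLIM_INFINITY / memlock)
        return RLIM_INFINITY;
    return memlock * nproc;
}

/* Print the bound for the calling process's soft limits; 0 on success. */
int report_worst_case(void)
{
    struct rlimit ml, np;
    rlim_t bound;

    if (getrlimit(RLIMIT_MEMLOCK, &ml) != 0 ||
        getrlimit(RLIMIT_NPROC, &np) != 0) {
        perror("getrlimit");
        return -1;
    }
    bound = worst_case_locked(ml.rlim_cur, np.rlim_cur);
    if (bound == RLIM_INFINITY)
        printf("per-user lockable memory: effectively unbounded\n");
    else
        printf("per-user lockable memory: up to %ju bytes\n",
            (uintmax_t)bound);
    return 0;
}
```

With a per-process limit of one 4096-byte page and 10 processes allowed,
for instance, the bound is 40960 bytes, so a small RLIMIT_MEMLOCK is only
a meaningful cap when RLIMIT_NPROC is small too.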

In fact, I had to answer these same questions when I implemented the
per-user swap limits. The design I ended up with was to add a reference
to the originating cred to vm_map_entry and vm_object (with somewhat
complicated logic to move the ref from the entry to the object on
occasion).
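
The general idea of remembering which credential was billed, so that the
same one is credited later, can be illustrated with a small user-space
model. This is not the kernel code: struct acct_cred, struct map_entry and
the functions below are invented for illustration, and the model ignores
the harder cases raised above (shared pages, moving the reference from
entry to object).

```c
#include <stddef.h>

/* Hypothetical stand-in for a credential with per-user accounting. */
struct acct_cred {
    unsigned refs;      /* reference count */
    size_t   locked;    /* pages currently billed to this credential */
    size_t   limit;     /* maximum pages this credential may lock */
};

/* Hypothetical stand-in for a map entry; it remembers who was billed. */
struct map_entry {
    struct acct_cred *billed;   /* NULL while the entry is not locked */
    size_t            pages;
};

/* Bill the locking credential and keep a reference to it in the entry. */
int entry_lock(struct map_entry *e, struct acct_cred *cur)
{
    if (e->billed != NULL)
        return 0;                       /* already locked and billed */
    if (e->pages > cur->limit - cur->locked)
        return -1;                      /* would exceed the limit */
    cur->locked += e->pages;
    cur->refs++;                        /* entry now holds a reference */
    e->billed = cur;
    return 0;
}

/*
 * Credit the credential that was originally billed, regardless of which
 * credential the unlocking process has by now (e.g. after setuid).
 */
void entry_unlock(struct map_entry *e)
{
    struct acct_cred *c = e->billed;

    if (c == NULL)
        return;
    c->locked -= e->pages;
    c->refs--;                          /* drop the entry's reference */
    e->billed = NULL;
}
```

Because the entry holds its own reference, the credit always lands on the
credential that was charged, which is the same property described for
struct socket in the quoted text.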



