Date: Sat, 22 Jul 2006 18:16:31 +0300
From: Kostik Belousov <kostikbel@gmail.com>
To: Robert Watson <rwatson@freebsd.org>
Cc: freebsd-arch@freebsd.org
Subject: Re: mlock(2) for ordinary users
Message-ID: <20060722151631.GB1217@deviant.kiev.zoral.com.ua>
In-Reply-To: <20060722154606.N54846@fledge.watson.org>
References: <20060721104044.GB728@turion.vk2pj.dyndns.org> <20060722154606.N54846@fledge.watson.org>

On Sat, Jul 22, 2006 at 03:52:37PM +0100, Robert Watson wrote:
>
> On Fri, 21 Jul 2006, Peter Jeremy wrote:
>
> >Currently mlock() and munlock() are restricted to the root user - which
> >prevents an ordinary user locking their process into RAM to the detriment
> >of the system as a whole. Whilst this is a valid concern, there are good
> >security reasons for allowing a user to lock small amounts of memory (a
> >few pages) to ensure that sensitive information (private keys, passwords
> >etc) don't wind up on swap devices.
> >
> >There is a resource limit for locked pages (RLIMIT_MEMLOCK) and, despite
> >the man page, a quick look at the code implies that it really is honoured.
> >Could someone with more VM-foo please confirm whether the last line of the
> >man page is still correct.
> >
> >I would like to suggest that the suser() tests in mlock() and munlock() be
> >removed and the default RLIMIT_MEMLOCK is reduced from infinity to (say)
> >1. The only gotcha I can see is that lots of sysctl() functions use
> >RLIMIT_MEMLOCK via sysctl_wire_old_buffer() and vslock().
>
> I think I'd like to see the functionality you suggest -- i.e., the ability
> to allocate pinned memory pages to unprivileged processes. However, I have
> to wonder about whether this isn't already enabled for a reason -- in
> particular, I have to wonder if it works at all. The whole idea of
> resources limits is that you bill new use to a credential, and credit
> reduced use to a similar credential. Probably, we're interested only in
> memory pinned at the request of the process, not memory pinned by the
> kernel on its behalf.
> The normal questions I'd try to answer about whether
> it works currently are:
>
> - When pages become locked on behalf of a credential, is it correctly billed
>   to the credential?
>
> - When pages become unlocked (or are released), are any credentials that have
>   requested it be locked credited?
>
> - What happens when the credential on a process changes between when memory
>   is locked and unlocked?
>
> - What happens if more than one credential requests the same page of memory be
>   locked and unlocked?
>
> - Is locked memory properly credited back to the credential on process exit
>   and other non-explicit unmapping points?
>
> Note in particular that more than one credential can request that the same
> page be locked -- if two processes map the same page from a file, or one is
> a fork of the other and has inheritted a shared mapping, we need to handle
> that "correctly". And we need to handle cases like setuid -- as with other
> resource limit implementations, the right credential needs to be credited.
> In the case of socket limits, for example, we actually keep a reference to
> the allocating credential in the struct socket so that when the socket is
> freed, we can credit the resources back to the original credential, not to
> the credential of whatever process last references the socket. Presumably
> something similar would be required here, and a quick glance doesn't
> suggest this is implemented.

As far as I remember, RLIMIT_MEMLOCK is per-process rather than per-cred.
As a consequence, allowing mlock() for non-root users actually allows such
a user to lock up to value-of(RLIMIT_MEMLOCK) * value-of(RLIMIT_NPROC)
of memory.

In fact, I had to answer these same questions myself when I implemented
the per-user swap limits.
The design I ended up with was to add a reference to the originating cred
to vm_map_entry and vm_object (with somewhat complicated logic to move the
reference from the entry to the object on occasion).
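
To make the arithmetic above concrete, here is a minimal sketch in C. It
assumes, as the message recalls, that RLIMIT_MEMLOCK is enforced per
process, so one user's worst case is the per-process limit multiplied by
the number of processes that user may run. The function names
worst_case_locked_bytes() and report_worst_case() are hypothetical and
exist only for this illustration; nothing here is taken from the FreeBSD
sources.

```c
#include <stdint.h>
#include <stdio.h>
#include <sys/resource.h>

/*
 * Upper bound on the memory one user could lock, ASSUMING that
 * RLIMIT_MEMLOCK is accounted per process: each of the user's
 * processes may lock its full quota.  Returns RLIM_INFINITY when
 * either limit is unlimited or the product would overflow rlim_t.
 */
rlim_t
worst_case_locked_bytes(rlim_t memlock, rlim_t nproc)
{
	if (memlock == RLIM_INFINITY || nproc == RLIM_INFINITY)
		return (RLIM_INFINITY);
	if (nproc != 0 && memlock > RLIM_INFINITY / nproc)
		return (RLIM_INFINITY);
	return (memlock * nproc);
}

/*
 * Read the calling process's soft limits and print the bound.
 * Returns 0 on success, -1 if getrlimit() fails.
 */
int
report_worst_case(void)
{
	struct rlimit memlock, nproc;
	rlim_t total;

	if (getrlimit(RLIMIT_MEMLOCK, &memlock) != 0 ||
	    getrlimit(RLIMIT_NPROC, &nproc) != 0) {
		perror("getrlimit");
		return (-1);
	}
	total = worst_case_locked_bytes(memlock.rlim_cur, nproc.rlim_cur);
	if (total == RLIM_INFINITY)
		printf("worst case: unlimited\n");
	else
		printf("worst case: %ju bytes\n", (uintmax_t)total);
	return (0);
}
```

Calling report_worst_case() from main() prints the bound for the current
limits. As a made-up example, a 64 KiB per-process limit combined with a
100-process limit gives a bound of 6400 KiB per user, which is why a small
per-process default does not by itself keep a user's total small.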