Date: Sat, 22 Jul 2006 18:16:31 +0300
From: Kostik Belousov
To: Robert Watson
Cc: freebsd-arch@freebsd.org
Subject: Re: mlock(2) for ordinary users
Message-ID: <20060722151631.GB1217@deviant.kiev.zoral.com.ua>
References: <20060721104044.GB728@turion.vk2pj.dyndns.org> <20060722154606.N54846@fledge.watson.org>
In-Reply-To: <20060722154606.N54846@fledge.watson.org>

On Sat, Jul 22, 2006 at 03:52:37PM +0100, Robert Watson wrote:
>
> On Fri, 21 Jul 2006, Peter Jeremy wrote:
>
> >Currently mlock() and munlock() are restricted to the root user - which
> >prevents an ordinary user locking their process into RAM to the detriment
> >of the system as a whole. Whilst this is a valid concern, there are good
> >security reasons for allowing a user to lock small amounts of memory (a
> >few pages) to ensure that sensitive information (private keys, passwords
> >etc) don't wind up on swap devices.
> >
> >There is a resource limit for locked pages (RLIMIT_MEMLOCK) and, despite
> >the man page, a quick look at the code implies that it really is honoured.
> >Could someone with more VM-foo please confirm whether the last line of the
> >man page is still correct.
> >
> >I would like to suggest that the suser() tests in mlock() and munlock() be
> >removed and the default RLIMIT_MEMLOCK is reduced from infinity to (say)
> >1. The only gotcha I can see is that lots of sysctl() functions use
> >RLIMIT_MEMLOCK via sysctl_wire_old_buffer() and vslock().
>
> I think I'd like to see the functionality you suggest -- i.e., the ability
> to allocate pinned memory pages to unprivileged processes. However, I have
> to wonder about whether this isn't already enabled for a reason -- in
> particular, I have to wonder if it works at all. The whole idea of
> resources limits is that you bill new use to a credential, and credit
> reduced use to a similar credential. Probably, we're interested only in
> memory pinned at the request of the process, not memory pinned by the
> kernel on its behalf. The normal questions I'd try to answer about whether
> it works currently are:
>
> - When pages become locked on behalf of a credential, is it correctly
>   billed to the credential?
>
> - When pages become unlocked (or are released), are any credentials that
>   have requested it be locked credited?
>
> - What happens when the credential on a process changes between when
>   memory is locked and unlocked?
>
> - What happens if more than one credential requests the same page of
>   memory be locked and unlocked?
>
> - Is locked memory properly credited back to the credential on process
>   exit and other non-explicit unmapping points?
>
> Note in particular that more than one credential can request that the same
> page be locked -- if two processes map the same page from a file, or one is
> a fork of the other and has inheritted a shared mapping, we need to handle
> that "correctly". And we need to handle cases like setuid -- as with other
> resource limit implementations, the right credential needs to be credited.
> In the case of socket limits, for example, we actually keep a reference to
> the allocating credential in the struct socket so that when the socket is
> freed, we can credit the resources back to the original credential, not to
> the credential of whatever process last references the socket. Presumably
> something similar would be required here, and a quick glance doesn't
> suggest this is implemented.

As far as I remember, RLIMIT_MEMLOCK is enforced per-process rather than
per-cred. As a consequence, allowing mlock() for non-root users would
actually let such a user lock up to
value-of(RLIMIT_MEMLOCK) * value-of(RLIMIT_NPROC) of memory in total.

In fact, I had to answer the questions asked above when I implemented the
per-user swap limits. The design I ended up with adds a reference to the
originating cred to vm_map_entry and vm_object (with somewhat complicated
logic to move the ref from the entry to the object on occasion).