From: Andriy Gapon
Date: Thu, 07 Jun 2012 01:48:16 +0300
To: freebsd-arch@FreeBSD.org
Cc: d@delphij.net
Subject: Re: Allow small amount of memory be mlock()'ed by unprivileged process?
Message-ID: <4FCFDE30.4020109@FreeBSD.org>
In-Reply-To: <4FC9F94B.8060708@FreeBSD.org>

on 02/06/2012 14:30 Andriy Gapon said the following:
[snip]
> Some further technical observations:
> o I was overly optimistic about _full_ support for RLIMIT_MEMLOCK - mlockall()
> doesn't support it at the moment and I am not sure if it is easy to implement the
> support for the MCL_FUTURE case.
>
> o Currently the default class in default login.conf has memorylocked=unlimited
> - not very smart.
>
> o There is also vm.max_wired sysctl (with no equivalent tunable), which
> specifies number of _pages_ that can be wired system wide (by both kernel and
> userland). But note that the limit applies only to userland requests, the
> kernel is allowed to wire new pages even when the limit is exceeded. By default
> the limit is set to 1/3 of available pages.
> So watch out for this limit when using ZFS, ZFS can easily starve userland.
>
> o I've just discovered :-) that we also have RCTL/RACCT framework (not enabled
> by default) aka "Resource Accounting" / "Resource Limits", which seems to
> parallel the conventional limits in many categories including the locked memory.
> Not sure why we have that and if the interactions between conventional limits,
> resource limits and privileges would be easy to untangle.
[snip]

In case someone still follows this thread, here is another observation.
While non-privileged users cannot explicitly wire/lock memory for their
private use, they are still subject to RLIMIT_MEMLOCK accounting.
For example, the sysctl system call may temporarily wire userspace buffers,
and that wiring is checked against the RLIMIT_MEMLOCK limit. Some sysctl
calls may require quite large buffers, e.g. the OIDs under kern.proc when
used by tools such as fstat.
I have observed cases where sysctl wired more than 128KB of memory, and I
think that on larger/busier systems it could be even more.

So, on the one hand, this vslock-against-RLIMIT_MEMLOCK check is good
because it protects against resource starvation via abuse.
On the other hand, I am not sure that this is a proper use of
RLIMIT_MEMLOCK. After all, vslock-ing by e.g. sysctl is an implementation
detail: the memory is wired because of how the kernel does things, not
because a user/process wants to wire that memory. Besides, the wiring is
temporary. So I am not sure that it is fair to charge that kind of memory
wiring to userland.

In any case, beware that if you decide to lower the "locked-in-memory size"
limit (RLIMIT_MEMLOCK), then some sysctls and the tools using them (like
fstat) may start failing.

-- 
Andriy Gapon
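
To make the concern above concrete, here is a rough userland sketch (not
the kernel's actual vslock() accounting, which this does not try to
reproduce) of checking whether a buffer of a given size would fit under the
current RLIMIT_MEMLOCK soft limit. The helper names wiring_exceeds_limit()
and report_memlock_headroom() are hypothetical and exist only for this
illustration; only getrlimit() and RLIMIT_MEMLOCK are real interfaces.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: would charging `len` more bytes on top of `wired`
 * bytes already charged go over `limit`?  Written to avoid overflow.
 * This is an approximation in bytes, not the kernel's own bookkeeping.
 */
static bool
wiring_exceeds_limit(uint64_t wired, uint64_t len, uint64_t limit)
{
	if (wired > limit)
		return (true);
	return (len > limit - wired);
}

/*
 * Hypothetical helper: print the RLIMIT_MEMLOCK soft limit and say whether
 * a temporarily wired buffer of `len` bytes would fit under it.
 * Returns 0 on success, -1 if getrlimit() fails.
 */
static int
report_memlock_headroom(uint64_t len)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0) {
		perror("getrlimit");
		return (-1);
	}
	if (rl.rlim_cur == RLIM_INFINITY) {
		printf("RLIMIT_MEMLOCK is unlimited\n");
		return (0);
	}
	printf("RLIMIT_MEMLOCK soft limit: %" PRIu64 " bytes\n",
	    (uint64_t)rl.rlim_cur);
	printf("a %" PRIu64 "-byte buffer %s\n", len,
	    wiring_exceeds_limit(0, len, (uint64_t)rl.rlim_cur) ?
	    "would exceed it" : "fits within it");
	return (0);
}
```

Calling report_memlock_headroom(128 * 1024) from main() checks the 128KB
figure mentioned above against whatever limit the login class assigns.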