Date:      Thu, 07 Jun 2012 01:48:16 +0300
From:      Andriy Gapon <avg@FreeBSD.org>
To:        freebsd-arch@FreeBSD.org
Cc:        d@delphij.net
Subject:   Re: Allow small amount of memory be mlock()'ed by unprivileged process?
Message-ID:  <4FCFDE30.4020109@FreeBSD.org>
In-Reply-To: <4FC9F94B.8060708@FreeBSD.org>
References:  <4FAC3EAB.6050303@delphij.net> <861umkurt8.fsf@ds4.des.no> <CAJ-VmokY%2Bpgcq999NHShbq-3rK3=oeWT2WY7NmTvVdXOHZJhdg@mail.gmail.com> <CAF6rxgmDW21aPJ5Mp6Tbk1z02ivw4UPhSaNEX%2BWiu7O0v13skA@mail.gmail.com> <20120517055425.GA802@infradead.org> <4FC762DD.90101@FreeBSD.org> <4FC81D9C.2080801@FreeBSD.org> <4FC8E29F.2010806@shatow.net> <4FC95A10.7000806@freebsd.org> <4FC9F94B.8060708@FreeBSD.org>

on 02/06/2012 14:30 Andriy Gapon said the following:
[snip]
> Some further technical observations:
> o  I was overly optimistic about _full_ support for RLIMIT_MEMLOCK - mlockall()
> doesn't support it at the moment and I am not sure if it is easy to implement the
> support for the MCL_FUTURE case.
> 
> o  Currently the default class in default login.conf has memorylocked=unlimited
> - not very smart.
> 
> o  There is also vm.max_wired sysctl (with no equivalent tunable), which
> specifies number of _pages_ that can be wired system wide (by both kernel and
> userland).  But note that the limit applies only to userland requests, the
> kernel is allowed to wire new pages even when the limit is exceeded.  By default
> the limit is set to 1/3 of available pages.
> So watch out for this limit when using ZFS, ZFS can easily starve userland.
> 
> o  I've just discovered :-) that we also have RCTL/RACCT framework (not enabled
> by default) aka "Resource Accounting" / "Resource Limits", which seems to
> parallel the conventional limits in many categories including the locked memory.
>  Not sure why we have that and if the interactions between conventional limits,
> resource limits and privileges would be easy to untangle.
[snip]

In case someone still follows this thread, here is another observation.
While non-privileged users cannot explicitly wire/lock memory for their private
use, they are still subject to RLIMIT_MEMLOCK accounting.
E.g. the sysctl system call may temporarily wire userspace buffers, and that
wiring is checked against the RLIMIT_MEMLOCK limit.  Some sysctl calls may
require quite large buffers, e.g. OIDs under kern.proc when used by e.g. fstat.
I have observed cases where sysctl wired more than 128KB of memory.  I think
that on larger/busier systems it could be even more.
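
To make the accounting concrete, here is a rough sketch of how a program
could compare a planned sysctl buffer size against its own RLIMIT_MEMLOCK soft
limit before issuing the call.  The helper names (fits_under_memlock,
check_sysctl_buffer) are made up for illustration.  The check is deliberately
simplified: it ignores page rounding and any memory the process already has
wired, so it only approximates what the kernel does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: would a buffer of `len` bytes fit under the soft
 * limit stored in `rl`?  Assumption: page rounding and memory that is
 * already wired are ignored, so this is only an approximation.
 */
static bool
fits_under_memlock(const struct rlimit *rl, size_t len)
{
	assert(rl != NULL);
	if (rl->rlim_cur == RLIM_INFINITY)
		return (true);
	return ((rlim_t)len <= rl->rlim_cur);
}

/*
 * Check `len` against the calling process's current RLIMIT_MEMLOCK.
 * Returns 1 if it fits, 0 if it does not, -1 if getrlimit() fails.
 */
static int
check_sysctl_buffer(size_t len)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
		return (-1);
	return (fits_under_memlock(&rl, len) ? 1 : 0);
}
```

For instance, with a soft limit of 64KB a 128KB buffer like the ones
mentioned above would not fit, so check_sysctl_buffer(128 * 1024) would
return 0.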

So, on one hand this vslock-against-RLIMIT_MEMLOCK check is good because it
protects against resource starvation via abuse.
On the other hand, I am not sure if this is a proper use of RLIMIT_MEMLOCK.
After all, vslock-ing by e.g. sysctl is an implementation detail.  The memory is
wired because of how the kernel does things, not because a user/process wants to
wire that memory.  Besides, the wiring is temporary.  So I am not sure that it is
fair to charge that kind of memory wiring to userland.

In any case, beware that if you decide to lower the "locked-in-memory size" limit
(RLIMIT_MEMLOCK), then some sysctls and the tools using them (like fstat) may
start failing.
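
One way to see whether a particular tool or sysctl is affected is to lower
the soft limit temporarily, run the operation, and then restore the limit.
Below is a minimal sketch of that idea.  run_with_memlock_limit and
record_soft_limit are hypothetical names, and the callback here only reads
back the limit.  On FreeBSD one would presumably substitute a callback that
performs the sysctl(3) request of interest.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: run fn(arg) with the soft RLIMIT_MEMLOCK set to
 * new_soft, then restore the original soft limit.  Returns fn's result,
 * or -1 if the limit could not be read, changed or restored (so fn
 * should return a non-negative value to stay distinguishable).
 */
static int
run_with_memlock_limit(rlim_t new_soft, int (*fn)(void *), void *arg)
{
	struct rlimit saved, lowered;
	int result;

	assert(fn != NULL);
	if (getrlimit(RLIMIT_MEMLOCK, &saved) != 0)
		return (-1);
	lowered = saved;
	lowered.rlim_cur = new_soft;	/* the hard limit is left untouched */
	if (setrlimit(RLIMIT_MEMLOCK, &lowered) != 0)
		return (-1);
	result = fn(arg);
	if (setrlimit(RLIMIT_MEMLOCK, &saved) != 0)
		return (-1);
	return (result);
}

/* Example callback: store the soft limit currently in effect in *arg. */
static int
record_soft_limit(void *arg)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
		return (-1);
	*(rlim_t *)arg = rl.rlim_cur;
	return (0);
}
```

Only the soft limit is changed, so an unprivileged process can raise it back
to the previous value afterwards.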

-- 
Andriy Gapon


