Date: Mon, 11 May 2009 18:01:18 -0400 From: John Baldwin <jhb@freebsd.org> To: jt@0xabadba.be Cc: freebsd-hackers@freebsd.org Subject: Re: concurrent sysctl implementation Message-ID: <200905111801.18767.jhb@freebsd.org> In-Reply-To: <a0806f900905111127p378628bbw89e1d45f087e558e@mail.gmail.com> References: <a0806f900905050107u4cbf0624oc83aafa54ae651f0@mail.gmail.com> <200905111224.26856.jhb@freebsd.org> <a0806f900905111127p378628bbw89e1d45f087e558e@mail.gmail.com>
next in thread | previous in thread | raw e-mail | index | archive | help
On Monday 11 May 2009 2:27:48 pm jt@0xabadba.be wrote:
> John,
>=20
> Thank you for your input on this matter, I'm excited to write
> some software for this project since its given me great code to learn
> from as i've grown up (still a kid though :). My questions are a bit
> more detailed below.
>=20
> On Mon, May 11, 2009 at 12:24 PM, John Baldwin <jhb@freebsd.org> wrote:
> > On Friday 08 May 2009 5:41:17 pm Ed Schouten wrote:
> >> A solution would be to solve it as follows:
> >>
> >> - Use a semaphore, initialized to some insane high value to put an upp=
er
> >> =A0 limit on the amount of concurrent sysctl calls. I'm not sure wheth=
er
> >> =A0 this is really needed. Maybe this issue is not as serious as we th=
ink
> >> =A0 it is.
> >
> > Well, one compromise might be to allow concurrent userland requests if =
the
> > buffer size is small (say < 1 page). =A0This would be a quite simple ch=
ange=20
and
> > would cover many common syscalls like fetching an int which don't wire=
=20
memory
> > anyway.
>=20
> Why is this a compromise? Isn't concurrent sysctl calls from userland
> a good thing? What materials would be good to read other than the
> code and the sysctl manual pages? You said it would be relatively
> easy to implement this; what methods should I be considering to do
> this in and what part of the code specifically should I be looking at?
Well, in theory a bunch of "small" requests to SYSCTL_PROC() nodes that use=
d=20
sysctl_wire_old() (or whatever it is called) could cause the amount of user=
=20
memory wired for sysctls to grow unbounded. Thus, allowing this limited=20
concurrency is a tradeoff as there is a minimal (perhaps only theoretical a=
t=20
the moment) risk of removing the safety net.
The patch is quite small, btw, because the locking for the sysctl tree alre=
ady=20
exists, and by using read locks, one can already allow concurrent sysctl=20
requests. There is no need to add any new locks or restructure the sysctl=
=20
tree, just to adjust the locking that is already present. It might be=20
clearer, in fact, to split the sysctl memory lock back out into a separate=
=20
lock. This would allow "small" sysctl requests to run concurrently with a=
=20
single "large" request whereas in my suggestion in the earlier e-mail,=20
the "large" request will block all other user requests until it finishes.
I've actually gone ahead and done this below.
=2D-- //depot/projects/smpng/sys/kern/kern_sysctl.c 2009/05/08 11:53:25
+++ //depot/user/jhb/lock/kern/kern_sysctl.c 2009/05/11 21:58:08
@@ -77,11 +77,12 @@
* API rather than using the dynamic API. Use of the dynamic API is
* strongly encouraged for most code.
*
=2D * This lock is also used to serialize userland sysctl requests. Some
=2D * sysctls wire user memory, and serializing the requests limits the
=2D * amount of wired user memory in use.
+ * The sysctlmemlock is used to limit the amount of user memory wired for
+ * sysctl requests. This is implemented by serializing any userland
+ * sysctl requests larger than a single page via an exclusive lock.
*/
static struct sx sysctllock;
+static struct sx sysctlmemlock;
=20
#define SYSCTL_SLOCK() sx_slock(&sysctllock)
#define SYSCTL_SUNLOCK() sx_sunlock(&sysctllock)
@@ -543,6 +544,7 @@
{
struct sysctl_oid **oidp;
=20
+ sx_init(&sysctlmemlock, "sysctl mem");
SYSCTL_INIT();
SYSCTL_XLOCK();
SET_FOREACH(oidp, sysctl_set)
@@ -1563,7 +1565,7 @@
size_t *oldlenp, int inkernel, void *new, size_t newlen, size_t *retva=
l,
int flags)
{
=2D int error =3D 0;
+ int error =3D 0, memlocked;
struct sysctl_req req;
=20
bzero(&req, sizeof req);
@@ -1603,14 +1605,20 @@
if (KTRPOINT(curthread, KTR_SYSCTL))
ktrsysctl(name, namelen);
#endif
=2D=09
=2D SYSCTL_XLOCK();
+
+ if (req.oldlen > PAGE_SIZE) {
+ memlocked =3D 1;
+ sx_xlock(&sysctlmemlock);
+ } else
+ memlocked =3D 0;
CURVNET_SET(TD_TO_VNET(curthread));
=20
for (;;) {
req.oldidx =3D 0;
req.newidx =3D 0;
+ SYSCTL_SLOCK();
error =3D sysctl_root(0, name, namelen, &req);
+ SYSCTL_SUNLOCK();
if (error !=3D EAGAIN)
break;
uio_yield();
@@ -1620,7 +1628,8 @@
=20
if (req.lock =3D=3D REQ_WIRED && req.validlen > 0)
vsunlock(req.oldptr, req.validlen);
=2D SYSCTL_XUNLOCK();
+ if (memlocked)
+ sx_xunlock(&sysctlmemlock);
=20
if (error && error !=3D ENOMEM)
return (error);
=2D-=20
John Baldwin
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200905111801.18767.jhb>
