Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 11 May 2009 18:01:18 -0400
From:      John Baldwin <jhb@freebsd.org>
To:        jt@0xabadba.be
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: concurrent sysctl implementation
Message-ID:  <200905111801.18767.jhb@freebsd.org>
In-Reply-To: <a0806f900905111127p378628bbw89e1d45f087e558e@mail.gmail.com>
References:  <a0806f900905050107u4cbf0624oc83aafa54ae651f0@mail.gmail.com> <200905111224.26856.jhb@freebsd.org> <a0806f900905111127p378628bbw89e1d45f087e558e@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On Monday 11 May 2009 2:27:48 pm jt@0xabadba.be wrote:
> John,
>=20
>      Thank you for your input on this matter, I'm excited to write
> some software for this project since its given me great code to learn
> from as i've grown up (still a kid though :).  My questions are a bit
> more detailed below.
>=20
> On Mon, May 11, 2009 at 12:24 PM, John Baldwin <jhb@freebsd.org> wrote:
> > On Friday 08 May 2009 5:41:17 pm Ed Schouten wrote:
> >> A solution would be to solve it as follows:
> >>
> >> - Use a semaphore, initialized to some insane high value to put an upp=
er
> >> =A0 limit on the amount of concurrent sysctl calls. I'm not sure wheth=
er
> >> =A0 this is really needed. Maybe this issue is not as serious as we th=
ink
> >> =A0 it is.
> >
> > Well, one compromise might be to allow concurrent userland requests if =
the
> > buffer size is small (say < 1 page). =A0This would be a quite simple ch=
ange=20
and
> > would cover many common syscalls like fetching an int which don't wire=
=20
memory
> > anyway.
>=20
> Why is this a compromise?  Isn't concurrent sysctl calls from userland
> a good thing?  What materials would be good to read other than the
> code and the sysctl manual pages?  You said it would be relatively
> easy to implement this; what methods should I be considering to do
> this in and what part of the code specifically should I be looking at?

Well, in theory a bunch of "small" requests to SYSCTL_PROC() nodes that use=
d=20
sysctl_wire_old() (or whatever it is called) could cause the amount of user=
=20
memory wired for sysctls to grow unbounded.  Thus, allowing this limited=20
concurrency is a tradeoff as there is a minimal (perhaps only theoretical a=
t=20
the moment) risk of removing the safety net.

The patch is quite small, btw, because the locking for the sysctl tree alre=
ady=20
exists, and by using read locks, one can already allow concurrent sysctl=20
requests.  There is no need to add any new locks or restructure the sysctl=
=20
tree, just to adjust the locking that is already present.  It might be=20
clearer, in fact, to split the sysctl memory lock back out into a separate=
=20
lock.  This would allow "small" sysctl requests to run concurrently with a=
=20
single "large" request whereas in my suggestion in the earlier e-mail,=20
the "large" request will block all other user requests until it finishes.

I've actually gone ahead and done this below.

=2D-- //depot/projects/smpng/sys/kern/kern_sysctl.c	2009/05/08 11:53:25
+++ //depot/user/jhb/lock/kern/kern_sysctl.c	2009/05/11 21:58:08
@@ -77,11 +77,12 @@
  * API rather than using the dynamic API.  Use of the dynamic API is
  * strongly encouraged for most code.
  *
=2D * This lock is also used to serialize userland sysctl requests.  Some
=2D * sysctls wire user memory, and serializing the requests limits the
=2D * amount of wired user memory in use.
+ * The sysctlmemlock is used to limit the amount of user memory wired for
+ * sysctl requests.  This is implemented by serializing any userland
+ * sysctl requests larger than a single page via an exclusive lock.
  */
 static struct sx sysctllock;
+static struct sx sysctlmemlock;
=20
 #define	SYSCTL_SLOCK()		sx_slock(&sysctllock)
 #define	SYSCTL_SUNLOCK()	sx_sunlock(&sysctllock)
@@ -543,6 +544,7 @@
 {
 	struct sysctl_oid **oidp;
=20
+	sx_init(&sysctlmemlock, "sysctl mem");
 	SYSCTL_INIT();
 	SYSCTL_XLOCK();
 	SET_FOREACH(oidp, sysctl_set)
@@ -1563,7 +1565,7 @@
     size_t *oldlenp, int inkernel, void *new, size_t newlen, size_t *retva=
l,
     int flags)
 {
=2D	int error =3D 0;
+	int error =3D 0, memlocked;
 	struct sysctl_req req;
=20
 	bzero(&req, sizeof req);
@@ -1603,14 +1605,20 @@
 	if (KTRPOINT(curthread, KTR_SYSCTL))
 		ktrsysctl(name, namelen);
 #endif
=2D=09
=2D	SYSCTL_XLOCK();
+
+	if (req.oldlen > PAGE_SIZE) {
+		memlocked =3D 1;
+		sx_xlock(&sysctlmemlock);
+	} else
+		memlocked =3D 0;
 	CURVNET_SET(TD_TO_VNET(curthread));
=20
 	for (;;) {
 		req.oldidx =3D 0;
 		req.newidx =3D 0;
+		SYSCTL_SLOCK();
 		error =3D sysctl_root(0, name, namelen, &req);
+		SYSCTL_SUNLOCK();
 		if (error !=3D EAGAIN)
 			break;
 		uio_yield();
@@ -1620,7 +1628,8 @@
=20
 	if (req.lock =3D=3D REQ_WIRED && req.validlen > 0)
 		vsunlock(req.oldptr, req.validlen);
=2D	SYSCTL_XUNLOCK();
+	if (memlocked)
+		sx_xunlock(&sysctlmemlock);
=20
 	if (error && error !=3D ENOMEM)
 		return (error);

=2D-=20
John Baldwin



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200905111801.18767.jhb>