From owner-freebsd-arch@FreeBSD.ORG  Mon Dec 13 11:32:03 2004
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 339F916A4CE; Mon, 13 Dec 2004 11:32:03 +0000 (GMT)
Received: from moutng.kundenserver.de (moutng.kundenserver.de
	[212.227.126.191])	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 84E7043D39; Mon, 13 Dec 2004 11:32:02 +0000 (GMT)
	(envelope-from max@love2party.net)
Received: from [212.227.126.208] (helo=mrelayng.kundenserver.de)
	by moutng.kundenserver.de with esmtp (Exim 3.35 #1)
	id 1CdoQr-0000EZ-00; Mon, 13 Dec 2004 12:32:01 +0100
Received: from [217.83.4.147] (helo=donor.laier.local)
	by mrelayng.kundenserver.de with asmtp (TLSv1:RC4-MD5:128)
	(Exim 3.35 #1)
	id 1CdoQr-00018g-00; Mon, 13 Dec 2004 12:32:01 +0100
From: Max Laier <max@love2party.net>
To: freebsd-current@freebsd.org
Date: Mon, 13 Dec 2004 12:32:47 +0100
User-Agent: KMail/1.7.1
References: <94AE3F5A-4CD6-11D9-8BD6-000A95C4D7BC@FreeBSD.org>
	<200412130913.20215.max@love2party.net>
In-Reply-To: <200412130913.20215.max@love2party.net>
MIME-Version: 1.0
Content-Type: multipart/signed;
  boundary="nextPart1147140.QXZ1z243kW";
  protocol="application/pgp-signature";
  micalg=pgp-sha1
Content-Transfer-Encoding: 7bit
Message-Id: <200412131232.55051.max@love2party.net>
X-Provags-ID: kundenserver.de abuse@kundenserver.de
	auth:61c499deaeeba3ba5be80f48ecc83056
cc: Suleiman Souhlal <ssouhlal@freebsd.org>
cc: freebsd-arch@freebsd.org
Subject: Re: sysctl locking
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 13 Dec 2004 11:32:03 -0000

--nextPart1147140.QXZ1z243kW
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

On Monday 13 December 2004 09:13, I wrote:
> On Monday 13 December 2004 08:14, Suleiman Souhlal wrote:
> > Hello,
> >
> > The patch at http://people.freebsd.org/~ssouhlal/sysctl-locking.diff
> > removes Giant from the sysctl subsystem.
> > I tested it on i386 and powerpc, where it appears to work perfectly.
> > However, I have not been able to test it on an SMP box, as I don't have
> > access to any.
> > So I would appreciate it if someone would test it, and report any
> > problems.
> > I will commit it in about week, unless, of course, there are
> > objections/problems.
>
> Wait a minute ... you can't just assert that all sysctl handler are MPSAF=
E.
> It's a good idea to introduce "real" locking for the sysctl-tree handling
> in order to be able to lose Giant at a later point, but I *strongly*
> suggest that you keep on grabing Giant before calling oid_handler() in
> sysctl_root(). It doesn't seem like you do so, right now.
>
> You have identified two places where Giant is explicitly asserted, but I =
am
> afraid that there are much more handlers that don't assert Giant but need
> it. Moreover, the "simple" handler might also write to memory that is
> implicitly protected by Giant and should not be modified without it.
>
> As a transition step I suggest that we extend the API in the way the
> callout API works. So that you can ask for a Giant-free handler call by
> setting an MPSAFE flag.

This got me wondering a bit how well we protect sysctls at the moment or if=
 we=20
have code that just assumes atomicy for sysctls. As an example of a very we=
ll=20
protected sysctl there is kern.securelevel, which has it's own handler that=
=20
uses it's own mutex to protect the change and a defined API to read the val=
ue=20
(useing the same mutex to protect the access). Other values are not so well=
=20
protected.

It is safe for some sysctls - with just on/off semantic (net.inet.forwardin=
g=20
e.g.) - to read them unprotected. As the user can decide to flip the switch=
=20
in a loop, the code should cope with a (short) unstable value anyway.

Sysctls that describe maximum buffer sizes, portranges or maximal number of=
=20
threads per process are more dangerous and might need attention. With bad=20
timing we might have very undesired effects.

The "complex" sysctls that implement their own handlers are not a concern, =
as=20
they are usually implemented within the .c file that has the appropriate=20
mutex and can use proper locking if required.

If we come to the conclusion that it is required to protect these values=20
better, I suggest the following:

1) Extend sysctl_add_oid() to accept an additional mutex argument.
2) Extend the simple sysctl handler to use this mutex to protect the actual=
=20
   write(?read?). We must not hold the mutex during the useland copy in/out=
 so=20
   we must move to temporary storage.
3) To maintain the current API and behavior we use &Giant as the default=20
   fallback argument. This might need some extension for complex handler (i=
=2Ee.=20
   no mutex given -> acquire Giant before calling the complex handler).

What do people think of this? Does it make any sense? Should we be concerne=
d=20
at all? Does the extension make sense? Comments?

[CCing -arch]

=2D-=20
/"\  Best regards,                      | mlaier@freebsd.org
\ /  Max Laier                          | ICQ #67774661
 X   http://pf4freebsd.love2party.net/  | mlaier@EFnet
/ \  ASCII Ribbon Campaign              | Against HTML Mail and News

--nextPart1147140.QXZ1z243kW
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (FreeBSD)

iD8DBQBBvX3nXyyEoT62BG0RAlAIAJ9roEIb8kMmiDcHxhHtlDfZCHp4zQCcDhtC
kfQ6MAunbImQMOfBuqI+1eQ=
=RnXa
-----END PGP SIGNATURE-----

--nextPart1147140.QXZ1z243kW--