Date: Thu, 20 Dec 2001 10:51:52 -0800 (PST)
From: Julian Elischer
To: Poul-Henning Kamp
Cc: arch@freebsd.org
Subject: Re: Kernel stack size and stacking: do we have a problem ?

Netgraph has a bounding scheme that Archie and I came up with, but it has
not been committed yet.  Basically, in the -current version, the mbufs are
passed with an iteration counter: if you directly execute another module
you increment it, and if you queue the item you clear it to 0.  Once the
counter reaches a limit of N, the subsystem will queue the item rather
than try to run the next layer directly.

I have code to do that here and I've been thinking about checking it in.

On Thu, 20 Dec 2001, Poul-Henning Kamp wrote:

>
> As most of you have probably heard, I'm working on a stacking
> disk I/O layer (http://freefall.freebsd.org/~phk/Geom).
>
> This is, as far as I know, only the third freely stackable subsystem
> in the kernel, the first two being VFS/filesystems and netgraph.
>
> The problem with stacking layered systems is that the naïve and
> simple implementation, just calling into the layer below, has
> basically unbounded kernel stack usage.
>
> Fortunately for us, neither VFS nor netgraph has had too much use
> yet, so we have not been excessively bothered by people running
> out of kernel stack.
>
> It is well documented how to avoid the unbounded stack usage for
> such setups: simply queue the requests at each "gadget" and run
> a scheduler, but this is nowhere near as simple nor as fast as the
> direct call.
>
> So I guess we need to ask ourselves the following questions:
>
> 1. What do we do when people start to run out of kernel stack
>    because they stack filesystems ?
> 	a) Tell them not to.
> 	b) Tell them to increase UPAGES.
> 	c) Increase default UPAGES.
> 	d) Redesign VFS/VOP to avoid the problem.
>
> 2. Do we in general want to incur the overhead of scheduling
>    in stacking layers, or does increasing the kernel stack as
>    needed make more sense ?
>
> 3. Would it be possible to make kernel stack size a sysctl ?
>
> 4. Would it make sense to build intelligent kernel-stack
>    overflow handling into the kernel, rather than "handling"
>    this with a panic ?
>
> It should be trivially simple to make a function called
> enough_stack() which would return false if we were in the
> danger zone.  This function could then be used to fail
> intelligently at strategic high-risk points in the kernel:
>
> 	int
> 	somefunction(...)
> 	{
> 		...
>
> 		if (!enough_stack())
> 			return (ENOMEM);
> 		...
> 	}
>
> Think about it...
>
> --
> Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
> phk@FreeBSD.ORG         | TCP/IP since RFC 956
> FreeBSD committer       | BSD since 4.3-tahoe
> Never attribute to malice what can adequately be explained by incompetence.
To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message