Date: Thu, 18 Nov 1999 09:00:07 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
Message-Id: <199911181700.JAA85880@apollo.backplane.com>
To: Zach Brown <zab@zabbo.net>
Cc: freebsd-hackers@FreeBSD.ORG
Subject: Re: mbuf wait code (revisited) -- review? 
References:  <Pine.LNX.3.96.991118114107.30813W-100000@devserv.devel.redhat.com>

:> >(sysctl-ized) in FBSD (Some work has been done in Linux, since a
:> >well-known comparative benchmark offense). Would be even more useful
:> >in SMP context.
:
:I don't think the wake-many problem was ever the cause of the poor numbers
:that comparative benchmark unearthed.  This is only a problem if you have a
:whole slew of children sitting around waiting for new connections, rather
:than doing real work.  This sure isn't the environment a heavily loaded
:server is under :)  If you're still curious, check out
:
:http://www.kegel.com/mindcraft_redux.html
:
:specifically
:
:http://kernelnotes.org/lnxlists/linux-kernel/lk_9906_04/msg01100.html
:
:-- zach

    Well, the wake-many problem hit me several times at BEST both with
    Apache and with the WWW server I wrote.  We had the problem under both
    FreeBSD and IRIX.  These were heavily loaded web servers and the wakeup
    issue turned into an O(N^2) problem.  Every time a connection was
    accepted it woke up N processes where N effectively scaled to the 
    connection rate.  For example, shellx (the IRIX box) was getting 
    40 connections/second and had over 600 active connections at any given
    moment.  There would perhaps be 200 processes waiting to accept a new
    connection.  Without a fix the result was 200x40 = 8000 wakeups/second
    which is significant.  Not only that, but the wakeups themselves were
    O(N), resulting in an O(2*N^2) operation.  All sorts of scaling problems
    occurred inside the kernel!
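
    To make the arithmetic above concrete, here is a tiny sketch that
    compares waking every waiter against waking just one.  The function
    name and its parameters are made up for illustration; the numbers are
    the shellx figures quoted above.

```python
def wakeups_per_second(conn_rate, waiters, wake_all=True):
    """Process wakeups caused by incoming connections each second.

    wake_all=True  models a kernel that wakes every process blocked in
                   accept() for each new connection.
    wake_all=False models a wakeup_one-style kernel that wakes one process.
    """
    return conn_rate * (waiters if wake_all else 1)


# shellx: 40 connections/second, roughly 200 processes blocked in accept()
thundering_herd = wakeups_per_second(40, 200)               # 200 x 40
single_wakeup = wakeups_per_second(40, 200, wake_all=False)
```

    With wake-all semantics the count scales with both the connection rate
    and the number of idle waiters; with a single wakeup it scales with the
    connection rate alone.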

    The solution Apache takes is to surround the accept() with a file lock
    so only one process blocks in accept() at any given point (the file lock
    uses wakeup_one and is safe).
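
    A minimal sketch of that idea, assuming flock()-style advisory locks.
    This is not Apache's actual code; open_lock(), serialized_accept() and
    demo() are hypothetical names, and the demo runs in a single process
    just to show the call sequence.

```python
import fcntl
import os
import socket
import tempfile


def open_lock(path):
    # Each worker should open the lock file itself (after fork): flock()
    # locks belong to the open file description, so an inherited descriptor
    # would share one lock among all workers and serialize nothing.
    return os.open(path, os.O_RDWR | os.O_CREAT, 0o600)


def serialized_accept(listener, lock_fd):
    """Block on the file lock first, so only the lock holder sits in accept()."""
    fcntl.flock(lock_fd, fcntl.LOCK_EX)      # other workers queue up here
    try:
        return listener.accept()             # only one process waits here
    finally:
        fcntl.flock(lock_fd, fcntl.LOCK_UN)  # let the next worker in


def demo():
    lock_path = os.path.join(tempfile.mkdtemp(), "accept.lock")
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))          # port 0: let the kernel pick one
    listener.listen(16)
    # The connection completes in the listen backlog, so accept() below
    # returns without needing a second process.
    client = socket.create_connection(listener.getsockname())
    lock_fd = open_lock(lock_path)
    conn, peer = serialized_accept(listener, lock_fd)
    ok = peer == client.getsockname()
    conn.close()
    client.close()
    listener.close()
    os.close(lock_fd)
    return ok
```

    In a real pre-forked server every child would loop on
    serialized_accept(), releasing the lock before it starts servicing the
    connection so the next child can take its turn in accept().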

    The solution that I took with BestWWWD was to have just one process
    accept all the connections and then have it dole the descriptors out to
    the appropriate sub-processes over a unix-domain socket.
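
    Descriptor passing of this kind is normally done with SCM_RIGHTS
    ancillary data on the unix-domain socket.  The following is only a
    sketch of the mechanism, not BestWWWD's code: send_fd(), recv_fd() and
    demo_handoff() are hypothetical names, and both the "dispatcher" and
    the "worker" live in one process here to keep the example
    self-contained.

```python
import array
import socket


def send_fd(chan, fd):
    """Pass one open descriptor over a unix-domain socket (SCM_RIGHTS)."""
    chan.sendmsg([b"F"],  # at least one byte of ordinary data must be sent
                 [(socket.SOL_SOCKET, socket.SCM_RIGHTS, array.array("i", [fd]))])


def recv_fd(chan):
    """Receive one descriptor sent by send_fd()."""
    fds = array.array("i")
    _msg, ancdata, _flags, _addr = chan.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Drop any trailing partial int before decoding.
            fds.frombytes(data[:len(data) - (len(data) % fds.itemsize)])
    return fds[0]


def demo_handoff():
    # Channel between the accepting process and a worker.
    dispatcher_end, worker_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(16)
    client = socket.create_connection(listener.getsockname())

    # Dispatcher side: the only place accept() is ever called.
    conn, _peer = listener.accept()
    send_fd(dispatcher_end, conn.fileno())
    conn.close()                      # the worker now holds its own copy

    # Worker side: rebuild a socket object around the received descriptor.
    worker_conn = socket.socket(fileno=recv_fd(worker_end))
    worker_conn.sendall(b"hello from worker\n")
    worker_conn.close()

    reply = b""
    while True:                       # read until the worker's close (EOF)
        chunk = client.recv(64)
        if not chunk:
            break
        reply += chunk
    for s in (client, listener, dispatcher_end, worker_end):
        s.close()
    return reply
```

    Since only the dispatcher ever blocks in accept(), there is exactly one
    waiter to wake per connection no matter how many workers exist.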

						-Matt

