Date: Thu, 18 Nov 1999 09:00:07 -0800 (PST)
From: Matthew Dillon
Message-Id: <199911181700.JAA85880@apollo.backplane.com>
To: Zach Brown
Cc: freebsd-hackers@FreeBSD.ORG
Subject: Re: mbuf wait code (revisited) -- review?

:> >(sysctl-ized) in FBSD (Some work has been done in Linux, since a
:> >well-known comparative benchmark offense). Would be even more useful
:> >in SMP context.
:
:I don't think the wake-many problem was ever the cause of the poor numbers
:that comparative benchmark unearthed. This is only a problem if you have a
:whole slew of children sitting around waiting for new connections, rather
:than doing real work. This sure isn't the environment a heavily loaded
:server is under :) If you're still curious, check out
:
:http://www.kegel.com/mindcraft_redux.html
:
:specifically
:
:http://kernelnotes.org/lnxlists/linux-kernel/lk_9906_04/msg01100.html
:
:-- zach

Well, the wake-many problem hit me several times at BEST, both with
Apache and with the WWW server I wrote. We had the problem under both
FreeBSD and IRIX. These were heavily loaded web servers and the wakeup
issue turned into an O(N^2) problem. Every time a connection was
accepted it woke up N processes, where N effectively scaled to the
connection rate.

For example, shellx (the IRIX box) was getting 40 connections/second
and had over 600 active connections at any given moment. There would
perhaps be 200 processes waiting to accept a new connection.
Without a fix the result was 200 x 40 = 8000 wakeups/second, which is
significant. Not only that, but the wakeups themselves were O(N),
resulting in O(2*N^2) operation. All sorts of scaling problems occurred
inside the kernel!

The solution Apache takes is to surround the accept() with a file lock
so only one process blocks in accept() at any given point (the file
lock uses wakeup_one and is safe).

The solution that I took with BestWWWD was to have just one process
accept all the connections and then have it dole the descriptor out to
the appropriate sub-processes over a UNIX-domain socket.

-Matt
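
The descriptor hand-off described above for BestWWWD can be sketched
roughly as follows. This is not the BestWWWD code, which is not shown
in this message; it is a minimal illustration that assumes the usual
SCM_RIGHTS ancillary-data mechanism on a UNIX-domain socket. The helper
names send_fd() and recv_fd() are hypothetical and chosen only for this
example.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized for exactly one descriptor. The union forces
 * the alignment that struct cmsghdr requires. */
union fd_ctl {
    struct cmsghdr hdr;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Hypothetical helper: send descriptor `fd` over the UNIX-domain
 * socket `chan`. Returns 0 on success, -1 on error. */
int send_fd(int chan, int fd)
{
    char byte = 'F';  /* one byte of ordinary data carries the control message */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;
    struct cmsghdr *cm;

    memset(&msg, 0, sizeof(msg));
    memset(&ctl, 0, sizeof(ctl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    cm = CMSG_FIRSTHDR(&msg);
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type = SCM_RIGHTS;          /* "this message carries descriptors" */
    cm->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cm), &fd, sizeof(int));

    return sendmsg(chan, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one descriptor from `chan`.
 * Returns the new descriptor, or -1 on error. */
int recv_fd(int chan)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;
    struct cmsghdr *cm;
    int fd;

    memset(&msg, 0, sizeof(msg));
    memset(&ctl, 0, sizeof(ctl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    if (recvmsg(chan, &msg, 0) != 1)
        return -1;

    cm = CMSG_FIRSTHDR(&msg);
    if (cm == NULL || cm->cmsg_level != SOL_SOCKET ||
        cm->cmsg_type != SCM_RIGHTS ||
        cm->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cm), sizeof(int));
    return fd;
}
```

In a layout like the one described, the single accepting process would
call send_fd(chan, conn) after each accept() and then close its own
copy of conn, while each sub-process blocks in recv_fd() on its own
channel. Only one process ever sleeps in accept(), so a new connection
wakes one process instead of N.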