Date: Tue, 19 Jun 2001 09:53:11 -0700 (PDT)
From: Matt Dillon <dillon@earth.backplane.com>
To: "Ashutosh S. Rajekar" <asr@softhome.net>
Cc: freebsd-hackers@FreeBSD.ORG
Subject: Re: max kernel memory
Message-ID: <200106191653.f5JGrBW30734@earth.backplane.com>
References: <Pine.LNX.4.21.0106191316540.1193-100000@vangogh.indranetworks.com>
:Well, we are building a web accelerator box called WebEnhance, that would
:support around a million TCP/IP connections (brag .. brag..). It would
:selectively function as a Layer 2/4/7 switch. And it's going to run a
:kernel proxy, and probably nothing significant in user mode. It might be
:diskless or diskfull, depending on how much stuff real webservers throw at
:it. If we can't realise the addition of more memory for the kernel, then
:I can say that a disk-based cache or user-level daemons acting as cache
:controllers are in the picture.
:
:I can make these assumptions about memory requirements:
:socket structures: 200Mb
:mbufs + clusters: 400Mb
:TCP/IP PCBs : ???
:HTTP requests+responses: 100-200Mb
:misc structures: 50Mb
:
:I can't imagine more than 10% of all connections being in the active state
:at a given point in time - i.e. really sending and receiving data. I am
:not really worried about network/CPU limitations; even keeping a 100Mbps
:ethernet at full load is gonna be a challenge. Disks are really slow, so
:unless some sweeping algorithm is in place, it really means writing HTTP
:requests/responses all over the disk, and latency, and with it throughput,
:becomes a big issue.
:
:-ASR

    Sounds pretty cool! I did a diskless web proxy at BEST Internet to handle connections to ~user web sites. We had three front-end machines accepting web connections and then turning around and making LAN connections to the actual web servers (30 or so web servers). A pure diskless proxy can handle a huge number of connections. Ours never had to deal with more than a few thousand distributed across the three boxes, and we ran our incoming SMTP proxy on the same boxes (which queued and then forwarded the mail to the shell machines).
    You have to decide whether you are going to go with a disk cache or not before you design the system, but from the looks of it it sounds to me that you will *NOT* want a disk cache -- certainly not if you are trying to run that many connections through the box. If you think about it, the absolute best a hard drive can do is around a 4ms random seek. This comes to around 250 seeks/reads per second, which means that a caching proxy, assuming the data is not cached in ram, will not be able to handle more than 250 requests/sec. With the connection load you want to handle, the chance of the data being cacheable in ram is fairly low. So a disk-based caching proxy will drop connection performance by two orders of magnitude.

    For the diskless case I don't know if you can make it to a million simultaneous connections, but Terry has gotten his boxes to do a hundred thousand, so we know that at least is doable.

    But rather than spend a huge amount of time trying to max out the connections, you might want to consider distributing the connection load across a farm of front-end machines. The cost is virtually nothing, especially if the machines do not need a lot of disk. At BEST I used a simple DNS round-robin. For a more robust solution, a Cisco redirector or some similar piece of hardware can distribute the connections. This also makes maintenance a whole lot easier - you can take individual machines out of the cluster and then take them down for arbitrary periods of time for maintenance.

    I would also mock the system up using user-mode processes first, to get an idea of the resource requirements, before spending a lot of time writing kernel modules.

						-Matt

To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-hackers" in the body of the message