From: Simon Lai
Reply-To: simon@synatech.com.au
To: freebsd-questions@freebsd.org
Date: Fri, 3 Sep 2004 22:07:35 +1000 (EST)
Subject: 100,000 TCP connections - kernel tuning advice wanted

Hi all,

As part of a team, I am working on a TCP multiplexor using FreeBSD. On side A we have 100,000 TCP connections accepting packets, which are multiplexed onto a single TCP connection on side B. Packets going B->A are demultiplexed in the reverse direction.

Info:

- The FreeBSD version is 5.2-RELEASE. The kernel has been recompiled with DEVICE_POLLING and unused devices removed. The HZ parameter has been varied through 1000, 2000 and 4000, but this does not significantly alter our results. We have also experimented with the idle and trap sysctls for polling.
- Our network card is an Intel EtherExpress Pro, running at 100 Mbit/s.
- UDP is not an option for us.
- Average payload size is 50-100 bytes. The payload is preceded by a 32-bit value giving the size of the payload, so reading is a matter of grabbing the size, allocating a buffer and then doing the read. Minimal processing is done on the packet.
- We are using our own specialized memory management. We use writev and readv wherever possible.
- Socket buffers have been increased to 1 MB on the B side, but are the default size on side A.
- We are using kevent/kqueue - this task would be impossible without them.
- Our current test box has 1.5 GB RAM and a 1 GHz Athlon CPU. While we might go for a faster CPU, we would like to keep within our current RAM constraints.
- Side A is connected to a test client, which has 20% idle time.
- Side B is connected via a switch to another test box, which just echoes the packets back for testing purposes. It has significant idle time.
- Our current rough measurements, using top, show 30% user time and 60% kernel time while this app is running. The multiplexing app is the only app running on the machine. The machine is CPU bound - the multiplexing requires no disk I/O.

Currently we are getting 4000-6000 packets/sec unidirectional throughput, depending on the mix of packet types/sizes. This rises to 5000-7000 packets/sec with 50,000 connections.

We are seeking advice on which kernel tunables we can tweak to improve packet throughput. The constants are TCP, 100,000 connections and 50-100 byte packet sizes.

All help appreciated.

Regards,
Simon
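
For anyone following along, here is a minimal sketch in C of the kind of kevent/kqueue loop described above. This is not the poster's code: the batch size, the handle_readable() callback and the error handling are illustrative assumptions.

    #include <sys/types.h>
    #include <sys/event.h>
    #include <sys/time.h>
    #include <err.h>

    /* Hypothetical callback, not part of the post. */
    void handle_readable(int fd, size_t bytes_ready);

    /* Register one descriptor for read events with the kqueue. */
    static void
    watch_read(int kq, int fd)
    {
        struct kevent kev;

        EV_SET(&kev, fd, EVFILT_READ, EV_ADD, 0, 0, NULL);
        if (kevent(kq, &kev, 1, NULL, 0, NULL) == -1)
            err(1, "kevent: EV_ADD fd %d", fd);
    }

    /* Event loop: block until some of the 100,000 sockets are readable. */
    static void
    event_loop(int kq)
    {
        struct kevent events[256];      /* batch size is arbitrary */
        int i, n;

        for (;;) {
            n = kevent(kq, NULL, 0, events, 256, NULL);
            if (n == -1)
                err(1, "kevent");
            for (i = 0; i < n; i++) {
                int fd = (int)events[i].ident;

                /* For EVFILT_READ, events[i].data is the number of
                 * bytes already queued on the socket, so a reader can
                 * defer the read until a complete record is present. */
                handle_readable(fd, (size_t)events[i].data);
            }
        }
    }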
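
A sketch of the read-and-forward path implied by the 32-bit length prefix and readv/writev follows. Again, this is only illustrative: the network byte order of the prefix, the B-side header carrying a connection id, the malloc() in place of the custom allocator, and the assumption that a whole record is already queued are simplifications, not details from the post.

    #include <sys/types.h>
    #include <sys/uio.h>
    #include <arpa/inet.h>      /* ntohl/htonl */
    #include <stdint.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <err.h>

    /*
     * Pull one length-prefixed record off an A-side socket and forward it
     * onto the single B-side connection.  The header with a connection id
     * is an assumption -- the post does not say how demultiplexing
     * information is encoded on side B.
     */
    static void
    forward_record(int a_fd, int b_fd, uint32_t conn_id)
    {
        uint32_t netlen, len;
        uint32_t hdr[2];
        struct iovec iov[2];
        char *payload;

        /* 32-bit size prefix, assumed to be in network byte order. */
        if (read(a_fd, &netlen, sizeof(netlen)) != sizeof(netlen))
            err(1, "short read on length prefix");
        len = ntohl(netlen);

        payload = malloc(len);          /* real app: specialized allocator */
        if (payload == NULL)
            err(1, "malloc");
        if (read(a_fd, payload, len) != (ssize_t)len)
            err(1, "short read on payload");

        /* Gather header + payload into a single writev so side B sees
         * one record without an intermediate copy. */
        hdr[0] = htonl(conn_id);
        hdr[1] = htonl(len);
        iov[0].iov_base = hdr;
        iov[0].iov_len = sizeof(hdr);
        iov[1].iov_base = payload;
        iov[1].iov_len = len;
        if (writev(b_fd, iov, 2) == -1)
            err(1, "writev");

        free(payload);
    }

A production version would handle partial reads and writes on non-blocking sockets; kevent's data field (shown in the loop above) is what makes deferring the read until a full record is queued cheap.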
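
On the tunables themselves, the usual starting points on a 5.x box for this connection count are sketched below. The names come from the 5.x sysctl tree; verify each one on 5.2-RELEASE with sysctl -a / sysctl -d, and treat the values as illustrative rather than recommendations.

    # /boot/loader.conf -- limits that must be set before boot
    kern.ipc.maxsockets=131072      # comfortably above 100,000 sockets
    kern.ipc.nmbclusters=65536      # mbuf clusters; watch netstat -m
    kern.maxusers=512               # scales several other kernel tables

    # /etc/sysctl.conf -- runtime knobs
    kern.maxfiles=200000
    kern.maxfilesperproc=150000
    kern.ipc.somaxconn=4096         # listen backlog while the 100,000 connect
    kern.ipc.maxsockbuf=2097152     # needed for the 1 MB B-side buffers
    net.inet.tcp.sendspace=16384    # small per-connection defaults on side A
    net.inet.tcp.recvspace=16384    #   to keep worst-case buffering in 1.5 GB
    net.inet.tcp.delayed_ack=0      # worth measuring both ways at 50-100 bytes
    kern.polling.enable=1           # with DEVICE_POLLING in the kernel config
    kern.polling.user_frac=50       # user/kernel split for the polling loop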