Message-ID: <45E6E2E8.5060408@freebsd.org>
Date: Thu, 01 Mar 2007 15:27:52 +0100
From: Andre Oppermann
To: freebsd-current@freebsd.org, freebsd-net@freebsd.org
Cc: gallatin@freebsd.org, rwatson@freebsd.org, kmacy@freebsd.org
Subject: Large TCP send socket buffer optimizations

With TCP socket buffer autosizing and generally larger socket buffers
for high-bandwidth, high-delay connections, tcp_output() has become
increasingly inefficient at sending segments. For every segment sent
it traverses the socket buffer mbuf chain from the beginning until it
finds the offset to continue from, and that offset is usually close to
the end of the chain. Once the walk gets past a few dozen mbufs it
starts to thrash the CPU caches and performance falls off.

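To make the cost concrete, here is a minimal userland sketch of that per-segment walk. It is not the kernel code: struct mbuf is cut down to two fields, and chain_seek() and its visited counter are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a kernel mbuf: only what the walk needs. */
struct mbuf {
	struct mbuf	*m_next;	/* next buffer in the chain */
	size_t		 m_len;		/* bytes of data in this buffer */
};

/*
 * Hypothetical helper: find the mbuf holding byte offset `off`,
 * always starting from the head of the chain.  *moff receives the
 * offset inside that mbuf, *visited the number of mbufs inspected.
 * Returns NULL if `off` lies beyond the end of the chain.
 */
static struct mbuf *
chain_seek(struct mbuf *head, size_t off, size_t *moff, size_t *visited)
{
	struct mbuf *m;

	assert(moff != NULL && visited != NULL);
	*visited = 0;
	for (m = head; m != NULL; m = m->m_next) {
		(*visited)++;
		if (off < m->m_len)
			break;		/* target byte is in this mbuf */
		off -= m->m_len;	/* skip this mbuf entirely */
	}
	*moff = off;
	return (m);
}
```

With a 64-mbuf chain and a send offset in the last mbuf, every call inspects all 64 mbufs, and that walk is repeated for each segment sent.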
This patch solves the problem by maintaining an offset pointer in the
socket buffer that gives tcp_output() the closest mbuf right away,
avoiding the traversal from the beginning. With this patch we should
be able to compete nicely for the Internet land speed record again.

The patch is here:

  http://people.freebsd.org/~andre/sockbuf_sndptr-20070301.diff

Any testing, especially on 10Gig cards, and feedback is appreciated.

--
Andre
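
For readers who want the idea without reading the diff, the offset pointer can be sketched in userland roughly as follows. This is an illustration under assumptions, not the patch itself: the hint and hint_off fields and the sb_seek() function are made-up names, and the structures are reduced to the minimum.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures. */
struct mbuf {
	struct mbuf	*m_next;	/* next buffer in the chain */
	size_t		 m_len;		/* bytes of data in this buffer */
};

struct sockbuf {
	struct mbuf	*sb_mb;		/* head of the mbuf chain */
	struct mbuf	*hint;		/* hypothetical: mbuf found last time */
	size_t		 hint_off;	/* hypothetical: chain offset of hint */
};

/*
 * Hypothetical lookup: find the mbuf holding byte offset `off`,
 * resuming from the cached position when the target is not behind it.
 * *visited counts the mbufs inspected by this call.
 */
static struct mbuf *
sb_seek(struct sockbuf *sb, size_t off, size_t *moff, size_t *visited)
{
	struct mbuf *m = sb->sb_mb;
	size_t base = 0;		/* chain offset where `m` starts */

	assert(moff != NULL && visited != NULL);
	if (sb->hint != NULL && off >= sb->hint_off) {
		m = sb->hint;
		base = sb->hint_off;
	}
	*visited = 0;
	for (; m != NULL; m = m->m_next) {
		(*visited)++;
		if (off - base < m->m_len)
			break;		/* target byte is in this mbuf */
		base += m->m_len;
	}
	if (m != NULL) {		/* remember where we got to */
		sb->hint = m;
		sb->hint_off = base;
	}
	*moff = off - base;
	return (m);
}
```

After one full walk, each later lookup for a slightly larger offset inspects only one or two mbufs. The sketch leaves out what happens when acknowledged data is dropped from the front of the buffer; a real implementation would have to adjust or reset the cached position at that point.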