From owner-freebsd-net@FreeBSD.ORG Mon Dec 13 17:30:57 2004
Message-ID: <41BDD1C7.7060105@freebsd.org>
Date: Mon, 13 Dec 2004 18:30:47 +0100
From: Andre Oppermann
To: freebsd-current@freebsd.org
cc: freebsd-net@freebsd.org
cc: gallatin@cs.duke.edu
Subject: Re: Rewritten TCP reassembly
References: <41BA0088.9000107@freebsd.org>
In-Reply-To: <41BA0088.9000107@freebsd.org>

Andre Oppermann wrote:
> I've totally rewritten the TCP reassembly function to be a lot more
> efficient.  In tests with normal bw*delay products and packet loss
> plus severe reordering I've measured an improvement of at least 30% in
> performance.  For high and very high bw*delay product links the
> performance improvement is most likely much higher.
>
> The main property of the new code is O(1) insert for 95% of all normal
> reassembly cases.  If there is more than one hole the insert time is
> O(holes).  If a packet arrives that closes a hole the chains to the left
> and right are merged.  Artificially constructed worst case is O(n).  No
> mallocs are done for new segments.  The old code was O(n) in all cases
> plus n*malloc for a describing structure.
>
> There are some problems with the new code I will fix before committing
> it to the tree.  One is that it can't handle non-writeable mbufs and the
> other is too little leading space in the mbuf (found only on the loopback
> interface, but there we don't have packet loss).  Once these two are
> dealt with it is ready to go in.
>
> Nothing is perfect and this code is only a first significant step over
> what we have currently in the tree, especially for transfers over lossy
> (wireless) and high speed links with and without packet reordering.
> I have the next steps already in the works which will further optimize
> (worst case O(windowsize/mclusters) instead of O(n)) and simplify a bit
> more again.
>
> The patch can be found here:
>
>   http://www.nrg4u.com/freebsd/tcp_reass-20041210.patch
>
> Please test and report good and bad news back.

I've got some excellent review feedback from Mike Spengler and he found
an off-by-one queue limit tracking error.

  http://www.nrg4u.com/freebsd/tcp_reass-20041213.patch

Please test again.

--
Andre
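
As a rough illustration of the insert behaviour described in the quoted
message (O(1) at the tail, O(holes) otherwise, with the neighbouring
chains merged when a hole closes), here is a minimal Python sketch.  It
is not the code from the patch: the class name ReassemblyQueue, the
blocks list, and the use of plain byte ranges in place of mbuf chains are
all assumptions made for illustration.

```python
class ReassemblyQueue:
    """Toy model of a reassembly queue (hypothetical, not the patch).

    Tracks contiguous received byte ranges as [start, end) pairs.
    The gaps between consecutive ranges are the "holes".
    """

    def __init__(self):
        # Sorted, non-overlapping, non-adjacent [start, end] pairs.
        self.blocks = []

    def insert(self, start, length):
        end = start + length
        blocks = self.blocks

        # Fast path, O(1): the segment lands beyond the last block
        # (opens a new hole) or exactly at its end (extends it).
        if not blocks or start > blocks[-1][1]:
            blocks.append([start, end])
            return
        if start == blocks[-1][1]:
            blocks[-1][1] = end
            return

        # Slow path: walk the blocks, so the cost grows with the
        # number of holes rather than the number of segments.
        merged = []
        i = 0
        n = len(blocks)
        # Keep blocks that end strictly before the new segment.
        while i < n and blocks[i][1] < start:
            merged.append(blocks[i])
            i += 1
        # Absorb every block the segment overlaps or touches; this is
        # where the left and right neighbours merge if a hole closes.
        while i < n and blocks[i][0] <= end:
            start = min(start, blocks[i][0])
            end = max(end, blocks[i][1])
            i += 1
        merged.append([start, end])
        merged.extend(blocks[i:])
        self.blocks = merged
```

With segments arriving as (0,100), (100,100), (300,100) the first two
take the fast path and the third opens a hole at 200..300; a later
(200,100) takes the slow path and leaves a single block [0, 400].  A
Python list is rebuilt on the slow path for brevity; a linked structure
would be needed to keep that step proportional to the holes actually
visited.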