Message-ID: <4F96A7C0.3010909@freebsd.org>
Date: Tue, 24 Apr 2012 15:16:48 +0200
From: Andre Oppermann
To: Luigi Rizzo
Cc: current@freebsd.org, net@freebsd.org
Subject: Re: Some performance measurements on the FreeBSD network stack
References: <20120419133018.GA91364@onelab2.iet.unipi.it> <4F907011.9080602@freebsd.org> <20120419204622.GA94904@onelab2.iet.unipi.it>
In-Reply-To: <20120419204622.GA94904@onelab2.iet.unipi.it>

On 19.04.2012 22:46, Luigi Rizzo wrote:
> On Thu, Apr 19, 2012 at 10:05:37PM +0200, Andre Oppermann wrote:
>> On 19.04.2012 15:30, Luigi Rizzo wrote:
>>> I have been running some performance tests on UDP sockets,
>>> using the netsend program in tools/tools/netrate/netsend
>>> and instrumenting the source code and the kernel to return in
>>> various points of the path.
>>> Here are some results which
>>> I hope you find interesting.
>>> - another big bottleneck is the route lookup in ip_output()
>>>   (between entries 51 and 56). Not only it eats another
>>>   100ns+ on an empty routing table, but it also
>>>   causes huge contentions when multiple cores
>>>   are involved.
>>
>> This is indeed a big problem. I'm working (rough edges remain) on
>> changing the routing table locking to an rmlock (read-mostly) which
>
> i was wondering, is there a way (and/or any advantage) to use the
> fastforward code to look up the route for locally sourced packets ?

I've finished updating the routing table rmlock patch. There are
two steps. Step one just changes the rwlock to an rmlock. Step two
streamlines the route lookup in ip_output and ip_fastfwd by copying
out the relevant data while holding only the rmlock, instead of
obtaining a reference to the route.

It would be very interesting to see how your benchmark/profiling
results change with these patches applied.

http://svn.freebsd.org/changeset/base/234649
Log:
  Change the radix head lock to an rmlock (read mostly lock).

  There is some header pollution going on because rmlocks are
  not entirely abstracted and need per-CPU structures.

  A comment in _rmlock.h says this can be hidden if there were
  per-cpu linker magic/support. I don't know if we have that
  already.

http://svn.freebsd.org/changeset/base/234650
Log:
  Add a function rtlookup() that copies out the relevant information
  from an rtentry instead of returning the rtentry. This avoids the
  need to lock the rtentry and to increase the refcount on it.

  Convert ip_output() to use rtlookup() in a simplistic way. Certain
  seldom used functionality may not work anymore and the flowtable
  isn't available at the moment.

  Convert ip_fastfwd() to use rtlookup().

  This code is meant to be used for profiling and to be experimented
  with further to determine which locking strategy returns the best
  results.
Make sure to apply this one as well:

http://svn.freebsd.org/changeset/base/234648
Log:
  Add INVARIANT and WITNESS support to rm_lock locks and optimize the
  synchronization path by replacing a LIST of active readers with a
  TAILQ.

  Obtained from: Isilon
  Submitted by:  mlaier

--
Andre