From: Luigi Rizzo
Date: Fri, 20 Apr 2012 21:03:09 +0200
To: net@freebsd.org, current@freebsd.org
Subject: more network performance info: ether_output()
Message-ID: <20120420190309.GA5617@onelab2.iet.unipi.it>
In-Reply-To: <20120419133018.GA91364@onelab2.iet.unipi.it>

Continuing my profiling of network performance, another place where
we waste a lot of time is if_ethersubr.c::ether_output().

From the beginning of ether_output() to the final call to
ether_output_frame(), the code takes slightly more than 210ns on my
i7-870 CPU running at 2.93 GHz + TurboBoost. In particular:

- the route does not have a MAC address (lle) attached, which causes
  arpresolve() to be called every time. This consumes about 100ns.
  It also happens with locally sourced TCP. Using the flowtable
  cuts this time down to about 30-40ns.

- another 100ns is spent copying the MAC header into the mbuf and
  then checking whether a local copy should be looped back.
  Unfortunately the code here is a bit convoluted, so the header
  fields are copied twice, using memcpy on the individual pieces.

Note that all of the above happens not just with my UDP flooding
tests, but also with regular TCP traffic.

cheers
luigi
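
To illustrate the header-copy point above, here is a minimal userland
sketch. It is not the code in if_ethersubr.c. It contrasts filling the
14-byte Ethernet header field by field with copying a prebuilt header
in one fixed-size memcpy. All names (struct eth_hdr, fill_piecewise,
build_template, fill_from_template) are hypothetical, and caching a
header template per destination is an assumption made for the example,
not something proposed in the message.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6   /* bytes in a MAC address */
#define ETH_HLEN 14  /* dst + src + ethertype */

/* Hypothetical stand-in for the kernel's Ethernet header layout. */
struct eth_hdr {
	uint8_t dst[ETH_ALEN];
	uint8_t src[ETH_ALEN];
	uint8_t type[2];	/* ethertype, network byte order */
};

static_assert(sizeof(struct eth_hdr) == ETH_HLEN, "header must be 14 bytes");

/* Piecewise fill: one memcpy per field, done for every packet. */
void
fill_piecewise(uint8_t *frame, const uint8_t *dst, const uint8_t *src,
    uint16_t ethertype)
{
	uint8_t type_be[2];

	type_be[0] = (uint8_t)(ethertype >> 8);
	type_be[1] = (uint8_t)(ethertype & 0xff);

	memcpy(frame, dst, ETH_ALEN);
	memcpy(frame + ETH_ALEN, src, ETH_ALEN);
	memcpy(frame + 2 * ETH_ALEN, type_be, sizeof(type_be));
}

/* Build the header once, e.g. when the destination is first resolved. */
void
build_template(struct eth_hdr *tmpl, const uint8_t *dst, const uint8_t *src,
    uint16_t ethertype)
{
	memcpy(tmpl->dst, dst, ETH_ALEN);
	memcpy(tmpl->src, src, ETH_ALEN);
	tmpl->type[0] = (uint8_t)(ethertype >> 8);
	tmpl->type[1] = (uint8_t)(ethertype & 0xff);
}

/* Per-packet fill: a single fixed-size copy of the cached header. */
void
fill_from_template(uint8_t *frame, const struct eth_hdr *tmpl)
{
	memcpy(frame, tmpl, ETH_HLEN);
}
```

Both paths write the same 14 bytes. The second moves the per-field
work out of the per-packet path, which is where the time above is
being spent. Whether this would actually recover the ~100ns would
have to be measured.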