Date: Tue, 8 Jul 2008 08:54:45 +0100 (BST)
From: Robert Watson
To: Artem Belevich
Cc: FreeBSD Net
Subject: Re: Freebsd IP Forwarding performance (question, and some info) [7-stable, current, em, smp]
Message-ID: <20080708085227.J31157@fledge.watson.org>

On Mon, 7 Jul 2008, Artem Belevich wrote:

> As was already mentioned, we can't avoid all cache misses as there's data
> that's recently been updated in memory via DMA and therefore kicked out of
> cache.
>
> However, we may hide some of the latency penalty by prefetching
> 'interesting' data early. I.e.
> we know that we want to access some ethernet
> headers, so we may start pulling relevant data into cache early. Ideally, by
> the time we need to access the field, it will already be in the cache. When
> we're counting nanoseconds per packet this may bring some performance gain.

There were some patches floating around for if_em to do a prefetch of the
first bit of packet data on packets before handing them up the stack. My
understanding is that they moved the hot spot earlier, but didn't make a huge
difference because it doesn't really take that long to get to the point where
you're processing the IP header in our current stack (a downside to
optimization...). However, that's a pretty anecdotal story, and a proper
study of the effects of prefetching would be most welcome. One thing that I'd
really like to see someone look at is whether, by doing a bit of appropriately
timed prefetching, we can move cache misses out from under hot locks that
don't really relate to the data being prefetched.

Robert N M Watson
Computer Laboratory
University of Cambridge
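
For anyone who wants to experiment with the two ideas discussed above, here is a rough userland sketch in C. It is not the if_em patches mentioned in the message and not kernel code. The names struct pkt, rx_batch and stats_lock are made up for illustration, the spinlock is only a stand-in for some hot lock, and __builtin_prefetch is the GCC/Clang builtin (other compilers would need a different hint). The loop issues a prefetch for the next packet's headers while the current packet is handled, and it reads the packet fields before taking the lock so that any miss on packet data is not taken while the lock is held.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a received packet; not the kernel's struct mbuf. */
struct pkt {
	uint8_t  data[64];	/* start of the frame, as left in memory by DMA */
	uint32_t len;		/* frame length in bytes */
};

/* Hypothetical hot lock guarding statistics unrelated to packet contents. */
static atomic_flag stats_lock = ATOMIC_FLAG_INIT;
static uint64_t stats_packets;
static uint64_t stats_bytes;

static void
stats_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&stats_lock,
	    memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void
stats_lock_release(void)
{
	atomic_flag_clear_explicit(&stats_lock, memory_order_release);
}

/* Stand-in for early header processing: read the 16-bit EtherType field. */
static uint16_t
ether_type(const struct pkt *p)
{
	return ((uint16_t)((p->data[12] << 8) | p->data[13]));
}

/* Process n packets; returns how many carry EtherType 0x0800 (IPv4). */
static size_t
rx_batch(struct pkt **ring, size_t n)
{
	size_t ipv4 = 0;

	assert(ring != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		/* Hint only: start fetching the next packet's headers now. */
		if (i + 1 < n)
			__builtin_prefetch(ring[i + 1]->data, 0, 3);

		/* Touch packet data (and take any miss) before locking. */
		uint16_t type = ether_type(ring[i]);
		uint32_t len = ring[i]->len;

		/* Only the shared counters are touched under the lock. */
		stats_lock_acquire();
		stats_packets++;
		stats_bytes += len;
		stats_lock_release();

		if (type == 0x0800)
			ipv4++;
	}
	return (ipv4);
}
```

Calling rx_batch(ring, n) on an array of packet pointers is enough to try it. Whether the prefetch distance of one packet helps, hurts, or just moves the hot spot is exactly the kind of thing that would have to be measured rather than assumed.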