From: "Alexander V. Chernikov"
Date: Sat, 24 Aug 2013 16:48:52 +0400
To: Andre Oppermann
Cc: freebsd-net@freebsd.org
Subject: Re: route/arp lifetime (Re: it's the output, not ack coalescing (Re: TSO and FreeBSD vs Linux))
Message-ID: <5218ABB4.5070601@ipfw.ru>
In-Reply-To: <52152837.9010101@freebsd.org>

On 22.08.2013 00:51, Andre Oppermann wrote:
> On 19.08.2013 13:42, Alexander V. Chernikov wrote:
>> On 14.08.2013 19:48, Luigi Rizzo wrote:
>>> On Wed, Aug 14, 2013 at 05:40:28PM +0200, Marko Zec wrote:
>>>> On Wednesday 14 August 2013 14:40:24 Luigi Rizzo wrote:
>>>>> On Wed, Aug 14, 2013 at 04:15:25PM +0400, Alexander V.
>>>>> Chernikov wrote:
>>> ...
>>>> FWIW, apparently we already have that infrastructure in place
>>>> - if_rele() calls if_free_internal() only when the last
>>>> reference to the ifnet is dropped, so with little care this
>>>> should be usable for caching ifp pointers w/o fears for
>>>> kernel crashes mentioned above.
>>> maybe Alexander was referring to holding references to the rte
>>> entries returned as a result of the lookup. The rte holds a
>>> reference to the ifp.
>>
>> Yes. Since there is the only refcount which is protected (and is
>> also a huge performance killer).
>>
>> Btw, there is a picture describing IPv4 packet flow from my
>> still-not-written post related network stack performance, maybe
>> it can be useful:
>> http://static.ipfw.ru/images/freebsd_ipv4_flow.png
>
> Wow, that's really cool. Please note that a rmlock doesn't cost
> anything for the read case (unless contended of course). Whereas
> normal rlocks or

We're running this entire stack without a single rwlock: everything is
either converted to rmlock or uses lockless data copies with delayed GC
(in_adrr_local and similar). It really is faster. However, due to the
current process-to-completion routing architecture, throughput is
limited to 5-6 Mpps for 12 cores on 2x E5645.

> rwlocks write to the lock memory location and cause atomic bus lock
> cycles as well as a lot of cache line invalidations across cores.
> The same is true for refcounts.
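
As a rough sanity check on the 5-6 Mpps figure above, the per-packet budget per core can be worked out with a few lines of Python. This is only a sketch under stated assumptions: the 2.4 GHz clock is the E5645 base frequency (turbo and hyper-threading are ignored), packets are assumed to be spread evenly across all 12 cores, and the function name per_packet_budget is made up for illustration.

```python
# Back-of-the-envelope per-packet budget for the throughput quoted above.
CORES = 12          # 2 x E5645 with 6 cores each (from the mail)
CLOCK_HZ = 2.4e9    # assumption: E5645 base clock, no turbo


def per_packet_budget(total_pps, cores=CORES, clock_hz=CLOCK_HZ):
    """Return (nanoseconds, cycles) available per packet on one core.

    Assumes the load is spread perfectly evenly across all cores.
    """
    per_core_pps = total_pps / cores
    ns = 1e9 / per_core_pps
    cycles = clock_hz / per_core_pps
    return ns, cycles


if __name__ == "__main__":
    for mpps in (5, 6):
        ns, cycles = per_packet_budget(mpps * 1e6)
        print(f"{mpps} Mpps: {ns:.0f} ns, {cycles:.0f} cycles per packet per core")
```

Under these assumptions each core has roughly 2.0 to 2.4 microseconds, or about 4800 to 5760 cycles, per packet. That is a useful yardstick when weighing the cost of an atomic operation or a cache line miss on the forwarding path.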