From: George Neville-Neil
Date: Fri, 13 Sep 2013 11:08:27 -0400
Subject: Re: Network stack changes
To: Adrian Chadd
Cc: "Alexander V. Chernikov", Luigi Rizzo, Andre Oppermann,
    "freebsd-hackers@freebsd.org", "freebsd-arch@freebsd.org",
    "Andrey V. Elsukov", Gleb Smirnoff, FreeBSD Net
Message-Id: <6BDA4619-783C-433E-9819-A7EAA0BD3299@neville-neil.com>
References: <521E41CB.30700@yandex-team.ru>
List-Id: Discussion related to FreeBSD architecture

On Aug 29, 2013, at 7:49, Adrian Chadd wrote:

> Hi,
>
> There's a lot of good stuff to review here, thanks!
>
> Yes, the ixgbe RX lock needs to die in a fire. It's kinda pointless to keep
> locking things like that on a per-packet basis. We should be able to do
> this in a cleaner way - we can defer RX into a CPU pinned taskqueue and
> convert the interrupt handler to a fast handler that just schedules that
> taskqueue. We can ignore the ithread entirely here.
>
> What do you think?
>
> Totally pie in the sky handwaving at this point:
>
> * create an array of mbuf pointers for completed mbufs;
> * populate the mbuf array;
> * pass the array up to ether_demux().
>
> For vlan handling, it may end up populating its own list of mbufs to push
> up to ether_demux(). So maybe we should extend the API to have a bitmap of
> packets to actually handle from the array, so we can pass up a larger array
> of mbufs, note which ones are for the destination and then the upcall can
> mark which frames it has consumed.
>
> I specifically wonder how much work/benefit we may see by doing:
>
> * batching packets into lists so various steps can batch process things
>   rather than run to completion;
> * batching the processing of a list of frames under a single lock instance
>   - e.g., if the forwarding code could do the forwarding lookup for 'n'
>   packets under a single lock, then pass that list of frames up to
>   inet_pfil_hook() to do the work under one lock, etc, etc.
>
> Here, the processing would look less like "grab lock and process to
> completion" and more like "mark and sweep" - i.e., we have a list of frames
> that we mark as needing processing and mark as having been processed at
> each layer, so we know where to next dispatch them.
>

One quick note here. Every time you increase batching you may increase
bandwidth, but you will also increase per-packet latency for the last
packet in a batch. That is fine so long as we remember it and treat batch
size as a tuning knob to balance the two.

> I still have some tool coding to do with PMC before I even think about
> tinkering with this as I'd like to measure stuff like per-packet latency as
> well as top-level processing overhead (i.e., CPU_CLK_UNHALTED.THREAD_P /
> lagg0 TX bytes/pkts, RX bytes/pkts, NIC interrupts on that core, etc.)
>

This would be very useful in identifying the actual hot spots, and would be
helpful to anyone who can generate a decent stream of packets with, say, an
IXIA.

Best,
George
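
The batching trade-off noted above can be made concrete with a toy model.
This is only a sketch under stated assumptions: one fixed cost per batch
(standing in for a single lock acquisition) plus a fixed cost per packet,
with the packets already queued. The function name batch_model and both
cost values are hypothetical illustrations, not measurements of ixgbe or
of any FreeBSD code path.

```python
def batch_model(batch_size, lock_cost_us=0.5, per_pkt_us=0.2):
    """Toy model: one lock acquisition per batch, then per-packet work.

    Both cost parameters are made-up illustrative values, not measurements.
    Returns a tuple of:
      - throughput in packets per microsecond
      - time at which the last packet in the batch completes, in microseconds
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # The whole batch pays the lock cost once, then each packet pays its own cost.
    batch_time_us = lock_cost_us + batch_size * per_pkt_us
    throughput = batch_size / batch_time_us
    # The last packet finishes only when the whole batch has been processed.
    return throughput, batch_time_us


# Compare a few batch sizes side by side.
for n in (1, 8, 64):
    pkts_per_us, last_latency_us = batch_model(n)
    print(f"batch={n:3d}  throughput={pkts_per_us:.2f} pkt/us  "
          f"last-packet latency={last_latency_us:.2f} us")
```

In this model throughput approaches 1 / per_pkt_us as the batch grows,
while the completion time of the last packet grows linearly with the batch
size, which is why batch size is best treated as a tunable rather than
fixed at a large value.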