From owner-freebsd-net@FreeBSD.ORG Fri Apr 27 20:06:59 2012
From: Jack Vogel <jfvogel@gmail.com>
To: Juli Mallett
Cc: "freebsd-net@freebsd.org", Sean Bruno
Date: Fri, 27 Apr 2012 13:06:57 -0700
Subject: Re: igb(4) at peak in big purple
List-Id: Networking and TCP/IP with FreeBSD

I suspect doing it right would involve having the stack/kernel interact
more with the driver/interface data, and this IS the way RSS was
envisioned to work. It's been talked about, but it hasn't happened so far.

Jack

On Fri, Apr 27, 2012 at 1:00 PM, Juli Mallett wrote:

> On Fri, Apr 27, 2012 at 12:29, Sean Bruno wrote:
> > On Thu, 2012-04-26 at 11:13 -0700, Juli Mallett wrote:
> >> Queue splitting in Intel cards is done using a hash of protocol
> >> headers, so this is expected behavior. This also helps with TCP and
> >> UDP performance, in terms of keeping packets for the same protocol
> >> control block on the same core, but for other applications it's not
> >> ideal. If your application does not require that kind of locality,
> >> there are things that can be done in the driver to make it easier to
> >> balance packets between all queues about-evenly.
> >
> > Oh? :-)
> >
> > What should I be looking at to balance more evenly?
>
> Dirty hacks are involved :) I've sent some code to Luigi that I think
> would make sense in netmap (since for many tasks one's going to do with
> netmap, you want to use as many cores as possible, and maybe don't care
> about locality so much), but it could be useful in conjunction with the
> network stack, too, for tasks that don't need a lot of locality.
>
> Basically this is the deal: the Intel NICs hash various header fields.
> Then, some bits from that hash are used to index a table. That table
> indicates which queue the received packet should go to. Ideally you'd
> want to use some sort of counter to index that table and get round-robin
> queue usage if you wanted to evenly saturate all cores. Unfortunately
> there doesn't seem to be a way to do that.
>
> What you can do, though, is regularly update the table that is indexed
> by hash. Very frequently, in fact; it's a pretty fast operation. So what
> I've done, for example, is to go through and rotate all of the entries
> every N packets, where N is something like the number of receive
> descriptors per queue divided by the number of queues. So bucket 0 goes
> to queue 0 and bucket 1 goes to queue 1 at first. Then a few hundred
> packets are received, and the table is reprogrammed, so now bucket 0
> goes to queue 1 and bucket 1 goes to queue 0.
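
A rough sketch of what that kind of rotation could look like is below. It
is not the code Juli refers to, only a self-contained model of the idea:
the 128-entry table size and the "four 8-bit entries per E1000_RETA(n)
register" write mentioned in the comment follow the e1000 shared code used
by igb(4), while the struct and function names here are made up for
illustration.

    /*
     * Hypothetical sketch: rotate an RSS redirection table so hash
     * buckets migrate across RX queues over time.  The table is kept
     * as a plain array; in a real driver each group of four 8-bit
     * entries would also be pushed to hardware (see comment below).
     */
    #include <stdint.h>
    #include <stdio.h>

    #define RETA_ENTRIES    128

    struct rss_reta {
            uint8_t  queue[RETA_ENTRIES];   /* hash bucket -> RX queue */
            unsigned nqueues;               /* RX queues in use (> 0) */
            unsigned offset;                /* current rotation offset */
    };

    /* Initial mapping: bucket i -> queue (i % nqueues). */
    static void
    reta_init(struct rss_reta *r, unsigned nqueues)
    {
            r->nqueues = nqueues;
            r->offset = 0;
            for (unsigned i = 0; i < RETA_ENTRIES; i++)
                    r->queue[i] = i % nqueues;
    }

    /*
     * Called every N received packets (N ~ RX descriptors per queue
     * divided by the number of queues): shift every bucket to the next
     * queue, so bucket 0 maps to queue 0, then queue 1, then queue 2...
     */
    static void
    reta_rotate(struct rss_reta *r)
    {
            r->offset = (r->offset + 1) % r->nqueues;
            for (unsigned i = 0; i < RETA_ENTRIES; i++) {
                    r->queue[i] = (i + r->offset) % r->nqueues;
                    /*
                     * In the driver, pack four 8-bit entries into a
                     * 32-bit word and write each group out, e.g.:
                     *
                     *      if ((i & 3) == 3)
                     *              E1000_WRITE_REG(&adapter->hw,
                     *                  E1000_RETA(i >> 2), reta_dword);
                     */
            }
    }

    int
    main(void)
    {
            struct rss_reta r;

            reta_init(&r, 4);
            reta_rotate(&r);
            /* After one rotation, bucket 0 now maps to queue 1. */
            printf("bucket 0 -> queue %u\n", r.queue[0]);
            return (0);
    }

In a driver this would hang off the receive path: count packets, and once
the threshold Juli describes is crossed, recompute and rewrite the table,
trading cache locality for more even queue usage.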
>
> I can provide code to do this, but I don't want to post it publicly
> (unless it is actually going to become an option for netmap) for fear
> that people will use it in scenarios where it's harmful and then
> complain. It's potentially one more painful variation for the Intel
> drivers that Intel can't support, and that just makes everyone
> miserable.
>
> Thanks,
> Juli.
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"