From owner-freebsd-net@FreeBSD.ORG Fri Sep 23 13:59:37 2011
Return-Path:
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E483E1065673; Fri, 23 Sep 2011 13:59:37 +0000 (UTC) (envelope-from syuu@dokukino.com)
Received: from mail-pz0-f44.google.com (mail-pz0-f44.google.com [209.85.210.44]) by mx1.freebsd.org (Postfix) with ESMTP id A5AD28FC18; Fri, 23 Sep 2011 13:59:37 +0000 (UTC)
Received: by pzk32 with SMTP id 32so14688421pzk.3 for ; Fri, 23 Sep 2011 06:59:37 -0700 (PDT)
Received: by 10.68.27.102 with SMTP id s6mr10166515pbg.43.1316786377078; Fri, 23 Sep 2011 06:59:37 -0700 (PDT)
Received: from [126.219.247.163] (pw126219247163.55.tss.panda-world.ne.jp. [126.219.247.163]) by mx.google.com with ESMTPS id q10sm38619881pbn.9.2011.09.23.06.59.33 (version=TLSv1/SSLv3 cipher=OTHER); Fri, 23 Sep 2011 06:59:35 -0700 (PDT)
References: <1315221674.3092.282.camel@deadeye> <201109080834.11607.jhb@freebsd.org> <20110908184928.GA87872@hub.freebsd.org> <37419C45-4436-4738-851B-2B765BC2C60F@neville-neil.com> <1315529074.2804.63.camel@bwh-desktop>
In-Reply-To:
Mime-Version: 1.0 (1.0)
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset=us-ascii
Message-Id: <8C8C6061-2BD8-4B38-843E-A0BA1218B773@dokukino.com>
X-Mailer: iPhone Mail (9A5313e)
From: Takuya ASADA
Date: Fri, 23 Sep 2011 22:59:26 +0900
To: "owner-freebsd-net@freebsd.org"
Cc: "support@pvd.citizen.co.jp" , "jfv@freebsd.org" , John Baldwin , "freebsd-net@freebsd.org"
Subject: Re: Adding Flow Director sysctls to ixgbe(4)
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 23 Sep 2011 13:59:38 -0000

Hi,

On Sep 9, 2011, at 10:56 AM, owner-freebsd-net@freebsd.org wrote:

> On Fri, Sep 09, 2011 at 01:44:34AM +0100, Ben Hutchings
wrote:
>> On Thu, 2011-09-08 at 20:13 -0400, George Neville-Neil wrote:
>>> On Sep 8, 2011, at 14:49 , Navdeep Parhar wrote:
>>>
>>>> On Thu, Sep 08, 2011 at 08:34:11AM -0400, John Baldwin wrote:
>>>>> On Monday, September 05, 2011 7:21:12 am Ben Hutchings wrote:
>>>>>> On Mon, 2011-09-05 at 15:51 +0900, Takuya ASADA wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I implemented Ethernet Flow Director sysctls to ixgbe(4), here's a detail:
>>>>>>>
>>>>>>> - Adding removing signature filter
>>>>>>> On linux version of ixgbe driver, it has ability to set/remove perfect
>>>>>>> filter from userland using ethtool command.
>>>>>>> I implemented similar feature, but on sysctl, and not perfect filter
>>>>>>> but signature filter(which means hash collision may occurs).
>>>>>> [...]
>>>>>>
>>>>>> Linux also has a generic interface to RX filtering and hashing
>>>>>> (ethtool_rxnfc) which ixgbe supports; wouldn't it be better for FreeBSD
>>>>>> to support something like that?
>>>>>
>>>>> Some sort of shared interface might be nice. The cxgb(4) and cxgbe(4) drivers
>>>>> both provide their own tools to manipulate filters, though they do not
>>>>> provide explicit steering IIRC.
>>>>
>>>> Both of them can filter as well as steer (and the tools let you do that).
>>>> cxgbe(4) can do a lot more (rewrite + switch, replicate, etc.) but those
>>>> features are perhaps too specialized to be configurable via a general
>>>> purpose tool.
>>>>
>>>>>
>>>>> We would need to come up with some sort of standard interface (ioctls?) for
>>>>> adding filters however.
>>>>
>>>> +1 for a standard interface.
>>>>
>>>> imho the kernel needs to be aware of the rx and tx queues of a NIC, and
>>>> not just for steering. But that's a separate discussion.
>>>>
>>>
>>> Well I do think this is actually all of a part. Most of us realize by now that
>>> high speed (e.g. 10G and higher) NICs only make sense if you can steer traffic and
>>> pin queues to cores etc.
>>
>> Well, you can get way better than 1G performance without that. And for
>> routers, flow hashing may be fine. But for a host, of course, steering
>> packets properly can provide a major performance win.
>>
>> [...]
>>> What this means is that we have
>>> a failure of abstraction. Abstraction has a cost, and some of the people who want
>>> access to low level queues are not interested in paying an extra abstraction cost.
>>
>> Abstraction has a cost, but it's not necessarily that high compared to
>> rewriting a whole chunk of sockets code (especially if you don't
>> actually have the source code).
>>
>>> I think that some of the abstractions we need are tied up in the work that Takuya did
>>> for SoC and some of it is in the work done by Luigi on netmap. I'd go so far as to say
>>> that what we should do is try to combine those two pieces of code into a set of
>>> low level APIs for programs to interact with high speed NICs. The one thing most
>>> people do not talk about is extending our socket API to do two things that I think would
>>> be a win for 80% of our users. If a socket, and also a kqueue, could be pinned
>>> to a CPU as well as a NIC queue that should improve overall bandwidth for a large
>>> number of our users. The API there is definitely an ioctl() and the hard part is
>>> doing the tying together. To do this we need to also work out our low level story.
>>
>> But it would be a lot nicer if this could be done automatically. Which
>> I believe it can - see the RFS and XPS features in Linux.
>
> rwatson@ has been working on "connection groups" (not sure what he calls
> his project) with a goal to improve the placement of work in the FreeBSD
> network stack. Some of the code is in the kernel but the parts that
> require closer cooperation with a NIC are not.

It looks like it reduces lock contention on inpcb lookup; does it also affect the other parts? (ex: CPU affinity