Date: Tue, 27 Aug 2013 09:27:57 +0200
From: Andre Oppermann <andre@freebsd.org>
To: Adrian Chadd <adrian@freebsd.org>
Cc: Jack F Vogel <jfv@freebsd.org>, "Justin T. Gibbs" <gibbs@freebsd.org>, Alan Somers <asomers@freebsd.org>, "net@freebsd.org" <net@freebsd.org>
Subject: Re: Flow ID, LACP, and igb
Message-ID: <521C54FD.2060109@freebsd.org>
In-Reply-To: <CAJ-Vmom8TppCc1%2Bio53cCct17NV=7x374zfE7Zq1ShSZ72bufA@mail.gmail.com>
References: <D01A0CB2-B1E3-4F4B-97FA-4C821C0E3FD2@FreeBSD.org> <521BBD21.4070304@freebsd.org> <CAJ-Vmom8TppCc1%2Bio53cCct17NV=7x374zfE7Zq1ShSZ72bufA@mail.gmail.com>
On 27.08.2013 01:30, Adrian Chadd wrote:
> ... is there any reason we wouldn't want to have the TX and RX for a
> given flow mapped to the same core?

They are. The thing is that the inbound and outbound packet flow IDs
are totally independent of each other.

The inbound one determines the RX ring the packet takes to go up the
stack. If that ring is bound to a core, that gives affinity. If the
socket and the user-space application are bound to the same core as
well, there is full affinity.

On the way down it is the core doing the write to the socket that
matters on entering the kernel. Processing stays on that core until
the packet is generated (in tcp_output, for example). Up to that point
the flow ID of the packet doesn't matter at all, because it is only
filled in then. The packet then goes down the stack, and the flow ID
is used only at the very end, when an outbound TX queue has to be
chosen based on it. This outbound TX ring doesn't have to be the same
one the flow came in on, as long as it stays the same for that flow,
to prevent reordering.

This fixes Justin's issue with if_lagg and poor balancing. He can
simply choose a good hash for the packets going out and stop worrying
about it. More importantly, he's no longer hostage to random switches
with poor hashing.

Ultimately you could try to bind the TX ring to a particular CPU as
well and try to run it lockless. That is fraught with some difficult
problems though. First, you must have exactly as many RX/TX queues as
cores; that's often not the case, as many cards support only a limited
number of rings. Then, for packets generated locally (think of a DNS
query over UDP), you either simply stick to the local CPU-assigned
queue and send without looking at the computed flow ID, or you have to
switch cores to send the packet on the correct queue. Such a very
strong core binding is typically only really useful in embarrassingly
parallel applications that do nothing but push packets. If your
application is also compute-intensive, you may want some more
flexibility to schedule threads to prevent stalls from busy cores. In
that case not binding TX to a core is a win. So we will pretty much
end up with one lock per TX ring to protect the DMA descriptor
structures.

We're still far away from having to worry about this TX issue. The big
win is the RX queue - socket - application affinity (to the same core).

--
Andre
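
To make the TX-queue choice concrete, here is a minimal sketch in
plain C of the selection described above. It is not the actual igb
code: struct pkt stands in for an mbuf (in the kernel the flow ID
travels in m->m_pkthdr.flowid, with an mbuf flag marking it valid),
and pkt_select_txq and its parameters are illustrative names.

#include <stdint.h>

struct pkt {
        uint32_t flowid;       /* filled in once, e.g. in tcp_output */
        int      flowid_valid; /* kernel uses an mbuf flag for this */
};

/*
 * Map a packet to one of ntxq TX rings.  Any fixed function of the
 * flow ID will do: the ring need not match the RX ring the flow came
 * in on, it only has to be the same ring for every packet of the
 * flow, so the flow is never reordered.
 */
static int
pkt_select_txq(const struct pkt *p, int cur_cpu, int ntxq)
{
        if (p->flowid_valid)
                return (p->flowid % ntxq);
        /* No flow ID (e.g. locally generated): fall back to the CPU. */
        return (cur_cpu % ntxq);
}

Because the mapping is a pure function of the flow ID, every packet of
a flow lands on the same TX ring no matter which core wrote to the
socket, which is exactly the no-reordering property argued for above.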
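
The if_lagg point can be sketched the same way: pick the member port
from a decent software hash of the flow, so balancing quality no
longer depends on what the peer switch hashes on. The kernel has hash
functions of its own (e.g. hash32_buf() in sys/hash.h); a small FNV-1a
stands in below so the fragment is self-contained, and flow_tuple and
lagg_select_port are made-up names.

#include <stddef.h>
#include <stdint.h>

static uint32_t
fnv1a(const void *buf, size_t len)
{
        const unsigned char *p = buf;
        uint32_t h = 2166136261u;       /* FNV-1a offset basis */

        while (len-- > 0) {
                h ^= *p++;
                h *= 16777619u;         /* FNV-1a prime */
        }
        return (h);
}

struct flow_tuple {                     /* 12 bytes, no padding */
        uint32_t src_ip, dst_ip;
        uint16_t src_port, dst_port;
};

/* Pick one of nports aggregated links; constant per flow. */
static int
lagg_select_port(const struct flow_tuple *ft, int nports)
{
        return (fnv1a(ft, sizeof(*ft)) % nports);
}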
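
And the locking conclusion, sketched with user-space stand-ins: one
lock per TX ring guarding that ring's descriptor state, so any core
may transmit on any ring. The kernel would use a per-ring struct mtx;
a pthread mutex stands in here, tx_ring and tx_ring_enqueue are
illustrative names, and full-ring handling is elided.

#include <pthread.h>
#include <stdint.h>

#define TX_RING_SIZE    512

struct tx_ring {
        pthread_mutex_t lock;   /* init with pthread_mutex_init();
                                   protects tail + desc */
        uint32_t        tail;
        void           *desc[TX_RING_SIZE];
};

static void
tx_ring_enqueue(struct tx_ring *txr, void *pkt)
{
        pthread_mutex_lock(&txr->lock);
        txr->desc[txr->tail] = pkt;
        txr->tail = (txr->tail + 1) % TX_RING_SIZE;
        pthread_mutex_unlock(&txr->lock);
}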