Date: Sun, 9 Jun 2013 17:04:49 -0400
From: George Neville-Neil <gnn@neville-neil.com>
To: "hackers@freebsd.org" <hackers@freebsd.org>
Cc: "devsummit@freebsd.org" <devsummit@freebsd.org>
Subject: Network Recieve Performance Working Group
Message-ID: <8537DE82-46F4-4E11-AECA-42F118AB179F@neville-neil.com>
Howdy,

At the Network Receive Performance working group at BSDCan we covered a narrower set of topics than we normally do, which seems to have resulted in a reasonably sized work list for improving our systems in this area. The main issues relate to getting a good API that addresses multi-queue NICs. The notes are on the wiki page as well as reproduced here.

Best,
George

https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance

The discussion opened with an attempt to constrain the problem we were trying to solve, including pointing out that any KPI/API suggested needed to be achievable in the next six months.

Some of the existing solutions to the problem of talking to hardware with multiple queues, which all high-end NICs currently have, were:

• Connection Groups
  • Not really a KPI
  • RSS vs. Flow Table is an issue to solve; we have things for the former, but little for the latter
  • Socket affinity is also an issue
• NAPI
  • This is an API in Linux. It uses upcalls.
• Flow table mapping. Chelsio may have some of this.
• SRIO
• VLL Cloner

There are several ways to map flows, including 4-tuple, MAC filter, and arbitrary offset. An API that only handles offset, length, value is too simple from the standpoint of getting the right data into the hardware. We need something richer on the kernel side of the API so that driver writers don't have to figure out our intentions.

Some methods that a good KPI/API ought to have include:

• Query Device for information about its queues, including how many exist, and how they are mapped to other resources, including CPU and memory
• Map CPUID to a Flow
• Set up RSS
• Request RxRing local memory
• Solaris Mapping API might be a way to go (http://www.oracle.com/technetwork/articles/servers-storage-admin/crossbowsetup-191326.pdf)

Some consumers of such an API include: performance, affinity, virtualization, policy, kernel bypass, QoS, and VIMAGE.
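To make the flow-mapping point concrete, here is a minimal sketch of what a typed flow descriptor could look like, as opposed to a bare offset/length/value triple. Everything in it is an assumption for illustration: the names (`flow_match`, `flow_rule`, `pkt_view`, `flow_lookup_cpu`) are hypothetical and are not an existing FreeBSD KPI, and the matcher runs in software only to show the intended semantics. The idea is that the caller states its intent (4-tuple, MAC filter, or raw offset match), so a driver would not have to reverse-engineer it from byte patterns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: none of these names exist in FreeBSD. */
enum flow_match_kind {
    FLOW_MATCH_4TUPLE,  /* source/destination address and port */
    FLOW_MATCH_MAC,     /* destination MAC filter */
    FLOW_MATCH_OFFSET   /* arbitrary offset, length, value */
};

struct flow_match {
    enum flow_match_kind kind;
    union {
        struct {
            uint32_t src_addr, dst_addr;
            uint16_t src_port, dst_port;
        } tuple;
        struct {
            uint8_t addr[6];
        } mac;
        struct {
            uint16_t offset;    /* byte offset into the frame */
            uint8_t  len;       /* number of bytes to compare */
            uint8_t  value[16];
        } raw;
    } u;
};

/* One rule ties a match to a CPU ("Map CPUID to a Flow"). */
struct flow_rule {
    struct flow_match match;
    int cpuid;
};

/* Assumed pre-parsed view of a received frame, for illustration only. */
struct pkt_view {
    const uint8_t *data;
    size_t len;
    uint8_t dst_mac[6];
    uint32_t src_addr, dst_addr;
    uint16_t src_port, dst_port;
};

/* Software reference for the match semantics of each kind. */
static bool
flow_match_hit(const struct flow_match *m, const struct pkt_view *p)
{
    switch (m->kind) {
    case FLOW_MATCH_4TUPLE:
        return m->u.tuple.src_addr == p->src_addr &&
            m->u.tuple.dst_addr == p->dst_addr &&
            m->u.tuple.src_port == p->src_port &&
            m->u.tuple.dst_port == p->dst_port;
    case FLOW_MATCH_MAC:
        return memcmp(m->u.mac.addr, p->dst_mac, 6) == 0;
    case FLOW_MATCH_OFFSET:
        /* Reject rules that overrun the value buffer or the frame. */
        if (m->u.raw.len > sizeof(m->u.raw.value))
            return false;
        if ((size_t)m->u.raw.offset + m->u.raw.len > p->len)
            return false;
        return memcmp(p->data + m->u.raw.offset, m->u.raw.value,
            m->u.raw.len) == 0;
    }
    return false;
}

/* Return the CPU of the first matching rule, or -1 if none matches. */
static int
flow_lookup_cpu(const struct flow_rule *rules, size_t nrules,
    const struct pkt_view *p)
{
    assert(rules != NULL || nrules == 0);
    for (size_t i = 0; i < nrules; i++) {
        if (flow_match_hit(&rules[i].match, p))
            return rules[i].cpuid;
    }
    return -1;
}
```

With a tagged descriptor like this, a driver whose hardware has a native 4-tuple or MAC filter could program it directly from the typed fields, and fall back to a raw offset match only when asked for one.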
We have two patches, for different bits, to start from, including Vijay's [RobertWatson] and Randall's [RandallStewart], [GeorgeNevilleNeil]

We need quite a few things, including:

• Per connection flow table
• Describing queues in the stack such that we can expose interesting parts via netstat.
• Packet Batching. This was not overwhelmingly popular.

A straw person API includes:

• MBUF Flag
• Hash Value
  • The whole thing may be used as opaque
  • Used by the stack for inpcb
• Get number of buckets
• Map bucket to RSS
• Map queue/ithread to CPU
• Get width of the hash
• RSS get CPU
• RSS get hash algo
• Pick hash inputs
• Get and set key
• Rebalance
• Software hash table
• Query queue length
• Get queue affinity
• Set mask (CPUSET) on socket
• Set policy on CPU/socket
• Queue event reporting
• Load distribution stats
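A few of the straw-person calls (get number of buckets, get width of the hash, RSS get CPU, rebalance) can be sketched around a bucket indirection table, which is the usual RSS arrangement: the low bits of the packet hash select a bucket, and the bucket names a CPU. The sketch below is an illustration under assumptions, not a proposed interface: every `rss_sketch_*` name is hypothetical, the 7-bit width is arbitrary, and the hash value is taken as given rather than computed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: these names and sizes are illustrative only. */
#define RSS_SKETCH_BITS     7u                      /* hash bits used */
#define RSS_SKETCH_BUCKETS  (1u << RSS_SKETCH_BITS) /* 128 buckets */

struct rss_sketch {
    uint8_t  bucket_cpu[RSS_SKETCH_BUCKETS]; /* bucket -> CPU */
    unsigned ncpus;
};

/* Spread buckets round-robin over the available CPUs. */
static void
rss_sketch_init(struct rss_sketch *r, unsigned ncpus)
{
    assert(ncpus > 0 && ncpus <= 256);
    r->ncpus = ncpus;
    for (unsigned i = 0; i < RSS_SKETCH_BUCKETS; i++)
        r->bucket_cpu[i] = (uint8_t)(i % ncpus);
}

/* "Get number of buckets" */
static unsigned
rss_sketch_get_num_buckets(void)
{
    return RSS_SKETCH_BUCKETS;
}

/* "Get width of the hash": how many hash bits select a bucket. */
static unsigned
rss_sketch_get_hash_bits(void)
{
    return RSS_SKETCH_BITS;
}

/* The low bits of the (opaque) hash value pick the bucket. */
static unsigned
rss_sketch_hash_to_bucket(uint32_t hash)
{
    return hash & (RSS_SKETCH_BUCKETS - 1u);
}

/* "RSS get CPU": which CPU should process a packet with this hash. */
static unsigned
rss_sketch_get_cpu(const struct rss_sketch *r, uint32_t hash)
{
    return r->bucket_cpu[rss_sketch_hash_to_bucket(hash)];
}

/* "Rebalance", one bucket at a time: repoint a bucket at another CPU. */
static void
rss_sketch_move_bucket(struct rss_sketch *r, unsigned bucket, unsigned cpu)
{
    assert(bucket < RSS_SKETCH_BUCKETS && cpu < r->ncpus);
    r->bucket_cpu[bucket] = (uint8_t)cpu;
}
```

Keeping the hash opaque and routing everything through the bucket table means rebalancing only rewrites table entries; the stack and the hardware can keep using the same hash value for inpcb lookup.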