From owner-freebsd-net@FreeBSD.ORG Sat Apr 12 02:42:29 2014
Date: Fri, 11 Apr 2014 22:42:26 -0400
From: Patrick Kelsey <pkelsey@gmail.com>
To: hiren panchasara
Cc: "freebsd-net@freebsd.org", Adrian Chadd
Subject: Re: netisr observations
List-Id: Networking and TCP/IP with FreeBSD

On Fri, Apr 11, 2014 at 8:23 PM, hiren panchasara <hiren.panchasara@gmail.com> wrote:

> On Fri, Apr 11, 2014 at 11:30 AM, Patrick Kelsey wrote:
> >
> > The output of netstat -Q shows IP dispatch is set to default, which is direct (NETISR_DISPATCH_DIRECT). That means each IP packet will be processed on the same CPU that the Ethernet processing for that packet was performed on, so CPU selection for IP packets will not be based on flowid. The output of netstat -Q shows Ethernet dispatch is set to direct (NETISR_DISPATCH_DIRECT if you wind up reading the code), so the Ethernet processing for each packet will take place on the same CPU that the driver receives that packet on.
> >
> > For the igb driver with queues autoconfigured and msix enabled, as the sysctl output shows you have, the driver will create a number of queues subject to device limitations, msix message limitations, and the number of CPUs in the system, establish a separate interrupt handler for each one, and bind each of those interrupt handlers to a separate CPU. It also creates a separate single-threaded taskqueue for each queue. Each queue interrupt handler sends work to its associated taskqueue when the interrupt fires. Those taskqueues are where the Ethernet packets are received and processed by the driver. The question is where those taskqueue threads will run. I don't see anything in the driver that makes an attempt to bind those taskqueue threads to specific CPUs, so really the location of all of the packet processing is up to the scheduler (i.e., arbitrary).
> >
> > The summary is:
> >
> > 1. the hardware schedules each received packet to one of its queues and raises the interrupt for that queue
> > 2. that queue interrupt is serviced on the same CPU all the time, which is different from the CPUs for all other queues on that interface
> > 3. the interrupt handler notifies the corresponding taskqueue, which runs its task in a thread on whatever CPU the scheduler chooses
> > 4. that task dispatches the packet for Ethernet processing via netisr, which processes it on whatever the current CPU is
> > 5. Ethernet processing dispatches that packet for IP processing via netisr, which processes it on whatever the current CPU is
>
> I really appreciate you taking time and explaining this. Thank you.

Sure thing. I've had my head in the netisr code frequently lately, and it's nice to be able to share :)
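To make that path concrete, the per-queue receive side looks roughly like the sketch below. This is paraphrased from memory rather than lifted from if_igb.c, so treat the structure and field names as approximate -- the real functions do considerably more per step:

        static void
        igb_msix_que(void *arg)                 /* steps 1-2: per-queue MSI-X handler, */
        {                                       /* runs on the CPU this vector is bound to */
                struct igb_queue *que = arg;

                taskqueue_enqueue(que->tq, &que->que_task);     /* step 3 handoff */
        }

        static void
        igb_handle_que(void *context, int pending)
        {
                struct igb_queue *que = context;
                struct mbuf *m;

                /* Taskqueue thread: runs on whatever CPU the scheduler picked. */
                while ((m = igb_rxeof(que)) != NULL)            /* simplified rx loop */
                        (*que->ifp->if_input)(que->ifp, m);     /* -> ether_input() */
        }

        /*
         * Step 4: ether_input() direct-dispatches on the current CPU:
         *         netisr_dispatch(NETISR_ETHER, m);
         * Step 5: the Ethernet handler then direct-dispatches IP the same way:
         *         netisr_dispatch(NETISR_IP, m);
         */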
> I am especially confused by the ip "Queued" column from netstat -Q showing 203888563 only for cpu3. Does this mean that cpu3 queues everything and then distributes it among the other cpus? Where, in the 5 stages you mentioned above, does this queuing on cpu3 happen?
>
> This value gets populated in the snwp->snw_queued field for each cpu inside sysctl_netisr_work().

The way your system is configured, all inbound packets are being direct-dispatched. Those packets will bump the dispatched and handled counters, but not the queued counter. The queued counter only gets bumped when something is queued to a netisr thread. You can figure out where that is happening, despite everything apparently being configured for direct dispatch, by looking at where netisr_queue() and netisr_queue_src() are being called from. netisr_queue() is called during ipv6 forwarding and output, ipv4 output when the destination is a local address, gre processing, route socket processing, and if_simloop() (which is called to loop back multicast packets, for example)... netisr_queue_src() is called during ipsec and divert processing.

One thing to consider when thinking about what the netisr per-cpu counters represent is that netisr really maintains per-cpu workstream context, not per-netisr-thread context. Direct-dispatched packets contribute to the statistics of the workstream context of whichever CPU they are being direct-dispatched on. Packets handled by a netisr thread contribute to the statistics of the workstream context of the CPU it was created for, whether or not it was bound to, or is currently running on, that CPU. So when you look at the statistics in netstat -Q output for CPU 3: dispatched is the number of packets direct-dispatched on CPU 3, queued is the number of packets queued to the netisr thread associated with CPU 3 (but that thread may be running all over the place if net.isr.bindthreads is 0), and handled is the number of packets processed directly on CPU 3 or in the netisr thread associated with CPU 3.
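In pseudo-C, that accounting boils down to roughly the following. This paraphrases sys/net/netisr.c -- proto_handler() and target_cpu() are stand-ins for the real internals, so don't take the names literally:

        static int
        netisr_dispatch_sketch(u_int proto, struct mbuf *m)
        {
                struct netisr_workstream *nwsp;
                struct netisr_work *npwp;

                if (dispatch_policy == NETISR_DISPATCH_DIRECT) {
                        nwsp = DPCPU_PTR(nws);          /* current CPU's workstream */
                        npwp = &nwsp->nws_work[proto];
                        npwp->nw_dispatched++;          /* "Dispatched" column */
                        npwp->nw_handled++;             /* handled in-line... */
                        return (proto_handler(m));      /* ...so never "Queued" */
                }

                /* Deferred: counters are charged to the *target* CPU's workstream. */
                nwsp = DPCPU_ID_PTR(target_cpu(m), nws);
                npwp = &nwsp->nws_work[proto];
                npwp->nw_queued++;                      /* "Queued" column */
                return (enqueue_and_wake(nwsp, m));     /* nw_handled++ happens later,
                                                           in that workstream's netisr
                                                           thread, wherever it runs */
        }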
> > You might want to try changing the default netisr dispatch policy to 'deferred' (sysctl net.isr.dispatch=deferred). If you do that, the Ethernet processing will still happen on an arbitrary CPU chosen by the scheduler, but the IP processing should then get mapped to a CPU based on the flowid assigned by the driver. Since igb assigns flowids based on receive queue number, all IP (and above) processing for that packet should then be performed on the same CPU the queue interrupt was bound to.
>
> I will give this a try and see how things behave.
>
> I was also thinking about net.isr.bindthreads. netisr_start_swi() does intr_event_bind() if we have bindthreads set to 1. What would that gain me, if anything?

That's a good point. If you move to deferred dispatch and bind the threads, then you keep the interrupt processing and IP-and-above protocol processing for packets from a given igb queue on the same CPU always. If you don't bind the netisr threads, then all IP-and-above protocol processing for packets from a given igb queue will always happen in the same netisr thread, and you will get whatever locality benefits the scheduler manages to give you. I think the choice depends on what else you have going on in the system and what your priorities are. Binding the netisr threads will get you the best locality for input packet processing, but might create hot-spot problems if you have other system activities you want bound to certain CPUs from an overlapping CPU set. Not binding the netisr threads probably gives up some locality in packet processing, but the scheduler can move the network processing work away from other workloads you might have bound to some CPUs (and might care more about getting the locality benefit).

> Would it stop moving intr{swi1: netisr 3} onto different cpus (as I am seeing in 'top' output) and bind it to a single cpu?

Yes, it would.

> I've come across a thread discussing some side-effects of this, though:
> http://lists.freebsd.org/pipermail/freebsd-hackers/2012-January/037597.html

Looks like the suggested fix was incorporated into the kernel about a month after that thread (so, two years ago) in r230984 (http://svnweb.freebsd.org/base?view=revision&revision=230984). That's in 10-stable as well as -current.
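Incidentally, if you want to spot-check those knobs from userland, an untested snippet like the following should do it (plain sysctlbyname(3); with privilege you can set net.isr.dispatch the same way by passing a new value):

        #include <sys/types.h>
        #include <sys/sysctl.h>
        #include <stdio.h>

        int
        main(void)
        {
                char dispatch[32];      /* "direct", "deferred", or "hybrid" */
                int bind;
                size_t len;

                len = sizeof(dispatch);
                if (sysctlbyname("net.isr.dispatch", dispatch, &len, NULL, 0) == 0)
                        printf("net.isr.dispatch: %s\n", dispatch);

                len = sizeof(bind);
                if (sysctlbyname("net.isr.bindthreads", &bind, &len, NULL, 0) == 0)
                        printf("net.isr.bindthreads: %d\n", bind);

                return (0);
        }

-Patrick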