From: Garrett Cooper
Date: Fri, 28 Aug 2015 18:52:06 -0700
Subject: Re: Network card interrupt handling
To: "K. Macy"
Cc: John Baldwin, Sean Bruno, "freebsd-arch@freebsd.org"
Message-Id: <00E4073A-9AF4-4FAD-8C09-B771C26A8319@gmail.com>
References: <55DDE9B8.4080903@freebsd.org> <24017021.PxBoCiQKDJ@ralph.baldwin.cx>
List-Id: Discussion related to FreeBSD architecture

> On Aug 28, 2015, at 18:25, K. Macy wrote:
>
>> On Aug 28, 2015 12:59 PM, "John Baldwin" wrote:
>>
>>> On Wednesday, August 26, 2015 09:30:48 AM Sean Bruno wrote:
>>> We've been diagnosing what appeared to be out of order processing in
>>> the network stack this week only to find out that the network card
>>> driver was shoveling bits to us out of order (em).
>>>
>>> This *seems* to be due to a design choice where the driver is allowed
>>> to assert a "soft interrupt" to the h/w device while real interrupts
>>> are disabled. This allows a fake "em_msix_rx" to be started *while*
>>> "em_handle_que" is running from the taskqueue. We've isolated and
>>> worked around this by setting our processing_limit in the driver to
>>> -1. This means that *most* packet processing is now handled in the
>>> MSI-X handler instead of being deferred. Some periodic interference
>>> is still detectable via em_local_timer() which causes one of these
>>> "fake" interrupt assertions in the normal, card is *not* hung case.
>>>
>>> Both functions use identical code for a start. Both end up down
>>> inside of em_rxeof() to process packets. Both drop the RX lock prior
>>> to handing the data up the network stack.
>>>
>>> This means that the em_handle_que running from the taskqueue will be
>>> preempted. Dtrace confirms that this allows out of order processing
>>> to occur at times and generates a lot of resets.
>>>
>>> The reason I'm bringing this up on -arch and not on -net is that this
>>> is a common design pattern in some of the Ethernet drivers. We've
>>> done preliminary tests on a patch that moves *all* processing of RX
>>> packets to the rx_task taskqueue, which means that em_handle_que is
>>> now the only path to get packets processed.
>>
>> It is only a common pattern in the Intel drivers. :-/ We (collectively)
>> spent quite a while fixing this in ixgbe and igb. Longer (hopefully more
>> like medium) term I have an update to the interrupt API I want to push in
>> that allows drivers to manually schedule interrupt handlers using an
>> 'hwi' API to replace the manual taskqueues. This also ensures that
>> the handler that dequeues packets is only ever running in an ithread
>> context and never concurrently.
>
> Jeff has a generalization of the net_task infrastructure used at Nokia
> called grouptaskq that I've used for iflib. That does essentially what you
> refer to. I've converted ixl and am currently about to test an ixgbe
> conversion. I anticipate converting mlxen, all Intel drivers as well as the
> remaining drivers with device specific code in netmap. The one catch is
> finding someone who will publicly admit to owning re hardware so that I can
> buy it from him and test my changes.
>
> Cheers.

I have 2 re NICs in my fileserver at home (Asus went cheap on some of their
MBs a while back), but the cards shouldn't cost more than $15 + shipping
(look for "Realtek 8169" on Google).

HTH!
-NGie
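
For readers following the thread, the reordering Sean describes (two contexts both dequeue under the RX lock, then both hand packets up the stack after dropping it) can be modelled with a short, single-threaded C sketch. This is not em(4) code. The names `rx_ring`, `stack_log`, `ring_dequeue`, `stack_input`, `two_contexts_interleaved` and `single_context` are hypothetical, and the "preemption" is just a fixed ordering of calls chosen to show one possible bad interleaving.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 8

/* Hypothetical stand-in for an RX ring; not the real em(4) structures. */
struct rx_ring {
    int    pkts[RING_SIZE]; /* sequence numbers in arrival order */
    size_t head;            /* next slot to dequeue */
    size_t count;           /* number of packets on the ring */
};

/* Records the order in which packets reach the "network stack". */
struct stack_log {
    int    seen[RING_SIZE];
    size_t n;
};

/* Dequeue one packet. In a driver this step would run with the RX lock held. */
int ring_dequeue(struct rx_ring *r)
{
    assert(r->head < r->count);
    return r->pkts[r->head++];
}

/* Hand a packet up the stack. In the pattern described, the RX lock has
 * already been dropped by the time this runs. */
void stack_input(struct stack_log *log, int seq)
{
    assert(log->n < RING_SIZE);
    log->seen[log->n++] = seq;
}

/* Two RX contexts (think taskqueue and interrupt handler) share one ring.
 * Context A is "preempted" between its dequeue and its handoff. */
void two_contexts_interleaved(struct rx_ring *r, struct stack_log *log)
{
    int a = ring_dequeue(r); /* context A dequeues, then drops the lock */
    int b = ring_dequeue(r); /* context B runs while A is preempted */
    stack_input(log, b);     /* B delivers its packet first */
    stack_input(log, a);     /* A resumes and delivers the older packet */
}

/* Single consumer: only one path ever dequeues and delivers. */
void single_context(struct rx_ring *r, struct stack_log *log)
{
    while (r->head < r->count)
        stack_input(log, ring_dequeue(r));
}

/* Returns 1 if packets reached the stack in non-decreasing sequence order. */
int is_in_order(const struct stack_log *log)
{
    for (size_t i = 1; i < log->n; i++)
        if (log->seen[i] < log->seen[i - 1])
            return 0;
    return 1;
}
```

The lock protects the ring in both variants. What differs is whether a second context can run between dequeue and handoff, which is why routing all RX processing through one path (the taskqueue in Sean's patch, or an ithread in John's proposal) keeps delivery in order.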