From: "K. Macy"
To: John Baldwin
Cc: freebsd-arch@freebsd.org, Sean Bruno
Date: Fri, 28 Aug 2015 18:25:53 -0700
Subject: Re: Network card interrupt handling
List-Id: Discussion related to FreeBSD architecture

On Aug 28, 2015 12:59 PM, "John Baldwin" wrote:
>
> On Wednesday, August 26, 2015 09:30:48 AM Sean Bruno wrote:
> > We've been diagnosing what appeared to be out of order processing in
> > the network stack this week only to find out that the network card
> > driver was shoveling bits to us out of order (em).
> >
> > This *seems* to be due to a design choice where the driver is allowed
> > to assert a "soft interrupt" to the h/w device while real interrupts
> > are disabled. This allows a fake "em_msix_rx" to be started *while*
> > "em_handle_que" is running from the taskqueue. We've isolated and
> > worked around this by setting our processing_limit in the driver to
> > -1. This means that *most* packet processing is now handled in the
> > MSI-X handler instead of being deferred. Some periodic interference
> > is still detectable via em_local_timer() which causes one of these
> > "fake" interrupt assertions in the normal, card is *not* hung case.
> >
> > Both functions use identical code for a start. Both end up down
> > inside of em_rxeof() to process packets. Both drop the RX lock prior
> > to handing the data up the network stack.
> >
> > This means that the em_handle_que running from the taskqueue will be
> > preempted. Dtrace confirms that this allows out of order processing
> > to occur at times and generates a lot of resets.
> >
> > The reason I'm bringing this up on -arch and not on -net is that this
> > is a common design pattern in some of the Ethernet drivers. We've
> > done preliminary tests on a patch that moves *all* processing of RX
> > packets to the rx_task taskqueue, which means that em_handle_que is
> > now the only path to get packets processed.
>
> It is only a common pattern in the Intel drivers. :-/ We (collectively)
> spent quite a while fixing this in ixgbe and igb. Longer (hopefully more
> like medium) term I have an update to the interrupt API I want to push in
> that allows drivers to manually schedule interrupt handlers using an
> 'hwi' API to replace the manual taskqueues. This also ensures that
> the handler that dequeues packets is only ever running in an ithread
> context and never concurrently.
>

Jeff has a generalization of the net_task infrastructure used at Nokia
called grouptaskq that I've used for iflib. That does essentially what
you refer to. I've converted ixl and am currently about to test an ixgbe
conversion. I anticipate converting mlxen, all Intel drivers as well as
the remaining drivers with device specific code in netmap. The one catch
is finding someone who will publicly admit to owning re hardware so that
I can buy it from him and test my changes.

Cheers.
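
For anyone following the thread without the driver source open, the
reordering Sean describes can be sketched in plain user-space C. This is
a deterministic toy and not the em code: rx_ring, stack_log, rx_dequeue,
rx_deliver and the two run_* functions are hypothetical names made up
for illustration, and the "preemption" is simply modelled by the order
of the calls. It assumes only what the quoted text states: two contexts
share one dequeue path, and each drops the RX lock before handing
packets up the stack.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 8
#define BATCH     4

/* Hypothetical stand-in for an RX ring: each slot holds a sequence number. */
struct rx_ring {
    int    pkts[RING_SIZE];
    size_t head;                /* next slot to dequeue */
};

/* Records the order in which packets reach the "stack". */
struct stack_log {
    int    seen[RING_SIZE];
    size_t count;
};

static void ring_init(struct rx_ring *r)
{
    for (size_t i = 0; i < RING_SIZE; i++)
        r->pkts[i] = (int)i;
    r->head = 0;
}

/* Phase 1 of an rxeof-style pass: pull up to BATCH packets off the ring.
 * In a real driver this part would run with the RX lock held. */
static size_t rx_dequeue(struct rx_ring *r, int *batch)
{
    size_t n = 0;
    while (n < BATCH && r->head < RING_SIZE)
        batch[n++] = r->pkts[r->head++];
    return n;
}

/* Phase 2: hand the batch up the stack. This models the part that runs
 * after the lock is dropped, so another context can get in first. */
static void rx_deliver(struct stack_log *log, const int *batch, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(log->count < RING_SIZE);
        log->seen[log->count++] = batch[i];
    }
}

/* Two consumers: the task dequeues, is preempted before delivering, the
 * interrupt-style handler dequeues and delivers, then the task resumes. */
static void run_two_consumers(struct stack_log *log)
{
    struct rx_ring ring;
    int task_batch[BATCH], intr_batch[BATCH];

    ring_init(&ring);
    log->count = 0;
    size_t task_n = rx_dequeue(&ring, task_batch);  /* taskqueue context */
    size_t intr_n = rx_dequeue(&ring, intr_batch);  /* preempting handler */
    rx_deliver(log, intr_batch, intr_n);
    rx_deliver(log, task_batch, task_n);            /* task resumes late */
}

/* One consumer: every batch is delivered before the next is dequeued. */
static void run_single_consumer(struct stack_log *log)
{
    struct rx_ring ring;
    int batch[BATCH];
    size_t n;

    ring_init(&ring);
    log->count = 0;
    while ((n = rx_dequeue(&ring, batch)) > 0)
        rx_deliver(log, batch, n);
}

/* Returns 1 if the stack saw non-decreasing sequence numbers, else 0. */
static int in_order(const struct stack_log *log)
{
    for (size_t i = 1; i < log->count; i++)
        if (log->seen[i] < log->seen[i - 1])
            return 0;
    return 1;
}
```

With the interleaving modelled in run_two_consumers() the stack sees
4 5 6 7 0 1 2 3, while run_single_consumer() delivers 0 through 7 in
order. That is the argument for having exactly one path that dequeues
packets, whichever mechanism ends up scheduling it.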