Date:      Fri, 28 Aug 2015 18:25:53 -0700
From:      "K. Macy" <kmacy@freebsd.org>
To:        John Baldwin <jhb@freebsd.org>
Cc:        freebsd-arch@freebsd.org, Sean Bruno <sbruno@freebsd.org>
Subject:   Re: Network card interrupt handling
Message-ID:  <CAHM0Q_N65J9OSaU=znjgJ_gEiu=M-cb9q1hrxskGSvYFhxL_NQ@mail.gmail.com>
In-Reply-To: <24017021.PxBoCiQKDJ@ralph.baldwin.cx>
References:  <55DDE9B8.4080903@freebsd.org> <24017021.PxBoCiQKDJ@ralph.baldwin.cx>

On Aug 28, 2015 12:59 PM, "John Baldwin" <jhb@freebsd.org> wrote:
>
> On Wednesday, August 26, 2015 09:30:48 AM Sean Bruno wrote:
> > We've been diagnosing what appeared to be out of order processing in
> > the network stack this week only to find out that the network card
> > driver was shoveling bits to us out of order (em).
> >
> > This *seems* to be due to a design choice where the driver is allowed
> > to assert a "soft interrupt" to the h/w device while real interrupts
> > are disabled.  This allows a fake "em_msix_rx" to be started *while*
> > "em_handle_que" is running from the taskqueue.  We've isolated and
> > worked around this by setting our processing_limit in the driver to
> > -1.  This means that *most* packet processing is now handled in the
> > MSI-X handler instead of being deferred.  Some periodic interference
> > is still detectable via em_local_timer() which causes one of these
> > "fake" interrupt assertions in the normal, card is *not* hung case.
> >
> > Both functions use identical code for a start.  Both end up down
> > inside of em_rxeof() to process packets.  Both drop the RX lock prior
> > to handing the data up the network stack.
> >
> > This means that the em_handle_que running from the taskqueue will be
> > preempted.  Dtrace confirms that this allows out of order processing
> > to occur at times and generates a lot of resets.
> >
> > The reason I'm bringing this up on -arch and not on -net is that this
> > is a common design pattern in some of the Ethernet drivers.  We've
> > done preliminary tests on a patch that moves *all* processing of RX
> > packets to the rx_task taskqueue, which means that em_handle_que is
> > now the only path to get packets processed.
>
> It is only a common pattern in the Intel drivers. :-/  We (collectively)
> spent quite a while fixing this in ixgbe and igb.  Longer (hopefully more
> like medium) term I have an update to the interrupt API I want to push in
> that allows drivers to manually schedule interrupt handlers using an
> 'hwi' API to replace the manual taskqueues.  This also ensures that
> the handler that dequeues packets is only ever running in an ithread
> context and never concurrently.
>

Jeff has a generalization of the net_task infrastructure used at Nokia,
called grouptaskq, that I've used for iflib. It does essentially what you
describe. I've converted ixl and am about to test an ixgbe conversion. I
anticipate converting mlxen, all of the Intel drivers, and the remaining
drivers that have device-specific code in netmap. The one catch is
finding someone who will publicly admit to owning re hardware so that I can
buy it from him and test my changes.
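
To make the serialization idea concrete, here is a minimal user-space
sketch. It is not FreeBSD kernel code and it does not use the grouptaskq,
iflib, or 'hwi' interfaces; every name in it (rx_queue, rx_schedule,
rx_worker_run, and so on) is hypothetical. The only point it illustrates is
that interrupt and timer paths merely request a drain, while a single
worker is the sole code path that dequeues packets, so delivery order
follows ring order.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RX_RING_SIZE 16

/* Hypothetical model of one RX queue; not a real driver structure. */
struct rx_queue {
    atomic_bool scheduled;            /* true while a drain request is outstanding */
    unsigned ring[RX_RING_SIZE];      /* packet sequence numbers written by the "NIC" */
    size_t head;                      /* next slot the worker reads */
    size_t tail;                      /* next slot the "NIC" writes */
    unsigned delivered[RX_RING_SIZE]; /* first RX_RING_SIZE packets, in delivery order */
    size_t ndelivered;                /* total packets handed up so far */
};

static void rx_queue_init(struct rx_queue *q)
{
    atomic_init(&q->scheduled, false);
    q->head = 0;
    q->tail = 0;
    q->ndelivered = 0;
}

/* Model the hardware placing a packet in the ring. Returns false if full. */
static bool rx_hw_enqueue(struct rx_queue *q, unsigned seq)
{
    if (q->tail - q->head == RX_RING_SIZE)
        return false;
    q->ring[q->tail % RX_RING_SIZE] = seq;
    q->tail++;
    return true;
}

/*
 * Called from any source: a real interrupt, a timer, or a software-asserted
 * interrupt. It never touches the ring; it only requests a drain.
 * Returns true if this call created the request, false if one was already
 * outstanding (the requests are coalesced).
 */
static bool rx_schedule(struct rx_queue *q)
{
    return !atomic_exchange(&q->scheduled, true);
}

/*
 * The only function that dequeues packets. The model assumes it is run
 * from exactly one context. Returns the number of packets handed up.
 */
static size_t rx_worker_run(struct rx_queue *q)
{
    size_t n = 0;

    /* Clear the request first so a later rx_schedule() is not lost. */
    if (!atomic_exchange(&q->scheduled, false))
        return 0;

    while (q->head != q->tail) {
        unsigned seq = q->ring[q->head % RX_RING_SIZE];

        if (q->ndelivered < RX_RING_SIZE)
            q->delivered[q->ndelivered] = seq;
        q->ndelivered++;
        q->head++;
        n++;
    }
    return n;
}
```

In this model a second rx_schedule() that arrives before the worker runs is
simply absorbed, which is the property the em driver lacks when both
em_msix_rx and em_handle_que can end up inside em_rxeof().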

Cheers.


