Date:      Fri, 28 Aug 2015 10:38:36 -0700
From:      John Baldwin <jhb@freebsd.org>
To:        freebsd-arch@freebsd.org
Cc:        Sean Bruno <sbruno@freebsd.org>
Subject:   Re: Network card interrupt handling
Message-ID:  <24017021.PxBoCiQKDJ@ralph.baldwin.cx>
In-Reply-To: <55DDE9B8.4080903@freebsd.org>
References:  <55DDE9B8.4080903@freebsd.org>

On Wednesday, August 26, 2015 09:30:48 AM Sean Bruno wrote:
> We've been diagnosing what appeared to be out of order processing in
> the network stack this week only to find out that the network card
> driver was shoveling bits to us out of order (em).
> 
> This *seems* to be due to a design choice where the driver is allowed
> to assert a "soft interrupt" to the h/w device while real interrupts
> are disabled.  This allows a fake "em_msix_rx" to be started *while*
> "em_handle_que" is running from the taskqueue.  We've isolated and
> worked around this by setting our processing_limit in the driver to
> -1.  This means that *most* packet processing is now handled in the
> MSI-X handler instead of being deferred.  Some periodic interference
> is still detectable via em_local_timer() which causes one of these
> "fake" interrupt assertions in the normal, card is *not* hung case.
> 
> Both functions use identical code for a start.  Both end up down
> inside of em_rxeof() to process packets.  Both drop the RX lock prior
> to handing the data up the network stack.
> 
> This means that the em_handle_que running from the taskqueue will be
> preempted.  Dtrace confirms that this allows out of order processing
> to occur at times and generates a lot of resets.
> 
> The reason I'm bringing this up on -arch and not on -net is that this
> is a common design pattern in some of the Ethernet drivers.  We've
> done preliminary tests on a patch that moves *all* processing of RX
> packets to the rx_task taskqueue, which means that em_handle_que is
> now the only path to get packets processed.

It is only a common pattern in the Intel drivers. :-/  We (collectively)
spent quite a while fixing this in ixgbe and igb.  In the longer (hopefully
more like medium) term, I have an update to the interrupt API that I want
to push in.  It lets drivers manually schedule interrupt handlers through
an 'hwi' API, which replaces the manual taskqueues.  It also ensures that
the handler that dequeues packets only ever runs in an ithread context
and never concurrently.
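
To make the reordering described in the quoted message concrete, here is
a toy, single-threaded C model of it.  This is not em(4) code and not the
proposed 'hwi' API: every name below (rx_ring, rx_dequeue, rx_deliver,
two_consumers, one_consumer, in_order) is hypothetical, and the
"preemption" is just a fixed interleaving written out by hand.  It only
shows why dequeuing under a lock and then delivering after dropping it is
safe with one consumer but not with two.

```c
#include <assert.h>
#include <stddef.h>

#define NPKTS 4

/* Toy RX ring: packets are just sequence numbers 0..NPKTS-1. */
struct rx_ring {
	int next;		/* next sequence number to dequeue */
};

/* Order in which packets reach the (imaginary) network stack. */
struct delivery_log {
	int seq[NPKTS];
	size_t count;
};

/* Dequeue up to n packets; models the work done under the RX lock. */
static size_t
rx_dequeue(struct rx_ring *r, int *batch, size_t n)
{
	size_t i = 0;

	while (i < n && r->next < NPKTS)
		batch[i++] = r->next++;
	return (i);
}

/* Hand a batch up the stack; models the work done after the unlock. */
static void
rx_deliver(struct delivery_log *log, const int *batch, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		assert(log->count < NPKTS);
		log->seq[log->count++] = batch[i];
	}
}

/* Two contexts: A dequeues, then B preempts it before A delivers. */
static void
two_consumers(struct delivery_log *log)
{
	struct rx_ring ring = { 0 };
	int a[2], b[2];
	size_t na, nb;

	log->count = 0;
	na = rx_dequeue(&ring, a, 2);	/* deferred task takes 0,1 */
	nb = rx_dequeue(&ring, b, 2);	/* preempting handler takes 2,3 */
	rx_deliver(log, b, nb);		/* ...and delivers them first */
	rx_deliver(log, a, na);		/* deferred task resumes */
}

/* One context: each batch is delivered before the next is dequeued. */
static void
one_consumer(struct delivery_log *log)
{
	struct rx_ring ring = { 0 };
	int batch[2];
	size_t n;

	log->count = 0;
	while ((n = rx_dequeue(&ring, batch, 2)) > 0)
		rx_deliver(log, batch, n);
}

/* Return 1 if sequence numbers were delivered in ascending order. */
static int
in_order(const struct delivery_log *log)
{
	for (size_t i = 1; i < log->count; i++)
		if (log->seq[i] < log->seq[i - 1])
			return (0);
	return (1);
}
```

In this model two_consumers() delivers 2,3,0,1 while one_consumer()
delivers 0,1,2,3, so in_order() fails for the first and passes for the
second.  Funnelling all dequeuing through a single context gets the
ordering without having to hold the RX lock across the call up the stack.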

-- 
John Baldwin


