Date:      Fri, 17 Sep 2010 11:23:39 -0400
From:      John Baldwin <jhb@freebsd.org>
To:        arch@freebsd.org
Subject:   Interrupt Threads
Message-ID:  <201009171123.39382.jhb@freebsd.org>

I have wanted to rework some of the interrupt threads stuff and enable 
interrupt filters by default for a while.  I finally sat down and 
hacked out a new ithreads implementation at BSDCan and the following week.

The new ithreads stuff moves away from a dedicated thread per handler or IRQ.  
Instead, it adopts a model more akin to what Solaris does (though probably not 
completely identical).  Each CPU has a queue of "pending handlers".  When an 
interrupt fires, all of the handlers for that interrupt are placed onto that 
CPU's queue.  There is a pool of hardware interrupt threads.  If the current 
CPU does not already have an active hardware interrupt thread, it grabs a free 
one from the pool, pins it to the current CPU, and schedules it.  The ithread 
continues to drain interrupt handlers from its CPU's queue until the queue is 
empty.  Once that happens it disassociates itself from the CPU and goes back 
into the free pool.  The effect is that interrupt handlers are now sort of 
like DPCs in Windows.
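
To make the model above concrete, here is a rough userland sketch of the 
per-CPU queue and the ithread pool.  It is not code from the patch: every 
name in it (sim_init, sim_interrupt, sim_ithread_run, NCPU, NTHREADS, QLEN) 
is made up for illustration, "pinning" is just an index stored in the CPU 
state, and there is no locking or real scheduling.

```c
#include <assert.h>
#include <stddef.h>

#define NCPU     2   /* hypothetical CPU count */
#define NTHREADS 4   /* hypothetical ithread pool size */
#define QLEN     16  /* hypothetical per-CPU queue capacity */

typedef void (*handler_fn)(void *arg);

struct pending {
    handler_fn fn;
    void *arg;
};

struct cpu_state {
    struct pending q[QLEN]; /* FIFO ring of pending handlers */
    size_t head;            /* index of the next handler to run */
    size_t count;           /* number of queued handlers */
    int ithread;            /* bound pool thread, or -1 if none */
};

static struct cpu_state cpus[NCPU];
static int thread_free[NTHREADS];   /* 1 = sitting in the free pool */

/* Example handler used for testing: bumps an int counter. */
void sim_count_handler(void *arg)
{
    (*(int *)arg)++;
}

void sim_init(void)
{
    for (int c = 0; c < NCPU; c++) {
        cpus[c].head = 0;
        cpus[c].count = 0;
        cpus[c].ithread = -1;
    }
    for (int t = 0; t < NTHREADS; t++)
        thread_free[t] = 1;
}

/*
 * An interrupt fires on `cpu`: queue all of its handlers, then make sure
 * the CPU has an ithread.  Returns the bound thread, or -1 if the pool
 * was empty (the real design sets a flag in that case).
 */
int sim_interrupt(int cpu, const struct pending *h, size_t n)
{
    struct cpu_state *cs = &cpus[cpu];

    assert(cs->count + n <= QLEN);
    for (size_t i = 0; i < n; i++) {
        cs->q[(cs->head + cs->count) % QLEN] = h[i];
        cs->count++;
    }
    if (cs->ithread == -1) {
        for (int t = 0; t < NTHREADS; t++) {
            if (thread_free[t]) {
                thread_free[t] = 0;
                cs->ithread = t;    /* "pin" the thread to this CPU */
                break;
            }
        }
    }
    return cs->ithread;
}

/* Body of the ithread: drain the CPU's queue, then rejoin the pool. */
void sim_ithread_run(int cpu)
{
    struct cpu_state *cs = &cpus[cpu];

    assert(cs->ithread != -1);
    while (cs->count > 0) {
        struct pending p = cs->q[cs->head];

        cs->head = (cs->head + 1) % QLEN;
        cs->count--;
        p.fn(p.arg);
    }
    thread_free[cs->ithread] = 1;
    cs->ithread = -1;
}
```

Note that a second interrupt arriving on a CPU that already has an ithread 
only appends to the queue; it does not take another thread from the pool.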

If an interrupt handler blocks on a turnstile and there are other handlers 
pending for this CPU, then the current ithread is divorced from the current 
CPU and a new ithread is allocated for the current CPU.

If we ever fail to allocate an ithread for a given CPU, then a flag is set.  
All ithreads check that flag before going idle, and if it is set they find the 
first CPU that needs an ithread and move to that CPU and start draining 
events.

The ithread pool can be dynamically resized at runtime via sysctl, but it 
can't be smaller than NCPU * 2 or larger than the total number of handlers.
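
As a sketch of those bounds, a sysctl handler could validate a requested 
size roughly like this.  The function name is hypothetical, and two details 
are my assumptions rather than anything stated above: that an out-of-range 
request is rejected with EINVAL (instead of being clamped), and that the 
NCPU * 2 floor wins when there are fewer handlers than that.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical range check for a requested ithread pool size.
 * Returns 0 if the size is acceptable, EINVAL otherwise.
 */
int ithread_pool_check_size(int requested, int ncpu, int nhandlers)
{
    int min, max;

    assert(ncpu > 0 && nhandlers >= 0);
    min = ncpu * 2;     /* lower bound from the description */
    max = nhandlers;    /* upper bound: total number of handlers */
    if (max < min)
        max = min;      /* assumption: the floor takes precedence */
    if (requested < min || requested > max)
        return EINVAL;
    return 0;
}
```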

Interrupt filters fit into this nicely.  The old interrupt filter code has a 
design bug, and fixing it could require scheduling multiple ithreads for a 
single interrupt.  In the new model an interrupt still schedules at most one 
ithread.

To handle masking the interrupt and unmasking it when filters w/o handlers 
complete, I use a simple reference count with atomic ops to keep track of the 
number of queued handlers that need the interrupt masked and unmask it once 
the count drops to 0.
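
The reference count can be sketched as below.  This uses C11 atomics in 
place of the kernel's atomic(9) primitives, and the struct and function 
names are invented for the example; the `masked` field merely stands in for 
the real mask bit in the interrupt controller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct irq_source {
    atomic_int mask_refs;   /* queued handlers that need the IRQ masked */
    bool masked;            /* stand-in for the hardware mask bit */
};

/*
 * Interrupt fired: mask the source and take one reference for each of
 * the `n` handlers being queued that need it to stay masked.
 */
void irq_mask_and_hold(struct irq_source *src, int n)
{
    src->masked = true;
    atomic_fetch_add(&src->mask_refs, n);
}

/*
 * A queued handler finished: drop its reference.  Whoever drops the
 * last reference unmasks the source.  Returns true if this call did.
 */
bool irq_release(struct irq_source *src)
{
    int old = atomic_fetch_sub(&src->mask_refs, 1);

    assert(old > 0);
    if (old == 1) {
        src->masked = false;
        return true;
    }
    return false;
}
```

atomic_fetch_sub() returns the value before the decrement, so exactly one 
caller sees 1 and does the unmask even if handlers finish concurrently.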

Software interrupts still use a dedicated ithread, but the queue of pending 
handlers lives in the ithread, not in the CPU.

I've also added some extensions to the current ithreads stuff based on some 
tricks that existing drivers use.  Specifically, an interrupt handler can now 
call hwi_sched() on itself to reschedule itself at the back of the current 
CPU's queue.  Thus, you can have NIC interrupt handlers do cooperative 
timesharing by just punting after N packets and using hwi_sched() to 
reschedule themselves.  I also added a new type of interrupt 
handler that is registered with INTR_MANUAL.  It is never automatically 
scheduled, but a filter can schedule it.

As a test, I've ported the igb(4) driver to this framework.  It uses 
hwi_sched() and an INTR_MANUAL handler for link events to replace almost all 
of the taskqueue usage in igb(4).  (The multiqueue transmit bits still need a 
task for one case, but all the interrupt handler stuff is now "simpler").

Some downsides to this approach include:

1) If you have two busy devices whose interrupts both go to the same CPU but 
via different IRQs, in the old model those threads could run concurrently on 
separate CPUs, but in the new model the handlers are tied to the same CPU and 
compete for CPU time on that CPU.  In other words, the new model really wants 
interrupts to be evenly distributed amongst CPUs to work properly.  Not 
entirely sure what I think about that.

2) Many folks find the ability to see how much CPU IRQ N's thread has used in 
top useful, but this loses all of that since there is no longer a tight 
coupling between IRQs and threads.

One unresolved issue is that the cardbus code currently uses a filter that 
returns just FILTER_SCHEDULE_THREAD without FILTER_HANDLED.  This is not 
supported in the new code.  I have some ideas on how to fix the cardbus code 
(most likely using wrappers around the child interrupt handlers) but need to 
hash the details out with Warner.

A second unresolved issue is that interrupt storm detection is currently 
broken.  I have some thoughts on how to re-add it, but it will likely be a bit 
tricky.

The code currently lives in p4 at //depot/user/jhb/intr/...  I have also put 
up a patch at http://www.freebsd.org/~jhb/patches/intr_threads.patch.  This 
patch includes the changes to the igb(4) driver.

-- 
John Baldwin


