Date:      Sat, 19 Apr 1997 14:44:24 -0600
From:      Steve Passe <smp@csn.net>
To:        Peter Wemm <peter@spinner.dialix.com>
Cc:        cr@jcmax.com (Cyrus Rahman), smp@freebsd.org
Subject:   Re: SMP kernel deadlocks 
Message-ID:  <199704192044.OAA03856@Ilsa.StevesCafe.com>
In-Reply-To: Your message of "Sun, 20 Apr 1997 04:05:51 +0800." <199704192005.EAA01830@spinner.DIALix.COM>

Peter,

> Several comments..
> ...
> Third, the FIFO arrangement is being rather poorly used in the present
> code.  From memory again, there is a 2-deep FIFO for each hardware priority
> "level" (level = vector / 16).  IPIs start at vector #24 (actually
> ICU_OFFSET + 24, but that doesn't have any effect since ICU_OFFSET is a
> multiple of 16), which means that IRQs 16 -> 23 (generally remapped PCI
> IRQs) are in the same "level" (hence FIFO) as the IPIs.
>
> I could imagine that the PCI interrupts could fill the FIFO under heavy
> load.  That could also explain why I've not seen it here: I have an EISA
> system that only has IRQs 0 -> 15, so the IPIs have the FIFO on that
> level to themselves.

This is correct, as far as I can remember...
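
To make the level arithmetic concrete, here is a minimal sketch. The value 32
for ICU_OFFSET and the mapping vector = ICU_OFFSET + irq are assumptions made
for illustration (the thread only says ICU_OFFSET is a multiple of 16), and the
helper names are hypothetical, not taken from the kernel source.

```c
/* Assumed value: the thread only states that ICU_OFFSET is a multiple of 16. */
#define ICU_OFFSET  32
/* From the thread: IPIs start at ICU_OFFSET + 24. */
#define IPI_VECTOR  (ICU_OFFSET + 24)

/* Assumed mapping from an IRQ number to its interrupt vector. */
static int irq_to_vector(int irq)
{
    return ICU_OFFSET + irq;
}

/* From the thread: hardware priority level = vector / 16. */
static int priority_level(int vector)
{
    return vector / 16;
}

/* Nonzero if this IRQ lands in the same level (hence FIFO) as the IPIs. */
static int shares_level_with_ipi(int irq)
{
    return priority_level(irq_to_vector(irq)) == priority_level(IPI_VECTOR);
}
```

Under these assumptions, shares_level_with_ipi() is true for IRQs 16 through 23
and false for IRQs 0 through 15. Because ICU_OFFSET is a multiple of 16, the
result does not depend on its exact value.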

---
> Perhaps we could release the mplock while sending an IPI, and try to grab
> it back again before continuing...  Alternatively, have a timeout on the
> IPI, and if the APIC hasn't recovered after a certain amount of time (i.e.
> it's indefinitely "busy"), then release the mplock for a moment and wait
> and check the status again before refetching the lock.  If it still fails 
> to recover, panic rather than hang forever..
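
The timeout idea quoted above might look roughly like the sketch below. It is a
user-space simulation, not kernel code: mplock, apic_busy_polls, and every
function name are hypothetical stand-ins, and the "APIC" is just a counter that
reports busy for a fixed number of polls.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's MP lock. */
static atomic_bool mplock;
/* Simulated hardware: the APIC reports "busy" for this many more polls. */
static int apic_busy_polls;

static void mplock_acquire(void) { while (atomic_exchange(&mplock, true)) ; }
static void mplock_release(void) { atomic_store(&mplock, false); }

/* One status poll of the simulated APIC. */
static bool apic_idle(void)
{
    if (apic_busy_polls > 0) {
        apic_busy_polls--;
        return false;
    }
    return true;
}

/* Poll up to max_polls times; true if the APIC went idle in time. */
static bool wait_for_apic(int max_polls)
{
    for (int i = 0; i < max_polls; i++)
        if (apic_idle())
            return true;
    return false;
}

/*
 * Caller holds mplock on entry and on return.
 * Returns false where a real kernel would panic instead of hanging.
 */
static bool wait_for_apic_or_give_up(int max_polls)
{
    if (wait_for_apic(max_polls))
        return true;                /* recovered while holding the lock */
    mplock_release();               /* let the other CPU make progress */
    bool ok = wait_for_apic(max_polls);
    mplock_acquire();               /* refetch the lock before continuing */
    return ok;
}
```

The point of the second wait is that the other CPU may need the mplock to drain
whatever is keeping the APIC busy; only if that also times out does the caller
give up.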

I need to go back and read the Intel app notes, etc. to determine the best
strategy.  I *think* that we could send the IPI, THEN release the mplock,
spin till it's accepted into a FIFO, reclaim the lock and continue.  Also
remember that the tlbflush IPI is itself incomplete in that it doesn't sync
with the other CPU(s); it merely requests the flush, then continues without
waiting for it to actually occur.  When we rewrite this IPI correctly, the
exact method for handling the deadlock should become clearer...
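
A rough sketch of the ordering described above (send, release, spin, reclaim),
written as a user-space simulation. All names here are hypothetical, and the
"APIC" is modeled as a counter that reports the IPI as pending for a fixed
number of polls; real code would read the APIC's delivery status instead.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's MP lock. */
static atomic_bool mplock;
/* Simulated hardware: IPI stays "pending" for this many more polls. */
static int apic_busy_polls;
/* Count of IPIs posted, for illustration only. */
static int ipis_posted;

static void mplock_acquire(void) { while (atomic_exchange(&mplock, true)) ; }
static void mplock_release(void) { atomic_store(&mplock, false); }

/* Post the IPI to the simulated APIC. */
static void ipi_post(void) { ipis_posted++; }

/* True once the simulated APIC has accepted the IPI into a FIFO. */
static bool ipi_accepted(void)
{
    if (apic_busy_polls > 0) {
        apic_busy_polls--;
        return false;
    }
    return true;
}

/* Caller holds mplock on entry and on return. */
static void send_ipi_dropping_lock(void)
{
    ipi_post();                 /* 1. send the IPI while holding the lock */
    mplock_release();           /* 2. THEN release the mplock */
    while (!ipi_accepted())     /* 3. spin until it is accepted */
        ;
    mplock_acquire();           /* 4. reclaim the lock and continue */
}
```

Note that this only waits for acceptance, not for the remote CPU to act on the
IPI, so it does not by itself fix the missing sync in the tlbflush case.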

The complete redesign of the INTerrupt sub-system can't be far off; we
need to redistribute the hardware/software INTs throughout the entire
256-vector range to properly utilize the APIC structure.  And that beast
known as vector.s should probably be tossed and redone from the ground up.

--
Steve Passe	| powered by
smp@csn.net	|            Symmetric MultiProcessor FreeBSD




