Date: Sat, 19 Apr 1997 14:44:24 -0600
From: Steve Passe <smp@csn.net>
To: Peter Wemm <peter@spinner.dialix.com>
Cc: cr@jcmax.com (Cyrus Rahman), smp@freebsd.org
Subject: Re: SMP kernel deadlocks
Message-ID: <199704192044.OAA03856@Ilsa.StevesCafe.com>
In-Reply-To: Your message of "Sun, 20 Apr 1997 04:05:51 +0800." <199704192005.EAA01830@spinner.DIALix.COM>

Peter,

> Several comments..
> ...
> Third, the FIFO arrangement is being rather poorly used in the present
> code. From memory again, there is a 2-deep fifo for each hardware priority
> "level" (level = vector / 16). Since IPI's start at vector#24 (actually
> ICU_OFFSET + 24, but that doesn't have any effect since ICU_OFFSET is a
> multiple of 16). This means that the irq 16 -> 23 (generally remapped PCI
> irq's) are in the same "level" (hence fifo) as the IPI's.
>
> I could imagine that it's possible that the PCI interrupts could fill the
> fifo under heavy load.. That could also explain why I've not seen it
> here, I have an EISA system that only has irq0->15, so the IPI's have the
> fifo on that level to themselves.

This is correct, as far as I can remember...

---

> Perhaps we could release the mplock while sending an IPI, and try to grab
> it back again before continuing... Alternatively, have a timeout on the
> IPI, and if the apic hasn't recovered after a certain amount of time (ie:
> it's indefinitely "busy"), then release the mplock for a moment and wait
> and check the status again before refetching the lock. If it still fails
> to recover, panic rather than hang forever..

I need to go back and read the Intel app notes, etc. to determine the
best strategy. I *think* that we could send the IPI, THEN release the
mplock, spin till it's accepted into a fifo, reclaim the lock and continue.

Also remember that the tlbflush IPI is itself incomplete, in that it
doesn't sync with the other cpu(s): it merely requests the flush, then
continues without waiting for it to actually occur. When we re-write this
IPI correctly, the exact method for handling the deadlock should become
clearer...

The complete re-design of the INTerrupt sub-system can't be far off; we
need to redistribute the hardware/software INTs throughout the entire
256 vector range to properly utilize the APIC structure.
And that beast known as vector.s should probably be tossed and redone
from the ground up.

--
Steve Passe	| powered by
smp@csn.net	| Symmetric MultiProcessor FreeBSD