Date:      Sun, 20 Apr 1997 04:05:51 +0800
From:      Peter Wemm <peter@spinner.dialix.com>
To:        cr@jcmax.com (Cyrus Rahman)
Cc:        smp@csn.net, smp@freebsd.org
Subject:   Re: SMP kernel deadlocks 
Message-ID:  <199704192005.EAA01830@spinner.DIALix.COM>
In-Reply-To: Your message of "Sat, 19 Apr 1997 09:20:39 -0400." <9704191320.AA18511@corona.jcmax.com> 

Cyrus Rahman wrote:
[...Steve's summary...]
>  reason:
>         cpu0 is trying to service an INT and spin-locks attempting to get
>         the mp_lock, which evidently is permanently held by some process on
>         cpu1.  The lock count being held is usually 2, but sometimes only 1.

> The details:
> 
> 	During the page fault, it generally happens that at some point
> 	smp_invltlb() gets called to flush the TLB on the other CPUs.
> 	smp_invltlb() calls allButSelfIPI() and sends an IPI to the other
> 	processor, which, unfortunately, is sometimes already processing an
> 	interrupt of a higher priority.  This interrupt routine now spends
> 	its time trying to obtain the mp_lock spin lock so it can enter the
> 	kernel, but the processor which holds this lock is also in a spin
> 	loop in apicIPI() waiting for the IPI to be delivered.
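
Spelled out in code, the deadlock being described looks roughly like this
(a sketch only; smp_invltlb(), apicIPI() and the mp_lock are the names from
the kernel code under discussion, while interrupt_entry() and the function
bodies are simplified stand-ins, not the real routines):

/* CPU 1: holds the mp_lock, waiting for the IPI to go out. */
void
smp_invltlb(void)
{
	/* ... already holding the mp_lock (count 1 or 2) ... */
	apicIPI();	/* spins until the APIC reports the IPI
			   delivered, which never happens while
			   CPU 0 is unable to accept it */
}

/* CPU 0: took a hardware interrupt, wants the mp_lock. */
void
interrupt_entry(void)
{
	get_mplock();	/* spins forever: CPU 1 never releases the
			   mp_lock while it is stuck in apicIPI() */
}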

Several comments..

First, from memory, the IPIs have the highest priority..  There shouldn't be
any case where IPI reception is blocked by hardware interrupt priority
ordering (I think.. my memory is pretty rough).

Second, IPIs are not maskable at the moment.. not even by a splhigh()..
However, a 'cli' would do it, I guess.
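
Ie: something like this would hold off IPIs (a sketch only, using the i386
disable_intr()/enable_intr() inlines from <machine/cpufunc.h>, which expand
to cli/sti; no_ipis_here() is just an example name):

#include <machine/cpufunc.h>

void
no_ipis_here(void)
{
	disable_intr();		/* cli: blocks IPIs along with everything else */
	/* ... work that must not be interrupted by an IPI ... */
	enable_intr();		/* sti */
}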

Third, the FIFO arrangement is rather poorly used by the present code.
From memory again, there is a 2-deep FIFO for each hardware priority
"level" (level = vector / 16).  IPIs start at vector #24 (actually
ICU_OFFSET + 24, but that doesn't affect the level since ICU_OFFSET is a
multiple of 16), which means that irq 16 -> 23 (generally the remapped PCI
irq's) are in the same "level" (hence FIFO) as the IPIs.
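
To make the arithmetic concrete, here's a quick userland demo (assuming the
usual ICU_OFFSET of 0x20; only the level = vector / 16 rule matters):

#include <stdio.h>

#define ICU_OFFSET	0x20			/* a multiple of 16 */
#define IPI_BASE	(ICU_OFFSET + 24)	/* first IPI vector */

int
main(void)
{
	int irq;

	for (irq = 0; irq < 24; irq++) {
		int vector = ICU_OFFSET + irq;
		printf("irq %2d -> vector %2d, level %d\n",
		    irq, vector, vector / 16);
	}
	printf("IPIs     -> vector %2d, level %d\n", IPI_BASE, IPI_BASE / 16);
	return (0);
}

With ICU_OFFSET = 0x20 this prints level 2 for irq 0 -> 15, and level 3 for
both irq 16 -> 23 (vectors 48 -> 55) and the IPIs (vector 56), so they share
that level's 2-deep FIFO.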

I could imagine the PCI interrupts filling that FIFO under heavy load..
That could also explain why I've not seen it here: I have an EISA system
that only has irq 0 -> 15, so the IPIs have the FIFO on that level to
themselves.

Perhaps we could release the mplock while sending an IPI, and try to grab
it back again before continuing...  Alternatively, put a timeout on the
IPI: if the APIC hasn't recovered after a certain amount of time (i.e.
it's indefinitely "busy"), release the mplock for a moment, wait, and check
the status again before re-acquiring the lock.  If it still fails to
recover, panic rather than hang forever..
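
Roughly like this (an untested sketch; apic_delivery_pending() and
IPI_SPIN_LIMIT are made-up names standing in for "read the APIC's
delivery-status bit" and "some tunable spin count", and the rest assumes
the existing get_mplock()/rel_mplock() and DELAY() primitives):

static void
send_ipi_with_backoff(void)
{
	int tries = 0;

	while (apic_delivery_pending()) {
		if (++tries < IPI_SPIN_LIMIT)
			continue;	/* normal case: just spin a bit */
		/*
		 * The APIC looks indefinitely "busy": drop the mplock
		 * so the other CPU can enter the kernel and drain its
		 * interrupt, then re-check before re-acquiring.
		 */
		rel_mplock();
		DELAY(1);		/* let the other CPU make progress */
		if (apic_delivery_pending())
			panic("IPI never delivered");
		get_mplock();
		break;
	}
}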

Cheers,
-Peter